
Start Using Ollama + Python (Phi4)

1. Introduction to Ollama

Ollama is a powerful tool for running AI models locally, giving you an easy way to work with machine learning models on your own hardware. Its focus on privacy, accessibility, and performance makes it a great choice for developers building AI-powered applications.

Why Use Ollama?

  • Privacy: Data never leaves your machine.
  • Performance: Optimized for local execution with minimal latency.
  • Flexibility: Compatible with various Python workflows and frameworks.

2. Prerequisites

Before starting, ensure you have the following:

  1. Python: Install Python 3.10 or newer.

  2. Pip: Ensure Pip, the Python package manager, is installed.

    • Check Pip version:
      pip --version
  3. Hardware Requirements:

    • A modern CPU with at least 4 cores.
    • GPU acceleration (optional but recommended for large models).
  4. Install Ollama:

    • Download the installer for your operating system from https://ollama.com/download and follow the setup instructions.

3. Setting Up Your Python Environment

Step 1: Create a Virtual Environment

Using a virtual environment ensures your dependencies are isolated:

python -m venv ollama-env
source ollama-env/bin/activate # On Windows, use `ollama-env\Scripts\activate`

Step 2: Install Required Libraries

Install the Python bindings and dependencies for Ollama:

pip install ollama

Step 3: Verify Installation

Run a quick check to confirm the Ollama CLI is installed and on your PATH; this displays the help menu:

ollama -h
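
You can also confirm the Python bindings from Step 2 are available; this command should exit silently if the install succeeded:

python -c "import ollama"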

4. Running Your First Ollama Model

Step 1: Download a Model

Ollama supports various pre-trained models. Use the Ollama CLI to download one:

ollama pull phi4

Step 2: Load and Chat with Phi4

Here’s a simple command that loads the model and starts an interactive chat session:

ollama run phi4

Step 3: Explore Model Outputs

Experiment with different inputs and observe the outputs (a Python example follows this list). For example:

  • Generate summaries.
  • Answer questions.
  • Translate text.
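
As a sketch of what this looks like from Python rather than the CLI, the snippet below asks phi4 for a one-sentence summary; the prompt text is only an example:

import ollama

# Send a single summarization prompt to the locally running phi4 model.
response = ollama.chat(
    model="phi4",
    messages=[{"role": "user", "content": "Summarize in one sentence: Ollama runs large language models entirely on local hardware."}],
)
print(response["message"]["content"])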

5. Interacting With Phi4 via Python Scripts

Basic Interactive Chat

Create an interactive chat session with the model using a simple Python script:

  1. Create a new file named "chat.py":
import sys
import time
from threading import Thread, Event

import ollama

def show_status(stop_event):
    """Animate a 'Generating response...' indicator until stop_event is set."""
    try:
        while not stop_event.is_set():
            for status in ["Generating response.", "Generating response..", "Generating response..."]:
                if stop_event.is_set():
                    break
                sys.stdout.write(f"\r{status} ")
                sys.stdout.flush()
                time.sleep(0.5)
    except KeyboardInterrupt:
        pass

def chat_with_model():
    try:
        client = ollama.Client()

        print("Chat started! Type 'exit' to end the conversation.")

        while True:
            user_input = input("\nYou: ")
            if user_input.lower() == 'exit':
                print("Goodbye!")
                break

            # Each turn sends only the latest prompt; append previous turns
            # here if you want the model to remember the conversation.
            messages = [
                {"role": "user", "content": user_input}
            ]

            # Run the status indicator in a background thread while waiting.
            stop_event = Event()
            status_thread = Thread(target=show_status, args=(stop_event,), daemon=True)
            status_thread.start()

            try:
                response = client.chat(model="phi4", messages=messages)
            finally:
                # Stop the indicator and clear its line before printing.
                stop_event.set()
                status_thread.join()
                sys.stdout.write("\r" + " " * 30 + "\r")
                sys.stdout.flush()

            print("\nAI:", response["message"]["content"] if 'message' in response else "No response available.")

    except KeyboardInterrupt:
        print("\nConversation ended.")
    except Exception as e:
        print("An unexpected error occurred:", str(e))

if __name__ == "__main__":
    chat_with_model()
  2. Run the script:
python chat.py
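
Note that the script sends only the latest prompt on each turn, so phi4 will not remember earlier messages; to give it memory, append each user prompt and model reply to the messages list before the next client.chat call.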

6. Troubleshooting Common Issues

  1. Ollama CLI Not Found:

    • Ensure Ollama is installed and added to your system PATH.
  2. Model Loading Errors:

    • Check if the model is fully downloaded.
    • Verify the model name in your Python script matches the downloaded model.
  3. Performance Issues:

    • Use a machine with a dedicated GPU.
    • Tune model parameters such as batch size and max tokens (see the sketch after this list).
  4. Dependency Conflicts:

    • Resolve version conflicts by recreating the virtual environment.
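
For item 2, running `ollama list` shows which models have finished downloading. For item 3, generation parameters can be tuned through the Python client's options argument; the values below are illustrative, not recommendations:

import ollama

# num_predict caps generated tokens, num_ctx sets the context window,
# and temperature controls randomness; all values here are examples.
response = ollama.chat(
    model="phi4",
    messages=[{"role": "user", "content": "List three uses for Python."}],
    options={"num_predict": 128, "num_ctx": 2048, "temperature": 0.7},
)
print(response["message"]["content"])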

7. Next Steps

  1. Explore More Models:

    • Browse Ollama’s model repository for specific use cases (e.g., text generation, summarization).
  2. Build a Flask API:

    • Wrap your Ollama model with Flask to create a REST API for broader use (a sketch follows at the end of this list).
  3. Integrate with Applications:

    • Use Ollama in web apps, chatbots, or data analysis pipelines.
  4. Contribute to the Community:

    • Share your insights and improvements on Ollama’s GitHub repository.
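
As a starting point for the Flask idea above, here is a minimal sketch; the /chat route, port, and JSON field names are illustrative choices rather than a fixed API:

from flask import Flask, jsonify, request
import ollama

app = Flask(__name__)

@app.route("/chat", methods=["POST"])
def chat():
    # Expect a JSON body like {"prompt": "..."}; the field name is an example.
    prompt = request.get_json(force=True).get("prompt", "")
    response = ollama.chat(model="phi4", messages=[{"role": "user", "content": prompt}])
    return jsonify({"reply": response["message"]["content"]})

if __name__ == "__main__":
    app.run(port=5000)  # Flask's development server; use a WSGI server in production

You can test it with curl once it is running:

curl -X POST http://localhost:5000/chat -H "Content-Type: application/json" -d '{"prompt": "Hello"}'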

8. Recap

Chatting and developing with Phi4 via Ollama opens up exciting possibilities for locally hosted AI applications.

With a focus on privacy and performance, Ollama is an excellent choice for developers aiming to build innovative, AI-powered solutions.