What is Ollama?
Ollama is like your personal AI playground, right on your own computer. It lets you easily run large language models, the kind that power fancy AI tools, completely offline. Think of it as having your own private AI brain that can chat with you, write stories, or even help you brainstorm, without needing an internet connection.
Installation
- Go to the Ollama Website: Navigate to ollama.ai and click on the download button.
- Select Your Operating System: Choose the appropriate download for your OS (Windows, Linux, or macOS).
- Install: Once downloaded, double-click and follow the installation instructions.
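On Linux, you can also install Ollama entirely from the terminal. At the time of writing, the download page offers a one-line install script roughly like the following (check the site for the current command before running it):
curl -fsSL https://ollama.com/install.sh | sh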
Running Ollama
There are a couple of ways to run Ollama:
Desktop Application
- On Windows, use the search bar to find and open the Ollama app.
- On macOS, use Spotlight search to open it.
- Similarly, on Linux, use your system’s search.
Note: The desktop app runs Ollama as a background server, so no window opens when you launch it; look for the Ollama icon in your menu bar or system tray.
Command Line
- Open a terminal or command prompt.
- Type ollama. If the installation was successful, you will see a list of the available Ollama commands.
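To double-check the install, you can also print the version number:
ollama --version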
Working with Models
Accessing Models
You have access to many open-source models. You can find a list of models on the Ollama GitHub repository and the Ollama library. Keep in mind that models can be large, and you’ll need enough disk space and RAM to run them.
Checking Model Requirements
Refer to the model specifications for RAM requirements. For example, the Llama 3.1 model with 405 billion parameters needs hundreds of gigabytes of memory, far more than a typical consumer machine has, while small models in the 7 to 8 billion parameter range run comfortably on most modern laptops.
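Once a model has been downloaded, you can also inspect it from the command line. Depending on your Ollama version, ollama show prints a summary of the model (parameter count, quantization, context length) or requires a flag such as --modelfile; for example:
ollama show llama3.1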
Running a Model
To run a model, such as deepseek-r1, use the following command:
ollama run deepseek-r1
If the model is not installed, Ollama will download it for you. Once the download finishes (or if it’s already installed), a prompt will appear, and you can start typing to interact with it.
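If you just want to download a model without starting a chat session right away, you can pull it separately:
ollama pull deepseek-r1
The next ollama run for that model will then start immediately instead of waiting for the download.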
Exiting a Model
Type /bye to exit the model interaction.
Listing Installed Models
To see a list of models installed on your system, use:
ollama list
Switching Models
To switch between models, use the ollama run <model_name> command, specifying the model you wish to use.
Ollama HTTP API
Ollama provides an HTTP API on localhost, allowing you to interact with models via code.
Starting the Server
If you’re not using the desktop application, start the server with the command:
ollama serve
This will start the API on port 11434 by default.
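A quick way to confirm the server is up is to call the API from another terminal, for example with curl; the /api/tags endpoint returns the models installed locally:
curl http://localhost:11434/api/tags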
Interacting with the API
Python Example (Manual)
You can use Python’s requests library to interact with the API manually. Here’s an example of sending a request to the API and streaming the response:
# Install requests if you don't have it: pip install requests
import requests
import json

base_url = "http://localhost:11434/api/chat"
payload = {
    "model": "mistral",
    "messages": [
        {
            "role": "user",
            "content": "What is Python?",
        }
    ],
    "stream": True
}

response = requests.post(base_url, json=payload, stream=True)
for line in response.iter_lines():
    if line:
        try:
            json_data = json.loads(line)
            if 'message' in json_data and 'content' in json_data['message']:
                print(json_data['message']['content'], end="", flush=True)
        except json.JSONDecodeError:
            print(f"JSONDecodeError: {line}")
Python Example (Using the Ollama Package)
You can use the Ollama Python package for a simpler way to interact with the API:
# Install the ollama package: pip install ollama
from ollama import Client
client = Client(host='http://localhost:11434')
model = "mistral"
prompt = "What is Python?"
response = client.generate(model=model, prompt=prompt)
print(response['response'])
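The package also has a chat-style interface that mirrors the /api/chat endpoint used in the manual example; here is a minimal sketch, assuming the same local server and the mistral model:
from ollama import Client

# Connect to the local Ollama server
client = Client(host='http://localhost:11434')

response = client.chat(
    model="mistral",
    messages=[{"role": "user", "content": "What is Python?"}],
)
print(response['message']['content'])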
Model Customization
You can customize models using model files.
Creating a Model File
- Create a file (e.g., Modelfile) without any extension:
touch Modelfile
- Add the following syntax:
FROM llama3
PARAMETER temperature 0.8
SYSTEM You are Mario from Super Mario Bros. Answer as Mario the assistant only.
Creating a Custom Model
Use the command:
ollama create <new_model_name> -f <path_to_Modelfile>
For example:
ollama create mario -f Modelfile
Running the Custom Model
ollama run mario
Removing a Custom Model
ollama rm mario
Conclusion
Ollama offers a great way to run LLMs locally for free, ensuring privacy and security. You can interact with models through the command line or via code using the HTTP API. You also have the option to customize models using model files. This guide should have you up and running in no time.