ChatOllama

Ollama allows you to run open-source large language models, such as Llama 2, locally.

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.

It optimizes setup and configuration details, including GPU usage.

For a complete list of supported models and model variants, see the Ollama model library.

Overview

Integration details

Class | Package | Local | Serializable | JS support | Package downloads | Package latest
ChatOllama | langchain-ollama | | | | PyPI - Downloads | PyPI - Version

Model features

Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs

Setup

First, follow these instructions to set up and run a local Ollama instance:

  • Download and install Ollama on one of the supported platforms (including Windows Subsystem for Linux)
  • Fetch a model via ollama pull <name-of-model>
    • View a list of available models via the model library
    • e.g., ollama pull llama3
  • This downloads the default tagged version of the model. Typically, the default tag points to the latest model with the smallest parameter count.

On Mac, the models are downloaded to ~/.ollama/models

On Linux (or WSL), the models will be stored at /usr/share/ollama/.ollama/models
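
The default locations above can be captured in a small helper. This is only a sketch: default_models_dir is a hypothetical name, it encodes just the two paths stated above, and a customized installation may store models elsewhere.

```python
import platform
from pathlib import Path
from typing import Optional


def default_models_dir(system: Optional[str] = None) -> Optional[Path]:
    """Return the default model directory documented above, or None if unknown."""
    system = system or platform.system()
    if system == "Darwin":  # macOS
        return Path.home() / ".ollama" / "models"
    if system == "Linux":  # also covers WSL
        return Path("/usr/share/ollama/.ollama/models")
    return None  # other platforms are not covered by the text above


print(default_models_dir())
```

Passing the system name explicitly (for example default_models_dir("Linux")) makes the helper easy to check on any machine.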

  • To pull a specific version of a model, include its tag, e.g., ollama pull vicuna:13b-v1.5-16k-q4_0 (see the Vicuna page in the model library for the available tags)
  • To view all pulled models, use ollama list
  • To chat directly with a model from the command line, use ollama run <name-of-model>
  • See the Ollama documentation for more commands, or run ollama help in the terminal to list the available commands.
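
If you prefer to drive these CLI commands from Python, one possible approach is a thin subprocess wrapper. The helper names pull_command and run_ollama are hypothetical and not part of Ollama or LangChain; the sketch only assembles the commands listed above and runs them when the ollama executable is found on PATH.

```python
import shutil
import subprocess
from typing import List, Optional


def pull_command(model: str, tag: Optional[str] = None) -> List[str]:
    """Build the argument list for `ollama pull <model>[:<tag>]`."""
    name = f"{model}:{tag}" if tag else model
    return ["ollama", "pull", name]


def run_ollama(args: List[str]) -> Optional[str]:
    """Run an ollama CLI command and return its stdout.

    Returns None when the executable is not on PATH.
    """
    if shutil.which(args[0]) is None:
        return None
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


# Only the command lists are printed here; nothing is downloaded.
print(pull_command("llama3"))
print(pull_command("vicuna", "13b-v1.5-16k-q4_0"))
```

With Ollama installed and running, run_ollama(["ollama", "list"]) would return the same text that ollama list prints in the terminal.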

To enable automated tracing of your model calls, set your LangSmith API key:

# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
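
The two commented-out lines above need the os and getpass modules. A minimal sketch that wraps them in a function follows; enable_langsmith_tracing is a hypothetical helper name, and it only prompts when no key is passed in or already present in the environment.

```python
import getpass
import os
from typing import Optional


def enable_langsmith_tracing(api_key: Optional[str] = None) -> None:
    """Set the two environment variables shown above."""
    if api_key is None:
        # Reuse a key that is already set in the environment, if any.
        api_key = os.environ.get("LANGSMITH_API_KEY")
    if not api_key:
        # Fall back to an interactive prompt, as in the original snippet.
        api_key = getpass.getpass("Enter your LangSmith API key: ")
    os.environ["LANGSMITH_API_KEY"] = api_key
    os.environ["LANGSMITH_TRACING"] = "true"
```

Call enable_langsmith_tracing() before creating the model so that later calls pick up the settings.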

Installation

The LangChain Ollama integration lives in the langchain-ollama package:

%pip install -qU langchain-ollama

Make sure you're using the latest version of the ollama Python package for structured outputs. Update by running:

%pip install -U ollama

Instantiation

Now we can instantiate our model object and generate chat completions:

from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="llama3.1",
    temperature=0,
    # other params...
)

Invocation

from langchain_core.messages import AIMessage

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content='The translation of "I love programming" from English to French is:\n\n"J\'adore programmer."', response_metadata={'model': 'llama3.1', 'created_at': '2024-08-19T16:05:32.81965Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2167842917, 'load_duration': 54222584, 'prompt_eval_count': 35, 'prompt_eval_duration': 893007000, 'eval_count': 22, 'eval_duration': 1218962000}, id='run-0863daa2-43bf-4a43-86cc-611b23eae466-0', usage_metadata={'input_tokens': 35, 'output_tokens': 22, 'total_tokens': 57})
print(ai_msg.content)
The translation of "I love programming" from English to French is:

"J'adore programmer."
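
The response_metadata shown above includes timing fields. Ollama reports these durations in nanoseconds, so a small conversion makes them easier to read. The sketch below copies the numbers from the output above into a plain dict; summarize_timing is a hypothetical helper, and in real use you would pass ai_msg.response_metadata instead.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000

# Values copied from the response_metadata printed above.
response_metadata = {
    "total_duration": 2167842917,
    "load_duration": 54222584,
    "prompt_eval_count": 35,
    "prompt_eval_duration": 893007000,
    "eval_count": 22,
    "eval_duration": 1218962000,
}


def summarize_timing(metadata: dict) -> dict:
    """Convert nanosecond durations to seconds and derive output tokens per second."""
    eval_seconds = metadata["eval_duration"] / NANOSECONDS_PER_SECOND
    return {
        "total_seconds": metadata["total_duration"] / NANOSECONDS_PER_SECOND,
        "output_tokens_per_second": metadata["eval_count"] / eval_seconds,
    }


summary = summarize_timing(response_metadata)
print(f"total: {summary['total_seconds']:.2f} s")
print(f"generation speed: {summary['output_tokens_per_second']:.1f} tokens/s")
```

For the run above this works out to roughly 2.2 seconds in total and about 18 generated tokens per second.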

Chaining

We can chain our model with a prompt template like so:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)
AIMessage(content='Das Programmieren ist mir ein Leidenschaft! (That\'s "Programming is my passion!" in German.) Would you like me to translate anything else?', response_metadata={'model': 'llama3.1', 'created_at': '2024-08-19T16:05:34.893548Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2045997333, 'load_duration': 22584792, 'prompt_eval_count': 30, 'prompt_eval_duration': 213210000, 'eval_count': 32, 'eval_duration': 1808541000}, id='run-d18e1c6b-50e0-4b1d-b23a-973fa058edad-0', usage_metadata={'input_tokens': 30, 'output_tokens': 32, 'total_tokens': 62})
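
To make the first step of the chain more concrete, the sketch below shows the placeholder substitution in plain Python. It is a simplified illustration, not LangChain's implementation: render_messages is a hypothetical helper that fills each (role, text) pair with str.format, which is conceptually what happens before the messages reach the model.

```python
from typing import Dict, List, Tuple

# Same (role, template) pairs as in the prompt above.
TEMPLATE: List[Tuple[str, str]] = [
    (
        "system",
        "You are a helpful assistant that translates {input_language} to {output_language}.",
    ),
    ("human", "{input}"),
]


def render_messages(
    template: List[Tuple[str, str]], variables: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Fill every placeholder in each message with the given variables."""
    return [(role, text.format(**variables)) for role, text in template]


rendered = render_messages(
    TEMPLATE,
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    },
)
for role, text in rendered:
    print(f"{role}: {text}")
```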

Tool calling

We can use tool calling with an LLM that has been fine-tuned for tool use:

ollama pull llama3.1

Details on creating custom tools are available in this guide. Below, we demonstrate how to create a tool by applying the @tool decorator to a regular Python function.

from typing import List

from langchain_core.tools import tool
from langchain_ollama import ChatOllama


@tool
def validate_user(user_id: int, addresses: List[str]) -> bool:
    """Validate user using historical addresses.

    Args:
        user_id (int): the user ID.
        addresses (List[str]): Previous addresses as a list of strings.
    """
    return True


llm = ChatOllama(
    model="llama3.1",
    temperature=0,
).bind_tools([validate_user])

result = llm.invoke(
    "Could you validate user 123? They previously lived at "
    "123 Fake St in Boston MA and 234 Pretend Boulevard in "
    "Houston TX."
)
result.tool_calls
[{'name': 'validate_user',
  'args': {'addresses': '["123 Fake St, Boston, MA", "234 Pretend Boulevard, Houston, TX"]',
   'user_id': '123'},
  'id': '40fe3de0-500c-4b91-9616-5932a929e640',
  'type': 'tool_call'}]
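
In the output above, both arguments arrived as strings: user_id is '123' and addresses is a JSON-encoded list. Before calling the underlying function you may want to convert them to the declared types. The sketch below is one way to do that in plain Python; coerce_args and the TOOLS registry are hypothetical names, and validate_user is redefined here without the @tool decorator so the example runs on its own.

```python
import json
from typing import List


def validate_user(user_id: int, addresses: List[str]) -> bool:
    """Plain-Python stand-in for the tool defined above."""
    return True


# Tool call copied from the output above.
tool_call = {
    "name": "validate_user",
    "args": {
        "addresses": '["123 Fake St, Boston, MA", "234 Pretend Boulevard, Houston, TX"]',
        "user_id": "123",
    },
    "id": "40fe3de0-500c-4b91-9616-5932a929e640",
    "type": "tool_call",
}


def coerce_args(args: dict) -> dict:
    """Convert string-encoded arguments to the types validate_user declares."""
    addresses = args["addresses"]
    if isinstance(addresses, str):
        addresses = json.loads(addresses)  # JSON string -> list of strings
    return {"user_id": int(args["user_id"]), "addresses": addresses}


TOOLS = {"validate_user": validate_user}

clean_args = coerce_args(tool_call["args"])
outcome = TOOLS[tool_call["name"]](**clean_args)
print(clean_args)
print(outcome)
```

Looking the function up by name in a dict keeps the dispatch step the same if more tools are bound later.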

Multi-modal

Ollama has support for multi-modal LLMs, such as bakllava and llava.

ollama pull bakllava

Be sure to update Ollama to the most recent version so that multi-modal input is supported.

import base64
from io import BytesIO

from IPython.display import HTML, display
from PIL import Image


def convert_to_base64(pil_image):
    """
    Convert a PIL image to a Base64-encoded string

    :param pil_image: PIL image
    :return: Base64-encoded string
    """

    buffered = BytesIO()
    pil_image.save(buffered, format="JPEG")  # You can change the format if needed
    img_str = base64.b64encode(buffered.getvalue()).decode("utf-8")
    return img_str


def plt_img_base64(img_base64):
    """
    Display a Base64-encoded string as an image

    :param img_base64: Base64 string
    """
    # Create an HTML img tag with the base64 string as the source
    image_html = f'<img src="data:image/jpeg;base64,{img_base64}" />'
    # Display the image by rendering the HTML
    display(HTML(image_html))


file_path = "../../../static/img/ollama_example_img.jpg"
pil_image = Image.open(file_path)

image_b64 = convert_to_base64(pil_image)
plt_img_base64(image_b64)
from langchain_core.messages import HumanMessage
from langchain_ollama import ChatOllama

llm = ChatOllama(model="bakllava", temperature=0)


def prompt_func(data):
    text = data["text"]
    image = data["image"]

    image_part = {
        "type": "image_url",
        "image_url": f"data:image/jpeg;base64,{image}",
    }

    content_parts = []

    text_part = {"type": "text", "text": text}

    content_parts.append(image_part)
    content_parts.append(text_part)

    return [HumanMessage(content=content_parts)]


from langchain_core.output_parsers import StrOutputParser

chain = prompt_func | llm | StrOutputParser()

query_chain = chain.invoke(
    {"text": "What is the Dollar-based gross retention rate?", "image": image_b64}
)

print(query_chain)
90%
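
When the image is already available as raw bytes (for example, read from a JPEG file), the same content structure can be built without PIL. The sketch below uses only the standard library; image_bytes_to_data_url and build_content_parts are hypothetical helper names, and the byte string is a placeholder rather than a real image.

```python
import base64
from typing import List


def image_bytes_to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL, as used in prompt_func above."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def build_content_parts(text: str, data_url: str) -> List[dict]:
    """Return the image part followed by the text part, mirroring prompt_func."""
    return [
        {"type": "image_url", "image_url": data_url},
        {"type": "text", "text": text},
    ]


# Placeholder bytes for illustration only; use the contents of a real file in practice.
fake_jpeg = b"\xff\xd8\xff\xe0 placeholder bytes, not a real image"
parts = build_content_parts(
    "What is the Dollar-based gross retention rate?",
    image_bytes_to_data_url(fake_jpeg),
)
print(parts[0]["image_url"][:30])
print(parts[1])
```

The resulting list can be passed as the content of a HumanMessage, in the same way that prompt_func does.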

Reasoning models and custom message roles

Some models, such as IBM's Granite 3.2, support custom message roles to enable thinking processes.

To access Granite 3.2's thinking features, pass a message with a "control" role with content set to "thinking". Because "control" is a non-standard message role, we can use a ChatMessage object to implement it:

from langchain_core.messages import ChatMessage, HumanMessage
from langchain_ollama import ChatOllama

llm = ChatOllama(model="granite3.2:8b")

messages = [
    ChatMessage(role="control", content="thinking"),
    HumanMessage("What is 3^3?"),
]

response = llm.invoke(messages)
print(response.content)
Here is my thought process:
This question is asking for the result of 3 raised to the power of 3, which is a basic mathematical operation.

Here is my response:
The expression 3^3 means 3 raised to the power of 3. To calculate this, you multiply the base number (3) by itself as many times as its exponent (3):

3 * 3 * 3 = 27

So, 3^3 equals 27.
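
The output above separates the reasoning from the answer with two marker sentences. If you want to handle them separately, a simple string split may be enough. This is a sketch that assumes the markers appear exactly as in the output above, which is not a guaranteed format; split_thinking is a hypothetical helper.

```python
from typing import Optional, Tuple

THOUGHT_MARKER = "Here is my thought process:"
RESPONSE_MARKER = "Here is my response:"


def split_thinking(text: str) -> Tuple[Optional[str], str]:
    """Return (thought, response); thought is None when the markers are missing."""
    _, found_thought, rest = text.partition(THOUGHT_MARKER)
    thought, found_response, response = rest.partition(RESPONSE_MARKER)
    if not (found_thought and found_response):
        return None, text.strip()
    return thought.strip(), response.strip()


# Shortened copy of the output above.
sample = (
    "Here is my thought process:\n"
    "This question is asking for the result of 3 raised to the power of 3, "
    "which is a basic mathematical operation.\n\n"
    "Here is my response:\n"
    "3 * 3 * 3 = 27\n\n"
    "So, 3^3 equals 27."
)

thought, answer = split_thinking(sample)
print("THOUGHT:", thought)
print("ANSWER:", answer)
```

In real use, pass response.content instead of the sample string.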

Qwen3 Extended Thinking Mode (Optional)

The Qwen3 model supports an "extended thinking" mode, which generates intermediate reasoning steps, often wrapped in <think> tags.

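The example below passes model_kwargs={"think": False} when initializing the ChatOllama object in an attempt to disable this behavior. Note that this example imports ChatOllama from langchain_community rather than from langchain_ollama.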

from langchain_community.chat_models import ChatOllama

llm = ChatOllama(model="qwen3:8b", model_kwargs={"think": False})

response = llm.invoke("Why do cats purr?")
print(response)
content="<think>\nOkay, the user is asking why cats purr. Let me start by recalling what I know about cat purring. I remember that purring is a common behavior in cats, but the exact reasons might be multifaceted.\n\nFirst, I think about the physiological aspect. Cats have a structure called the larynx, which vibrates when they exhale. The diaphragm and muscles in the throat might work together to create the purring sound. This is different from other sounds cats make, like meowing, which is more for communication.\n\nThen there's the behavioral aspect. Purring is often associated with contentment, like when a cat is being petted or is in a comfortable environment. But I also recall that cats might purr when they're in pain or stressed. So the reasons aren't just about happiness. Maybe it's a self-soothing mechanism?\n\nI should also consider the evolutionary angle. Some theories suggest that purring has a healing effect. The vibrations might have therapeutic properties, helping with bone healing or reducing stress. This could be an adaptation that helped cats survive in the wild, maybe by promoting recovery from injuries.\n\nAnother point is the social aspect. Purring might serve as a way to communicate with other cats or humans. For example, a mother cat might purr to comfort her kittens. Or a cat might purr to show submission to a more dominant cat.\n\nWait, I need to make sure I'm not mixing up different behaviors. Also, some sources say that purring is a way to bond or show affection. But it's important to distinguish between different contexts—like when they're happy versus when they're in pain.\n\nI should also mention the scientific research. Studies have shown that the frequency of purring (around 25-150 Hz) might have health benefits, such as promoting tissue repair and reducing inflammation. 
This could explain why cats purr even when they're not happy, as a way to aid their own recovery.\n\nBut I need to present all these points clearly without confusing the user. Maybe start with the physiological mechanism, then move to the behavioral and emotional reasons, and then touch on the evolutionary and health aspects. Also, clarify that purring isn't always a sign of happiness, as it can also be a coping mechanism.\n\nWait, are there any other theories? I think some researchers believe that purring is a way to communicate with humans, showing that the cat is in a good mood. But again, the context matters. The user might be interested in both the positive and the negative reasons.\n\nI should structure the answer to cover the main points: physiological process, emotional states (contentment, stress, pain), social communication, evolutionary advantages, and possible health benefits. Make sure to explain that purring isn't a single behavior but can have multiple functions depending on the situation.\n</think>\n\nCats purr for a variety of reasons, and the behavior is more complex than it might seem. Here's a breakdown of the key factors:\n\n### 1. **Physiological Mechanism**  \n   - **Vibrations from the Larynx and Diaphragm**: Purring is produced when a cat inhales and exhales, causing the larynx (voice box) and diaphragm to vibrate. This creates the characteristic rhythmic sound, typically ranging from 25 to 150 vibrations per second. The exact mechanism is still studied, but it involves coordinated muscle movements in the throat and respiratory system.\n\n### 2. **Emotional and Behavioral Contexts**  \n   - **Contentment**: Purring is often associated with relaxation, such as when a cat is being petted, fed, or resting comfortably.  \n   - **Stress or Pain**: Cats may purr when they’re anxious, fearful, or in discomfort. This could be a self-soothing behavior or a way to signal to humans or other animals that they’re not a threat.  
\n   - **Communication**: Purring can serve as a social signal. For example, mother cats may purr to comfort kittens, and cats might purr to show submission to a dominant individual.\n\n### 3. **Evolutionary and Survival Benefits**  \n   - **Healing Properties**: Research suggests that the vibrations from purring (at frequencies of 25–150 Hz) might promote tissue repair, reduce inflammation, and aid in bone healing. This could have evolutionary advantages, helping cats recover from injuries or illnesses.  \n   - **Bonding**: Purring may reinforce social bonds between cats and humans or other cats, fostering trust and comfort.\n\n### 4. **Context Matters**  \n   - **Not Always a Sign of Happiness**: While purring is often linked to contentment, it can also occur in stressful or painful situations. For example, a cat in pain might purr to cope with discomfort, similar to how humans might hum or sigh when stressed.\n\n### 5. **Variability in Purring**  \n   - **Individual Differences**: Cats may purr differently based on their personality, breed, or health. Some cats purr loudly, while others do so softly or intermittently.\n\nIn summary, purring serves multiple purposes—physiological, emotional, and social—and its meaning can vary depending on the context. It’s a fascinating behavior that reflects both the cat’s state of mind and its evolutionary adaptations. 🐾" additional_kwargs={} response_metadata={'model': 'qwen3:8b', 'created_at': '2025-06-23T19:04:06.4536071Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 282532581900, 'load_duration': 46711200, 'prompt_eval_count': 14, 'prompt_eval_duration': 215192400, 'eval_count': 1101, 'eval_duration': 282269412200} id='run--2c195ff4-7390-45ea-9529-158325789620-0'

Note that the output above still includes the model's thought process (inside the <think> tags) in addition to its final response.
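
When the content contains a <think> block, as in the output above, the reasoning can be separated from the final answer after the fact. The sketch below does this with a regular expression; split_think_tags is a hypothetical helper, and it assumes the tags are well-formed and not nested.

```python
import re
from typing import Tuple

# Non-greedy match so that several separate <think> blocks are handled one by one.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think_tags(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer) with the <think> blocks removed from the answer."""
    reasoning = "\n".join(block.strip() for block in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


# Shortened copy of the content shown above.
sample = (
    "<think>\nOkay, the user is asking why cats purr.\n</think>\n\n"
    "Cats purr for a variety of reasons."
)

reasoning, answer = split_think_tags(sample)
print("REASONING:", reasoning)
print("ANSWER:", answer)
```

In real use, pass response.content instead of the sample string.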

API reference

For detailed documentation of all ChatOllama features and configurations, head to the API reference: https://python.langchain.com/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html

