Ollama now ships with an endpoint compatible with the OpenAI Chat Completions API, so local Ollama models can be accessed with the OpenAI SDK. This article walks through the whole workflow.

Assuming Ollama is already installed; for the installation steps, see

Running deepseek r1 with Ollama on a Mac M1 (CSDN blog)

1 Install the OpenAI SDK and pull the models

pip install openai

ollama pull qwen3:4b # chat model

ollama pull bge-m3:latest # embedding model
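To confirm the pulls succeeded, the available models can be listed through the OpenAI-compatible endpoint. A minimal sketch, assuming Ollama is serving on its default port 11434 and that the compatibility layer exposes /v1/models:

from openai import OpenAI

# Point the SDK at the local Ollama endpoint; no real API key is needed.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# /v1/models lists locally available models, e.g. qwen3:4b and bge-m3:latest.
for model in client.models.list():
    print(model.id)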

2 Development and test examples

Ollama currently supports OpenAI-style access to chat, embeddings, and other LLM endpoints, and also supports transparent access configured via environment variables.

1) Chat API

Python chat example

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama"  # no API key is set locally, so any value works
)
response = client.chat.completions.create(
    model="qwen3:4b",  # a small 4b model suits the Mac M1's limited compute
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The LA Dodgers won in 2020."},
        {"role": "user", "content": "Where was it played?"}
      ],
    temperature=0.7,  
    max_tokens=512  
)
print(response.choices[0].message.content)
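The same endpoint can also stream the reply token by token, which is more responsive for long generations. A minimal sketch, reusing the client created above:

# Request a streamed response instead of waiting for the full completion.
stream = client.chat.completions.create(
    model="qwen3:4b",
    messages=[{"role": "user", "content": "Briefly explain what an LLM is."}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental piece of the assistant's reply.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()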

curl chat example

curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen3:4b",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'

2) Embedding API

Python embedding example

import torch
from openai import OpenAI
client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama"
)
texts = ["你好,什么是大模型?", "大模型是什么", "告诉我什么是大模型"]
def impl(texts):
    response = client.embeddings.create(model="bge-m3:latest", input=texts)
    embeddings = [e.embedding for e in response.data]
    return torch.tensor(embeddings)
embeddings = impl(texts)
print(embeddings.shape)
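Since the three texts are paraphrases of the same question, their embeddings should be close. A minimal sketch that checks this with pairwise cosine similarity, building on the embeddings tensor computed above:

import torch.nn.functional as F

# L2-normalize the rows so that dot products equal cosine similarities.
normalized = F.normalize(embeddings, p=2, dim=1)
similarity = normalized @ normalized.T  # 3x3 pairwise similarity matrix
print(similarity)  # off-diagonal entries should be high for paraphrases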

3) Environment variables

The OpenAI SDK supports setting the deployment address and token via environment variables, which enables transparent access to Ollama models without changing the client code.

An example program follows.

import os
os.environ['OPENAI_API_KEY'] = "ollama"  # token; any value works locally
os.environ['OPENAI_BASE_URL'] = "http://localhost:11434/v1"  # deployment address

from openai import OpenAI

client = OpenAI()  # picks up the address and token from the environment
response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The LA Dodgers won in 2020."},
        {"role": "user", "content": "Where was it played?"}
      ],
    temperature=0.7,  
    max_tokens=512  
)
print(response.choices[0].message.content)

3 Accessing models via Ollama's native API

See Running deepseek r1 with Ollama on a Mac M1 (CSDN blog).
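Beyond the OpenAI-compatible layer, Ollama also serves its own REST API on the same port. A minimal non-streaming sketch against the native /api/chat endpoint, assuming the requests package is installed (pip install requests):

import requests

# Ollama's native chat endpoint; "stream": False returns a single JSON object.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen3:4b",
        "messages": [{"role": "user", "content": "Hello!"}],
        "stream": False,
    },
)
resp.raise_for_status()
print(resp.json()["message"]["content"])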

References

---

OpenAI compatibility

https://ollama.ac.cn/blog/openai-compatibility

Ollama provides an OpenAI-compatible API

https://www.xiexianbin.cn/ai/ollama/api-compatible-openai/index.html

Running deepseek r1 with Ollama on a Mac M1

https://blog.csdn.net/liliang199/article/details/149267372

Setting OpenAI's api_key and base_url: parameters & environment variables

https://blog.csdn.net/liliang199/article/details/153040871
