Ollama

North America Est. 2023 Free and open source (MIT + Apache licences).

local-inferenceopen-sourcemodel-library

Run large language models locally on Mac, Linux, or Windows with a single command.

What They Do

Ollama bundles model weights, quantisation, and a local REST server into a single desktop application. Users run `ollama pull llama3` and immediately get a local API on localhost:11434 compatible with the OpenAI chat format.

Mission

Make running powerful AI models locally as easy as running any other software.

Available Models

Model	Family	Context	Input /M	Output /M
cogito-2.1	cogito-2.1	—	—	—
deepseek-v3.1	deepseek-v3.1	—	—	—
deepseek-v3.2	deepseek-v3.2	—	—	—
deepseek-v4-flash	deepseek-v4-flash	—	—	—
deepseek-v4-pro	deepseek-v4-pro	—	—	—
devstral-2	devstral-2	—	—	—
devstral-small-2	devstral-small-2	—	—	—
gemini-3-flash-preview	gemini-3-flash-preview	—	—	—
gemma3	gemma3	—	—	—
gemma4	gemma4	—	—	—
glm-4.6	glm-4.6	—	—	—
glm-4.7	glm-4.7	—	—	—
glm-5	glm-5	—	—	—
glm-5.1	glm-5.1	—	—	—
glm-5.2	glm-5.2	—	—	—
gpt-oss	gpt-oss	—	—	—
kimi-k2	kimi-k2	—	—	—
kimi-k2-thinking	kimi-k2-thinking	—	—	—
kimi-k2.5	kimi-k2.5	—	—	—
kimi-k2.6	kimi-k2.6	—	—	—
kimi-k2.7-code	kimi-k2.7-code	—	—	—
minimax-m2	minimax-m2	—	—	—
minimax-m2.1	minimax-m2.1	—	—	—
minimax-m2.5	minimax-m2.5	—	—	—
minimax-m2.7	minimax-m2.7	—	—	—
minimax-m3	minimax-m3	—	—	—
ministral-3	ministral-3	—	—	—
mistral-large-3	mistral-large-3	—	—	—
nemotron-3-nano	nemotron-3-nano	—	—	—
nemotron-3-super	nemotron-3-super	—	—	—
nemotron-3-ultra	nemotron-3-ultra	—	—	—
qwen3-coder	qwen3-coder	—	—	—
qwen3-coder-next	qwen3-coder-next	—	—	—
qwen3-next	qwen3-next	—	—	—
qwen3-vl	qwen3-vl	—	—	—
qwen3.5	qwen3.5	—	—	—
rnj-1	rnj-1	—	—	—

FAQ

: Yes. Ollama itself is open source under MIT licence. The model weights it downloads are subject to each model's individual licence.
: Yes. Since Ollama v0.1.24 the /api/chat endpoint accepts the same JSON format as OpenAI's chat.completions, making it a drop-in local replacement for most apps.