Gene Library Courses Download Pricing Contact Sign in
Together AI

Together AI

North America Est. 2022 Pay-per-token for inference and fine-tuning; hourly for dedicated GPU clusters.
inference-providerfine-tuningopen-sourcecloud

AI-native cloud providing fast inference, fine-tuning, and training for open-source foundation models.

What They Do

Together AI operates a full-stack platform for production AI: serverless inference, dedicated GPU clusters, fine-tuning pipelines, and custom model evaluations. Founded by academic researchers (including FlashAttention co-author Tri Dao).

Mission

Build the AI-native cloud that makes open-source model research and production accessible to everyone.

Available Models

ModelFamilyContextInput /MOutput /M
BAAI/bge-base-en-v1.5
ByteDance-Seed/Seedream-3.0
ByteDance-Seed/Seedream-4.0
ByteDance/Seedance-1.0-lite
ByteDance/Seedance-1.0-pro
ByteDance/Seedance-2.0
ByteDance/Seedream-5.0-lite
Hcompany/Holo3-35B-A3B
HiDream-ai/HiDream-I1-Dev
HiDream-ai/HiDream-I1-Fast
HiDream-ai/HiDream-I1-Full
LiquidAI/LFM2-24B-A2B
Lykon/DreamShaper
MiniMaxAI/MiniMax-M1-40k
MiniMaxAI/MiniMax-M1-80k
MiniMaxAI/MiniMax-M2
MiniMaxAI/MiniMax-M2.5-FP4
MiniMaxAI/MiniMax-M2.7
NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO
Qwen/QwQ-32B
Qwen/Qwen-Image
Qwen/Qwen-Image-2.0
Qwen/Qwen-Image-2.0-Pro
Qwen/Qwen2-1.5B
Qwen/Qwen2-1.5B-Instruct
Qwen/Qwen2-72B
Qwen/Qwen2-72B-Instruct
Qwen/Qwen2-7B
Qwen/Qwen2-VL-72B-Instruct
Qwen/Qwen2.5-1.5B
Qwen/Qwen2.5-1.5B-Instruct
Qwen/Qwen2.5-14B
Qwen/Qwen2.5-14B-Instruct
Qwen/Qwen2.5-32B
Qwen/Qwen2.5-32B-Instruct
Qwen/Qwen2.5-3B-Instruct
Qwen/Qwen2.5-72B
Qwen/Qwen2.5-72B-Instruct
Qwen/Qwen2.5-72B-Instruct-Turbo
Qwen/Qwen2.5-7B
Qwen/Qwen2.5-7B-Instruct
Qwen/Qwen2.5-7B-Instruct-Turbo
Qwen/Qwen2.5-Coder-32B-Instruct
Qwen/Qwen2.5-VL-72B-Instruct
Qwen/Qwen3-0.6B
Qwen/Qwen3-0.6B-Base
Qwen/Qwen3-1.7B
Qwen/Qwen3-1.7B-Base
Qwen/Qwen3-14B
Qwen/Qwen3-14B-Base
Qwen/Qwen3-235B-A22B-Instruct-2507-FP8
Qwen/Qwen3-235B-A22B-Instruct-2507-tput
Qwen/Qwen3-30B-A3B
Qwen/Qwen3-30B-A3B-Base
Qwen/Qwen3-30B-A3B-Instruct-2507-Lora
Qwen/Qwen3-32B
Qwen/Qwen3-4B-Base
Qwen/Qwen3-4B-Instruct-2507
Qwen/Qwen3-8B
Qwen/Qwen3-8B-Base
Qwen/Qwen3-8B-Lora
Qwen/Qwen3-Coder-30B-A3B-Instruct
Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8
Qwen/Qwen3-Coder-Next-FP8
Qwen/Qwen3-Next-80B-A3B-Instruct
Qwen/Qwen3-Next-80B-A3B-Instruct-FP8
Qwen/Qwen3-Next-80B-A3B-Thinking
Qwen/Qwen3-VL-235B-A22B-Instruct-FP8
Qwen/Qwen3-VL-32B-Instruct
Qwen/Qwen3-VL-8B-Instruct
Qwen/Qwen3.5-122B-A10B-FP8
Qwen/Qwen3.5-35B-A3B
Qwen/Qwen3.5-397B-A17B
Qwen/Qwen3.5-9B
Qwen/Qwen3.5-9B-FP8
Qwen/Qwen3.6-35B-A3B-FP8
Qwen/Qwen3.6-Plus
Qwen/Qwen3.7-Max
RunDiffusion/Juggernaut-pro-flux
Rundiffusion/Juggernaut-Lightning-Flux
Salesforce/Llama-Rank-V1
Wan-AI/Wan2.2-I2V-A14B
Wan-AI/Wan2.2-T2V-A14B
Wan-AI/Wan2.6-image
Wan-AI/wan2.7-i2v
Wan-AI/wan2.7-r2v
Wan-AI/wan2.7-t2v
agentica-org/DeepCoder-14B-Preview
alibaba/happyhorse-1.0-i2v
alibaba/happyhorse-1.0-r2v
alibaba/happyhorse-1.0-t2v
allenai/Molmo-7B-D-0924
arcee-ai/trinity-mini
arize-ai/qwen-2-1.5b-instruct
black-forest-labs/FLUX.1-kontext-max
black-forest-labs/FLUX.1-kontext-pro
black-forest-labs/FLUX.1-schnell
black-forest-labs/FLUX.1.1-pro
black-forest-labs/FLUX.2-dev
black-forest-labs/FLUX.2-flex
black-forest-labs/FLUX.2-max
black-forest-labs/FLUX.2-pro
canopylabs/orpheus-3b-0.1-ft
cartesia/sonic
cartesia/sonic-2
cartesia/sonic-3
deepcogito/cogito-v1-preview-llama-70B
deepcogito/cogito-v1-preview-llama-70B-Turbo
deepcogito/cogito-v1-preview-llama-8B
deepcogito/cogito-v1-preview-qwen-14B
deepcogito/cogito-v1-preview-qwen-32B
deepcogito/cogito-v2-1-671b
deepgram/aura-2
deepgram/flux
deepgram/nova-3-en
deepgram/nova-3-multi
deepseek-ai/DeepSeek-OCR-2
deepseek-ai/DeepSeek-R1-0528
deepseek-ai/DeepSeek-R1-Distill-Llama-70B
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
deepseek-ai/DeepSeek-V3.1
deepseek-ai/DeepSeek-V4-Pro
deepseek-ai/deepseek-coder-33b-instruct
essentialai/rnj-1-instruct
facebook/cwm
google/flash-image-2.5
google/flash-image-3.1
google/gemini-3-pro-image
google/gemma-2-27b-it
google/gemma-2-9b-it
google/gemma-2b-it
google/gemma-3-1b-it
google/gemma-3-1b-pt
google/gemma-3-270m-it
google/gemma-3-270m-it-lora
google/gemma-3-27b-it
google/gemma-3-27b-it-lora
google/gemma-3-27b-pt
google/gemma-3-4b-it
google/gemma-3n-E4B-it
google/gemma-4-26B-A4B-it
google/gemma-4-31B-it
google/gemma-4-31B-it-lora
google/gemma-4-E2B-it
google/gemma-4-E4B-it
google/imagen-4.0-fast
google/imagen-4.0-preview
google/imagen-4.0-ultra
google/medgemma-27b-text-it
google/veo-2.0
google/veo-3.0
google/veo-3.0-audio
google/veo-3.0-fast
google/veo-3.0-fast-audio
google/veo-3.1
google/veo-3.1-lite
google/veo-3.1-test-debug
hexgrad/Kokoro-82M
ideogram/ideogram-3.0
ideogram/ideogram-4.0
intfloat/multilingual-e5-large-instruct
kwaivgI/kling-1.6-pro
kwaivgI/kling-1.6-standard
kwaivgI/kling-2.0-master
kwaivgI/kling-2.1-master
kwaivgI/kling-2.1-pro
kwaivgI/kling-2.1-standard
meta-llama/Llama-2-7b-chat-hf
meta-llama/Llama-3-8b-chat-hf
meta-llama/Llama-3.1-405B
meta-llama/Llama-3.1-405B-Instruct
meta-llama/Llama-3.2-1B
meta-llama/Llama-3.2-1B-Instruct
meta-llama/Llama-3.2-3B
meta-llama/Llama-3.2-3B-Instruct
meta-llama/Llama-3.3-70B-Instruct
meta-llama/Llama-3.3-70B-Instruct-FP8-Lora
meta-llama/Llama-3.3-70B-Instruct-Turbo
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP4
meta-llama/Llama-4-Scout-17B-16E
meta-llama/Llama-4-Scout-17B-16E-Instruct
meta-llama/Llama-4-Scout-17B-16E-Instruct-FP8-Lora
meta-llama/Llama-Guard-4-12B
meta-llama/Meta-Llama-3-70B-Instruct-Turbo
meta-llama/Meta-Llama-3-8B-Instruct
meta-llama/Meta-Llama-3-8B-Instruct-Lite
meta-llama/Meta-Llama-3.1-70B
meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
meta-llama/Meta-Llama-3.1-8B
meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
minimax/hailuo-02
minimax/speech-2.6-turbo
minimax/speech-2.8-turbo
minimax/video-01-director
mistralai/Devstral-Small-2505
mistralai/Magistral-Small-2506
mistralai/Ministral-3-14B-Instruct-2512
mistralai/Mistral-7B-Instruct-v0.1
mistralai/Mistral-7B-Instruct-v0.3
mistralai/Mistral-7B-v0.1
mistralai/Mistral-Small-24B-Instruct-2501
mistralai/Mixtral-8x22B-Instruct-v0.1
mistralai/Mixtral-8x7B-Instruct-v0.1
mistralai/Mixtral-8x7B-Instruct-v0.1-FP8-Lora
mistralai/Mixtral-8x7B-v0.1
mixedbread-ai/mxbai-rerank-large-v2
moonshotai/Kimi-K2.5-fp4
moonshotai/Kimi-K2.6
nim/meta/llama-3.1-70b-instruct
nim/meta/llama-3.1-8b-instruct
nim/meta/llama-3.2-11b-vision-instruct
nim/meta/llama-3.2-90b-vision-instruct
nim/meta/llama-3.3-70b-instruct
nim/mistralai/mixtral-8x22b-instruct-v01
nim/mistralai/mixtral-8x7b-instruct-v01
nim/nv-mistralai/mistral-nemo-12b-instruct
nim/nvidia/llama-3.1-nemotron-70b-instruct
nim/nvidia/llama-3.3-nemotron-super-49b-v1
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8
nvidia/NVIDIA-Nemotron-Nano-9B-v2
nvidia/nemotron-3-asr-streaming-0.6b
nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8
nvidia/nemotron-3-ultra-550b-a55b
nvidia/nemotron-3.5-asr-streaming-0.6b
nvidia/parakeet-tdt-0.6b-v3
openai/gpt-image-1.5
openai/gpt-oss-120b
openai/gpt-oss-20b
openai/sora-2
openai/sora-2-pro
openai/whisper-large-v3
pearl-ai/gemma-4-31b-it
pixverse/pixverse-v5
pixverse/pixverse-v5.6
pixverse/pixverse-v6
rime-labs/rime-arcana-v2
rime-labs/rime-arcana-v3
rime-labs/rime-arcana-v3-turbo
rime-labs/rime-mist-v2
rime-labs/rime-mist-v3
rime-labs/rime-mist-v3-omni
sarvamai/sarvam-m
stabilityai/stable-diffusion-3-medium
stabilityai/stable-diffusion-xl-base-1.0
togethercomputer/EssentialAI-RNJ-1-Instruct
togethercomputer/meta-llama-3.1-8B-Instruct-AWQ-INT4
vidu/vidu-2.0
vidu/vidu-q1
vidu/vidu-q3
vidu/vidu-q3-turbo
zai-org/GLM-4.5-Air-FP8
zai-org/GLM-4.5V
zai-org/GLM-4.6
zai-org/GLM-4.7
zai-org/GLM-4.7-FP8
zai-org/GLM-4.7-fp4
zai-org/GLM-5
zai-org/GLM-5-FP4
zai-org/GLM-5.1
zai-org/GLM-OCR

FAQ