AI-native cloud providing fast inference, fine-tuning, and training for open-source foundation models.
What They Do
Together AI operates a full-stack platform for production AI: serverless inference, dedicated GPU clusters, fine-tuning pipelines, and custom model evaluations. Founded by academic researchers (including FlashAttention co-author Tri Dao).
Mission
Build the AI-native cloud that makes open-source model research and production accessible to everyone.
Available Models
| Model | Family | Context | Input /M | Output /M |
|---|---|---|---|---|
| BAAI/bge-base-en-v1.5 | — | — | — | |
| ByteDance-Seed/Seedream-3.0 | — | — | — | |
| ByteDance-Seed/Seedream-4.0 | — | — | — | |
| ByteDance/Seedance-1.0-lite | — | — | — | |
| ByteDance/Seedance-1.0-pro | — | — | — | |
| ByteDance/Seedance-2.0 | — | — | — | |
| ByteDance/Seedream-5.0-lite | — | — | — | |
| Hcompany/Holo3-35B-A3B | — | — | — | |
| HiDream-ai/HiDream-I1-Dev | — | — | — | |
| HiDream-ai/HiDream-I1-Fast | — | — | — | |
| HiDream-ai/HiDream-I1-Full | — | — | — | |
| LiquidAI/LFM2-24B-A2B | — | — | — | |
| Lykon/DreamShaper | — | — | — | |
| MiniMaxAI/MiniMax-M1-40k | — | — | — | |
| MiniMaxAI/MiniMax-M1-80k | — | — | — | |
| MiniMaxAI/MiniMax-M2 | — | — | — | |
| MiniMaxAI/MiniMax-M2.5-FP4 | — | — | — | |
| MiniMaxAI/MiniMax-M2.7 | — | — | — | |
| NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO | — | — | — | |
| Qwen/QwQ-32B | — | — | — | |
| Qwen/Qwen-Image | — | — | — | |
| Qwen/Qwen-Image-2.0 | — | — | — | |
| Qwen/Qwen-Image-2.0-Pro | — | — | — | |
| Qwen/Qwen2-1.5B | — | — | — | |
| Qwen/Qwen2-1.5B-Instruct | — | — | — | |
| Qwen/Qwen2-72B | — | — | — | |
| Qwen/Qwen2-72B-Instruct | — | — | — | |
| Qwen/Qwen2-7B | — | — | — | |
| Qwen/Qwen2-VL-72B-Instruct | — | — | — | |
| Qwen/Qwen2.5-1.5B | — | — | — | |
| Qwen/Qwen2.5-1.5B-Instruct | — | — | — | |
| Qwen/Qwen2.5-14B | — | — | — | |
| Qwen/Qwen2.5-14B-Instruct | — | — | — | |
| Qwen/Qwen2.5-32B | — | — | — | |
| Qwen/Qwen2.5-32B-Instruct | — | — | — | |
| Qwen/Qwen2.5-3B-Instruct | — | — | — | |
| Qwen/Qwen2.5-72B | — | — | — | |
| Qwen/Qwen2.5-72B-Instruct | — | — | — | |
| Qwen/Qwen2.5-72B-Instruct-Turbo | — | — | — | |
| Qwen/Qwen2.5-7B | — | — | — | |
| Qwen/Qwen2.5-7B-Instruct | — | — | — | |
| Qwen/Qwen2.5-7B-Instruct-Turbo | — | — | — | |
| Qwen/Qwen2.5-Coder-32B-Instruct | — | — | — | |
| Qwen/Qwen2.5-VL-72B-Instruct | — | — | — | |
| Qwen/Qwen3-0.6B | — | — | — | |
| Qwen/Qwen3-0.6B-Base | — | — | — | |
| Qwen/Qwen3-1.7B | — | — | — | |
| Qwen/Qwen3-1.7B-Base | — | — | — | |
| Qwen/Qwen3-14B | — | — | — | |
| Qwen/Qwen3-14B-Base | — | — | — | |
| Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 | — | — | — | |
| Qwen/Qwen3-235B-A22B-Instruct-2507-tput | — | — | — | |
| Qwen/Qwen3-30B-A3B | — | — | — | |
| Qwen/Qwen3-30B-A3B-Base | — | — | — | |
| Qwen/Qwen3-30B-A3B-Instruct-2507-Lora | — | — | — | |
| Qwen/Qwen3-32B | — | — | — | |
| Qwen/Qwen3-4B-Base | — | — | — | |
| Qwen/Qwen3-4B-Instruct-2507 | — | — | — | |
| Qwen/Qwen3-8B | — | — | — | |
| Qwen/Qwen3-8B-Base | — | — | — | |
| Qwen/Qwen3-8B-Lora | — | — | — | |
| Qwen/Qwen3-Coder-30B-A3B-Instruct | — | — | — | |
| Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 | — | — | — | |
| Qwen/Qwen3-Coder-Next-FP8 | — | — | — | |
| Qwen/Qwen3-Next-80B-A3B-Instruct | — | — | — | |
| Qwen/Qwen3-Next-80B-A3B-Instruct-FP8 | — | — | — | |
| Qwen/Qwen3-Next-80B-A3B-Thinking | — | — | — | |
| Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 | — | — | — | |
| Qwen/Qwen3-VL-32B-Instruct | — | — | — | |
| Qwen/Qwen3-VL-8B-Instruct | — | — | — | |
| Qwen/Qwen3.5-122B-A10B-FP8 | — | — | — | |
| Qwen/Qwen3.5-35B-A3B | — | — | — | |
| Qwen/Qwen3.5-397B-A17B | — | — | — | |
| Qwen/Qwen3.5-9B | — | — | — | |
| Qwen/Qwen3.5-9B-FP8 | — | — | — | |
| Qwen/Qwen3.6-35B-A3B-FP8 | — | — | — | |
| Qwen/Qwen3.6-Plus | — | — | — | |
| Qwen/Qwen3.7-Max | — | — | — | |
| RunDiffusion/Juggernaut-pro-flux | — | — | — | |
| Rundiffusion/Juggernaut-Lightning-Flux | — | — | — | |
| Salesforce/Llama-Rank-V1 | — | — | — | |
| Wan-AI/Wan2.2-I2V-A14B | — | — | — | |
| Wan-AI/Wan2.2-T2V-A14B | — | — | — | |
| Wan-AI/Wan2.6-image | — | — | — | |
| Wan-AI/wan2.7-i2v | — | — | — | |
| Wan-AI/wan2.7-r2v | — | — | — | |
| Wan-AI/wan2.7-t2v | — | — | — | |
| agentica-org/DeepCoder-14B-Preview | — | — | — | |
| alibaba/happyhorse-1.0-i2v | — | — | — | |
| alibaba/happyhorse-1.0-r2v | — | — | — | |
| alibaba/happyhorse-1.0-t2v | — | — | — | |
| allenai/Molmo-7B-D-0924 | — | — | — | |
| arcee-ai/trinity-mini | — | — | — | |
| arize-ai/qwen-2-1.5b-instruct | — | — | — | |
| black-forest-labs/FLUX.1-kontext-max | — | — | — | |
| black-forest-labs/FLUX.1-kontext-pro | — | — | — | |
| black-forest-labs/FLUX.1-schnell | — | — | — | |
| black-forest-labs/FLUX.1.1-pro | — | — | — | |
| black-forest-labs/FLUX.2-dev | — | — | — | |
| black-forest-labs/FLUX.2-flex | — | — | — | |
| black-forest-labs/FLUX.2-max | — | — | — | |
| black-forest-labs/FLUX.2-pro | — | — | — | |
| canopylabs/orpheus-3b-0.1-ft | — | — | — | |
| cartesia/sonic | — | — | — | |
| cartesia/sonic-2 | — | — | — | |
| cartesia/sonic-3 | — | — | — | |
| deepcogito/cogito-v1-preview-llama-70B | — | — | — | |
| deepcogito/cogito-v1-preview-llama-70B-Turbo | — | — | — | |
| deepcogito/cogito-v1-preview-llama-8B | — | — | — | |
| deepcogito/cogito-v1-preview-qwen-14B | — | — | — | |
| deepcogito/cogito-v1-preview-qwen-32B | — | — | — | |
| deepcogito/cogito-v2-1-671b | — | — | — | |
| deepgram/aura-2 | — | — | — | |
| deepgram/flux | — | — | — | |
| deepgram/nova-3-en | — | — | — | |
| deepgram/nova-3-multi | — | — | — | |
| deepseek-ai/DeepSeek-OCR-2 | — | — | — | |
| deepseek-ai/DeepSeek-R1-0528 | — | — | — | |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | — | — | — | |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | — | — | — | |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | — | — | — | |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | — | — | — | |
| deepseek-ai/DeepSeek-V3.1 | — | — | — | |
| deepseek-ai/DeepSeek-V4-Pro | — | — | — | |
| deepseek-ai/deepseek-coder-33b-instruct | — | — | — | |
| essentialai/rnj-1-instruct | — | — | — | |
| facebook/cwm | — | — | — | |
| google/flash-image-2.5 | — | — | — | |
| google/flash-image-3.1 | — | — | — | |
| google/gemini-3-pro-image | — | — | — | |
| google/gemma-2-27b-it | — | — | — | |
| google/gemma-2-9b-it | — | — | — | |
| google/gemma-2b-it | — | — | — | |
| google/gemma-3-1b-it | — | — | — | |
| google/gemma-3-1b-pt | — | — | — | |
| google/gemma-3-270m-it | — | — | — | |
| google/gemma-3-270m-it-lora | — | — | — | |
| google/gemma-3-27b-it | — | — | — | |
| google/gemma-3-27b-it-lora | — | — | — | |
| google/gemma-3-27b-pt | — | — | — | |
| google/gemma-3-4b-it | — | — | — | |
| google/gemma-3n-E4B-it | — | — | — | |
| google/gemma-4-26B-A4B-it | — | — | — | |
| google/gemma-4-31B-it | — | — | — | |
| google/gemma-4-31B-it-lora | — | — | — | |
| google/gemma-4-E2B-it | — | — | — | |
| google/gemma-4-E4B-it | — | — | — | |
| google/imagen-4.0-fast | — | — | — | |
| google/imagen-4.0-preview | — | — | — | |
| google/imagen-4.0-ultra | — | — | — | |
| google/medgemma-27b-text-it | — | — | — | |
| google/veo-2.0 | — | — | — | |
| google/veo-3.0 | — | — | — | |
| google/veo-3.0-audio | — | — | — | |
| google/veo-3.0-fast | — | — | — | |
| google/veo-3.0-fast-audio | — | — | — | |
| google/veo-3.1 | — | — | — | |
| google/veo-3.1-lite | — | — | — | |
| google/veo-3.1-test-debug | — | — | — | |
| hexgrad/Kokoro-82M | — | — | — | |
| ideogram/ideogram-3.0 | — | — | — | |
| ideogram/ideogram-4.0 | — | — | — | |
| intfloat/multilingual-e5-large-instruct | — | — | — | |
| kwaivgI/kling-1.6-pro | — | — | — | |
| kwaivgI/kling-1.6-standard | — | — | — | |
| kwaivgI/kling-2.0-master | — | — | — | |
| kwaivgI/kling-2.1-master | — | — | — | |
| kwaivgI/kling-2.1-pro | — | — | — | |
| kwaivgI/kling-2.1-standard | — | — | — | |
| meta-llama/Llama-2-7b-chat-hf | — | — | — | |
| meta-llama/Llama-3-8b-chat-hf | — | — | — | |
| meta-llama/Llama-3.1-405B | — | — | — | |
| meta-llama/Llama-3.1-405B-Instruct | — | — | — | |
| meta-llama/Llama-3.2-1B | — | — | — | |
| meta-llama/Llama-3.2-1B-Instruct | — | — | — | |
| meta-llama/Llama-3.2-3B | — | — | — | |
| meta-llama/Llama-3.2-3B-Instruct | — | — | — | |
| meta-llama/Llama-3.3-70B-Instruct | — | — | — | |
| meta-llama/Llama-3.3-70B-Instruct-FP8-Lora | — | — | — | |
| meta-llama/Llama-3.3-70B-Instruct-Turbo | — | — | — | |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP4 | — | — | — | |
| meta-llama/Llama-4-Scout-17B-16E | — | — | — | |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | — | — | — | |
| meta-llama/Llama-4-Scout-17B-16E-Instruct-FP8-Lora | — | — | — | |
| meta-llama/Llama-Guard-4-12B | — | — | — | |
| meta-llama/Meta-Llama-3-70B-Instruct-Turbo | — | — | — | |
| meta-llama/Meta-Llama-3-8B-Instruct | — | — | — | |
| meta-llama/Meta-Llama-3-8B-Instruct-Lite | — | — | — | |
| meta-llama/Meta-Llama-3.1-70B | — | — | — | |
| meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | — | — | — | |
| meta-llama/Meta-Llama-3.1-8B | — | — | — | |
| meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | — | — | — | |
| minimax/hailuo-02 | — | — | — | |
| minimax/speech-2.6-turbo | — | — | — | |
| minimax/speech-2.8-turbo | — | — | — | |
| minimax/video-01-director | — | — | — | |
| mistralai/Devstral-Small-2505 | — | — | — | |
| mistralai/Magistral-Small-2506 | — | — | — | |
| mistralai/Ministral-3-14B-Instruct-2512 | — | — | — | |
| mistralai/Mistral-7B-Instruct-v0.1 | — | — | — | |
| mistralai/Mistral-7B-Instruct-v0.3 | — | — | — | |
| mistralai/Mistral-7B-v0.1 | — | — | — | |
| mistralai/Mistral-Small-24B-Instruct-2501 | — | — | — | |
| mistralai/Mixtral-8x22B-Instruct-v0.1 | — | — | — | |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | — | — | — | |
| mistralai/Mixtral-8x7B-Instruct-v0.1-FP8-Lora | — | — | — | |
| mistralai/Mixtral-8x7B-v0.1 | — | — | — | |
| mixedbread-ai/mxbai-rerank-large-v2 | — | — | — | |
| moonshotai/Kimi-K2.5-fp4 | — | — | — | |
| moonshotai/Kimi-K2.6 | — | — | — | |
| nim/meta/llama-3.1-70b-instruct | — | — | — | |
| nim/meta/llama-3.1-8b-instruct | — | — | — | |
| nim/meta/llama-3.2-11b-vision-instruct | — | — | — | |
| nim/meta/llama-3.2-90b-vision-instruct | — | — | — | |
| nim/meta/llama-3.3-70b-instruct | — | — | — | |
| nim/mistralai/mixtral-8x22b-instruct-v01 | — | — | — | |
| nim/mistralai/mixtral-8x7b-instruct-v01 | — | — | — | |
| nim/nv-mistralai/mistral-nemo-12b-instruct | — | — | — | |
| nim/nvidia/llama-3.1-nemotron-70b-instruct | — | — | — | |
| nim/nvidia/llama-3.3-nemotron-super-49b-v1 | — | — | — | |
| nvidia/Llama-3.1-Nemotron-70B-Instruct-HF | — | — | — | |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 | — | — | — | |
| nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 | — | — | — | |
| nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8 | — | — | — | |
| nvidia/NVIDIA-Nemotron-Nano-9B-v2 | — | — | — | |
| nvidia/nemotron-3-asr-streaming-0.6b | — | — | — | |
| nvidia/nemotron-3-nano-omni-30b-a3b-reasoning-fp8 | — | — | — | |
| nvidia/nemotron-3-ultra-550b-a55b | — | — | — | |
| nvidia/nemotron-3.5-asr-streaming-0.6b | — | — | — | |
| nvidia/parakeet-tdt-0.6b-v3 | — | — | — | |
| openai/gpt-image-1.5 | — | — | — | |
| openai/gpt-oss-120b | — | — | — | |
| openai/gpt-oss-20b | — | — | — | |
| openai/sora-2 | — | — | — | |
| openai/sora-2-pro | — | — | — | |
| openai/whisper-large-v3 | — | — | — | |
| pearl-ai/gemma-4-31b-it | — | — | — | |
| pixverse/pixverse-v5 | — | — | — | |
| pixverse/pixverse-v5.6 | — | — | — | |
| pixverse/pixverse-v6 | — | — | — | |
| rime-labs/rime-arcana-v2 | — | — | — | |
| rime-labs/rime-arcana-v3 | — | — | — | |
| rime-labs/rime-arcana-v3-turbo | — | — | — | |
| rime-labs/rime-mist-v2 | — | — | — | |
| rime-labs/rime-mist-v3 | — | — | — | |
| rime-labs/rime-mist-v3-omni | — | — | — | |
| sarvamai/sarvam-m | — | — | — | |
| stabilityai/stable-diffusion-3-medium | — | — | — | |
| stabilityai/stable-diffusion-xl-base-1.0 | — | — | — | |
| togethercomputer/EssentialAI-RNJ-1-Instruct | — | — | — | |
| togethercomputer/meta-llama-3.1-8B-Instruct-AWQ-INT4 | — | — | — | |
| vidu/vidu-2.0 | — | — | — | |
| vidu/vidu-q1 | — | — | — | |
| vidu/vidu-q3 | — | — | — | |
| vidu/vidu-q3-turbo | — | — | — | |
| zai-org/GLM-4.5-Air-FP8 | — | — | — | |
| zai-org/GLM-4.5V | — | — | — | |
| zai-org/GLM-4.6 | — | — | — | |
| zai-org/GLM-4.7 | — | — | — | |
| zai-org/GLM-4.7-FP8 | — | — | — | |
| zai-org/GLM-4.7-fp4 | — | — | — | |
| zai-org/GLM-5 | — | — | — | |
| zai-org/GLM-5-FP4 | — | — | — | |
| zai-org/GLM-5.1 | — | — | — | |
| zai-org/GLM-OCR | — | — | — |
FAQ
Yes — fine-tuning via LoRA, QLoRA, and full fine-tuning is available for many open models through a managed pipeline. Training jobs are billed per GPU-hour.