Hugging Face

North America | Est. 2016 | Pricing: free tier for public models; pay-per-compute for Inference Endpoints; enterprise contracts.

model-hub, open-source-platform, inference-provider, community

The 'GitHub of AI' — an open platform hosting 500,000+ models, 100,000+ datasets, and Spaces for demos.

What They Do

Hugging Face hosts the world's largest public repository of machine-learning models and datasets, anchored by the Transformers library used by millions of researchers. Its Inference API lets developers call models over HTTPS without managing GPU infrastructure.
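As a rough illustration of calling a hosted model over HTTPS, the sketch below builds and sends a JSON POST request using only the Python standard library. The base URL, the `{"inputs": ...}` payload shape, and the helper names `build_request` and `query` are assumptions made for this example (the URL follows the historical serverless Inference API pattern), so check the current Hugging Face documentation before relying on any of them.

```python
import json
import urllib.request

# Assumption: historical serverless Inference API URL pattern; it may have
# changed, so confirm the endpoint in the current documentation.
API_BASE = "https://api-inference.huggingface.co/models"


def build_request(model_id: str, prompt: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an HTTPS POST request for a hosted model.

    `build_request` is a hypothetical helper for this sketch, not part of
    any Hugging Face library.
    """
    # Assumption: a simple {"inputs": ...} body, as used by text tasks.
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{model_id}",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(model_id: str, prompt: str, token: str, timeout: float = 30.0):
    """Send the request and decode the JSON response body."""
    request = build_request(model_id, prompt, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `query("openai-community/gpt2", "Hello", token)` needs a valid access token and network access, and the shape of the returned JSON depends on the model's task.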

Mission

Democratise good machine learning for researchers and practitioners everywhere.

Available Models

Model	Context	Input (per 1M tokens)	Output (per 1M tokens)
Andycurrent/Gemma-3-1B-it-GLM-4.7-Flash-Heretic-Uncensored-Thinking_GGUF	—	—	—
Bahushruth/Qwen3.6-35B-A3B-abliterated-v4	—	—	—
Doradus-AI/RnJ-1-Instruct-FP8	—	—	—
Dream-org/Dream-v0-Instruct-7B	—	—	—
EleutherAI/gpt-neo-125m	—	—	—
EleutherAI/gpt-neo-2.7B	—	—	—
EleutherAI/gpt-neox-20b	—	—	—
EleutherAI/pythia-160m	—	—	—
EleutherAI/pythia-70m-deduped	—	—	—
GSAI-ML/LLaDA-8B-Instruct	—	—	—
HuggingFaceTB/SmolLM2-135M	—	—	—
HuggingFaceTB/SmolLM2-135M-Instruct	—	—	—
HuggingFaceTB/SmolLM3-3B	—	—	—
IlyaGusev/saiga_llama3_8b	—	—	—
KomeijiForce/bart-large-emojilm	—	—	—
LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct	—	—	—
LilaRest/gemma-4-31B-it-NVFP4-turbo	—	—	—
LiquidAI/LFM2.5-1.2B-Instruct	—	—	—
Maykeye/TinyLLama-v0	—	—	—
MiniMaxAI/MiniMax-M2.5	—	—	—
MiniMaxAI/MiniMax-M2.7	—	—	—
NexVeridian/Qwen3-Coder-Next-8bit	—	—	—
NousResearch/Hermes-3-Llama-3.1-8B	—	—	—
NousResearch/Meta-Llama-3.1-8B-Instruct	—	—	—
OBLITERATUS/gemma-4-E4B-it-OBLITERATED	—	—	—
Orion-zhen/Qwen2.5-Coder-7B-Instruct-AWQ	—	—	—
QuantTrio/DeepSeek-V3.2-AWQ	—	—	—
QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ	—	—	—
QuantTrio/Qwen3-VL-30B-A3B-Instruct-AWQ	—	—	—
Qwen/Qwen-72B	—	—	—
Qwen/Qwen2-0.5B	—	—	—
Qwen/Qwen2-0.5B-Instruct	—	—	—
Qwen/Qwen2-1.5B-Instruct	—	—	—
Qwen/Qwen2-7B-Instruct	—	—	—
Qwen/Qwen2.5-0.5B	—	—	—
Qwen/Qwen2.5-0.5B-Instruct	—	—	—
Qwen/Qwen2.5-1.5B	—	—	—
Qwen/Qwen2.5-1.5B-Instruct	—	—	—
Qwen/Qwen2.5-1.5B-Instruct-AWQ	—	—	—
Qwen/Qwen2.5-14B-Instruct	—	—	—
Qwen/Qwen2.5-14B-Instruct-AWQ	—	—	—
Qwen/Qwen2.5-32B-Instruct	—	—	—
Qwen/Qwen2.5-32B-Instruct-AWQ	—	—	—
Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4	—	—	—
Qwen/Qwen2.5-3B	—	—	—
Qwen/Qwen2.5-3B-Instruct	—	—	—
Qwen/Qwen2.5-72B-Instruct	—	—	—
Qwen/Qwen2.5-72B-Instruct-AWQ	—	—	—
Qwen/Qwen2.5-7B	—	—	—
Qwen/Qwen2.5-7B-Instruct	—	—	—
Qwen/Qwen2.5-7B-Instruct-AWQ	—	—	—
Qwen/Qwen2.5-Coder-1.5B-Instruct	—	—	—
Qwen/Qwen2.5-Coder-14B-Instruct	—	—	—
Qwen/Qwen2.5-Coder-14B-Instruct-AWQ	—	—	—
Qwen/Qwen2.5-Coder-32B-Instruct	—	—	—
Qwen/Qwen2.5-Coder-32B-Instruct-AWQ	—	—	—
Qwen/Qwen2.5-Coder-3B	—	—	—
Qwen/Qwen2.5-Coder-7B	—	—	—
Qwen/Qwen2.5-Coder-7B-Instruct	—	—	—
Qwen/Qwen2.5-Coder-7B-Instruct-AWQ	—	—	—
Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4	—	—	—
Qwen/Qwen2.5-Math-1.5B	—	—	—
Qwen/Qwen2.5-Math-1.5B-Instruct	—	—	—
Qwen/Qwen3-0.6B	—	—	—
Qwen/Qwen3-0.6B-Base	—	—	—
Qwen/Qwen3-0.6B-FP8	—	—	—
Qwen/Qwen3-1.7B	—	—	—
Qwen/Qwen3-1.7B-Base	—	—	—
Qwen/Qwen3-1.7B-GPTQ-Int8	—	—	—
Qwen/Qwen3-14B	—	—	—
Qwen/Qwen3-14B-AWQ	—	—	—
Qwen/Qwen3-14B-FP8	—	—	—
Qwen/Qwen3-235B-A22B	—	—	—
Qwen/Qwen3-30B-A3B	—	—	—
Qwen/Qwen3-30B-A3B-FP8	—	—	—
Qwen/Qwen3-30B-A3B-Instruct-2507	—	—	—
Qwen/Qwen3-30B-A3B-Instruct-2507-FP8	—	—	—
Qwen/Qwen3-32B	—	—	—
Qwen/Qwen3-32B-AWQ	—	—	—
Qwen/Qwen3-32B-FP8	—	—	—
Qwen/Qwen3-4B	—	—	—
Qwen/Qwen3-4B-AWQ	—	—	—
Qwen/Qwen3-4B-Base	—	—	—
Qwen/Qwen3-4B-GGUF	—	—	—
Qwen/Qwen3-4B-Instruct-2507	—	—	—
Qwen/Qwen3-4B-Instruct-2507-FP8	—	—	—
Qwen/Qwen3-4B-Thinking-2507	—	—	—
Qwen/Qwen3-8B	—	—	—
Qwen/Qwen3-8B-AWQ	—	—	—
Qwen/Qwen3-8B-Base	—	—	—
Qwen/Qwen3-8B-FP8	—	—	—
Qwen/Qwen3-Coder-30B-A3B-Instruct	—	—	—
Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8	—	—	—
Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8	—	—	—
Qwen/Qwen3-Coder-Next	—	—	—
Qwen/Qwen3-Coder-Next-FP8	—	—	—
Qwen/Qwen3-Next-80B-A3B-Instruct	—	—	—
Qwen/Qwen3-Next-80B-A3B-Instruct-FP8	—	—	—
Qwen/Qwen3Guard-Gen-0.6B	—	—	—
RedHatAI/Llama-3.2-1B-Instruct-FP8	—	—	—
RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic	—	—	—
RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8	—	—	—
RedHatAI/Qwen2.5-1.5B-quantized.w8a8	—	—	—
RedHatAI/Qwen3-Coder-Next-FP8-dynamic	—	—	—
TIGER-Lab/VLM2Vec-Full	—	—	—
TheBloke/TinyLlama-1.1B-Chat-v0.3-GPTQ	—	—	—
TinyLlama/TinyLlama-1.1B-Chat-v1.0	—	—	—
TitanML/tiny-mixtral	—	—	—
VLTX/VertaLily-1.2-1B-GGUF	—	—	—
Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24	—	—	—
XiaomiMiMo/MiMo-7B-RL	—	—	—
Zyphra/Zamba2-1.2B-instruct	—	—	—
allenai/OLMo-2-0425-1B	—	—	—
ansulev/Darwin-9B-NEG	—	—	—
antirez/deepseek-v4-gguf	—	—	—
apple/OpenELM-1_1B-Instruct	—	—	—
bartowski/Llama-3.2-1B-Instruct-GGUF	—	—	—
bigscience/bloom-560m	—	—	—
bigscience/bloomz-560m	—	—	—
casperhansen/llama-3.3-70b-instruct-awq	—	—	—
casperhansen/mistral-nemo-instruct-2407-awq	—	—	—
cyankiwi/GLM-4.5-Air-AWQ-4bit	—	—	—
cyankiwi/Hermes-4-14B-AWQ-4bit	—	—	—
cyankiwi/Qwen3-30B-A3B-Instruct-2507-AWQ-4bit	—	—	—
cyankiwi/Qwen3-30B-A3B-Thinking-2507-AWQ-4bit	—	—	—
cyankiwi/Qwen3-Coder-30B-A3B-Instruct-AWQ-4bit	—	—	—
cyankiwi/Qwen3-Next-80B-A3B-Instruct-AWQ-4bit	—	—	—
datajuicer/LLaMA-1B-dj-refine-150B	—	—	—
datificate/gpt2-small-spanish	—	—	—
decart-ai/Kimi-K2.7-Code-NVFP4	—	—	—
deepreinforce-ai/Ornith-1.0-35B	—	—	—
deepreinforce-ai/Ornith-1.0-35B-FP8	—	—	—
deepreinforce-ai/Ornith-1.0-35B-GGUF	—	—	—
deepreinforce-ai/Ornith-1.0-397B-FP8	—	—	—
deepreinforce-ai/Ornith-1.0-9B	—	—	—
deepreinforce-ai/Ornith-1.0-9B-GGUF	—	—	—
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct	—	—	—
deepseek-ai/DeepSeek-R1	—	—	—
deepseek-ai/DeepSeek-R1-0528	—	—	—
deepseek-ai/DeepSeek-R1-0528-Qwen3-8B	—	—	—
deepseek-ai/DeepSeek-R1-Distill-Llama-70B	—	—	—
deepseek-ai/DeepSeek-R1-Distill-Llama-8B	—	—	—
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B	—	—	—
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B	—	—	—
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B	—	—	—
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B	—	—	—
deepseek-ai/DeepSeek-V2-Lite	—	—	—
deepseek-ai/DeepSeek-V2-Lite-Chat	—	—	—
deepseek-ai/DeepSeek-V3	—	—	—
deepseek-ai/DeepSeek-V3-0324	—	—	—
deepseek-ai/DeepSeek-V3.2	—	—	—
deepseek-ai/DeepSeek-V4-Flash	—	—	—
deepseek-ai/DeepSeek-V4-Pro	—	—	—
deepseek-ai/deepseek-coder-6.7b-instruct	—	—	—
deepseek-ai/deepseek-coder-7b-instruct-v1.5	—	—	—
distilbert/distilgpt2	—	—	—
dphn/dolphin-2.9.1-yi-1.5-34b	—	—	—
empero-ai/Qwythos-9B-Claude-Mythos-5-1M-GGUF	—	—	—
facebook/opt-1.3b	—	—	—
facebook/opt-125m	—	—	—
farbodtavakkoli/OTel-LLM-8B-A1B-IT	—	—	—
farbodtavakkoli/OTel-LLM-E4B-IT	—	—	—
google/gemma-2-2b-it	—	—	—
google/gemma-2-9b-it	—	—	—
google/gemma-3-1b-it	—	—	—
google/gemma-3-270m	—	—	—
google/gemma-3-270m-it	—	—	—
h2oai/h2ovl-mississippi-2b	—	—	—
h2oai/h2ovl-mississippi-800m	—	—	—
hmellor/tiny-random-BambaForCausalLM	—	—	—
hmellor/tiny-random-Gemma2ForCausalLM	—	—	—
hmellor/tiny-random-LlamaForCausalLM	—	—	—
hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF	—	—	—
hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4	—	—	—
ibm-granite/granite-4.0-h-small	—	—	—
ibm-granite/granite-4.1-3b	—	—	—
ibm-granite/granite-4.1-8b	—	—	—
ibm-research/PowerMoE-3b	—	—	—
janhq/Jan-v3.5-4B-gguf	—	—	—
kaitchup/Phi-3-mini-4k-instruct-gptq-4bit	—	—	—
kosbu/Llama-3.3-70B-Instruct-AWQ	—	—	—
llamafactory/tiny-random-Llama-3	—	—	—
lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-4bit	—	—	—
lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-8bit	—	—	—
meta-llama/Llama-2-7b-chat-hf	—	—	—
meta-llama/Llama-2-7b-hf	—	—	—
meta-llama/Llama-3.1-70B-Instruct	—	—	—
meta-llama/Llama-3.1-8B	—	—	—
meta-llama/Llama-3.1-8B-Instruct	—	—	—
meta-llama/Llama-3.2-1B	—	—	—
meta-llama/Llama-3.2-1B-Instruct	—	—	—
meta-llama/Llama-3.2-3B	—	—	—
meta-llama/Llama-3.2-3B-Instruct	—	—	—
meta-llama/Llama-3.3-70B-Instruct	—	—	—
meta-llama/Meta-Llama-3-8B	—	—	—
meta-llama/Meta-Llama-3-8B-Instruct	—	—	—
microsoft/Phi-3-mini-4k-instruct	—	—	—
microsoft/Phi-3.5-mini-instruct	—	—	—
microsoft/Phi-4-mini-instruct	—	—	—
microsoft/Phi-tiny-MoE-instruct	—	—	—
microsoft/phi-2	—	—	—
microsoft/phi-4	—	—	—
mistralai/Mistral-7B-Instruct-v0.2	—	—	—
mistralai/Mistral-7B-v0.1	—	—	—
mlabonne/Qwen3-30B-A3B-abliterated	—	—	—
mlx-community/gpt-oss-20b-MXFP4-Q8	—	—	—
moonshotai/Kimi-K2-Instruct	—	—	—
moonshotai/Kimi-K2-Instruct-0905	—	—	—
mtgv/MobileLLaMA-1.4B-Chat	—	—	—
nm-testing/SmolLM-1.7B-Instruct-quantized.w4a16	—	—	—
nvidia/DeepSeek-R1-0528-NVFP4-v2	—	—	—
nvidia/DeepSeek-V4-Flash-NVFP4	—	—	—
nvidia/GLM-5.2-NVFP4	—	—	—
nvidia/Gemma-4-26B-A4B-NVFP4	—	—	—
nvidia/Gemma-4-31B-IT-NVFP4	—	—	—
nvidia/Kimi-K2.5-NVFP4	—	—	—
nvidia/Kimi-K2.6-NVFP4	—	—	—
nvidia/Llama-3.1-8B-Instruct-FP8	—	—	—
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5	—	—	—
nvidia/MiniMax-M2.7-NVFP4	—	—	—
nvidia/MiniMax-M3-NVFP4	—	—	—
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16	—	—	—
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8	—	—	—
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4	—	—	—
nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16	—	—	—
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16	—	—	—
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8	—	—	—
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4	—	—	—
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4	—	—	—
nvidia/NVIDIA-Nemotron-Nano-9B-v2	—	—	—
nvidia/Nemotron-Labs-Diffusion-8B-Base	—	—	—
nvidia/Nemotron-Mini-4B-Instruct	—	—	—
nvidia/Qwen3.5-397B-A17B-NVFP4	—	—	—
nvidia/Qwen3.6-27B-NVFP4	—	—	—
nvidia/Qwen3.6-35B-A3B-NVFP4	—	—	—
nvidia/diffusiongemma-26B-A4B-it-NVFP4	—	—	—
openai-community/gpt2	—	—	—
openai-community/gpt2-large	—	—	—
openai-community/gpt2-medium	—	—	—
openai/gpt-oss-120b	—	—	—
openai/gpt-oss-20b	—	—	—
openbmb/MiniCPM5-1B	—	—	—
peft-internal-testing/tiny-random-OPTForCausalLM	—	—	—
prefeitura-rio/Rio-3.0-Open	—	—	—
prefeitura-rio/Rio-3.0-Open-Mini	—	—	—
prism-ml/Bonsai-27B-gguf	—	—	—
prism-ml/Ternary-Bonsai-27B-gguf	—	—	—
rinna/japanese-gpt-neox-small	—	—	—
sakamakismile/Qwen3.6-27B-Text-NVFP4-MTP	—	—	—
shibing624/macbert4csc-base-chinese	—	—	—
speakleash/Bielik-11B-v3.0-Instruct	—	—	—
speakleash/Bielik-11B-v3.0-Instruct-awq	—	—	—
sshleifer/tiny-gpt2	—	—	—
state-spaces/mamba-130m-hf	—	—	—
stelterlab/Mistral-Small-24B-Instruct-2501-AWQ	—	—	—
stepfun-ai/Step-3.5-Flash	—	—	—
tiiuae/falcon-7b	—	—	—
trl-internal-testing/tiny-Cohere2ForCausalLM	—	—	—
trl-internal-testing/tiny-Glm4MoeForCausalLM	—	—	—
trl-internal-testing/tiny-GptOssForCausalLM	—	—	—
trl-internal-testing/tiny-NemotronHForCausalLM-nano	—	—	—
trl-internal-testing/tiny-Qwen2ForCausalLM-2.5	—	—	—
trl-internal-testing/tiny-Qwen3ForCausalLM	—	—	—
trl-internal-testing/tiny-Qwen3MoeForCausalLM	—	—	—
trl-internal-testing/tiny-random-LlamaForCausalLM	—	—	—
typhoon-ai/typhoon2.5-qwen3-4b	—	—	—
unsloth/GLM-4.7-Flash	—	—	—
unsloth/GLM-5.2-GGUF	—	—	—
unsloth/Llama-3.1-8B-Instruct	—	—	—
unsloth/Llama-3.2-1B-Instruct	—	—	—
unsloth/Meta-Llama-3.1-8B-Instruct	—	—	—
unsloth/Qwen-AgentWorld-35B-A3B-GGUF	—	—	—
unsloth/Qwen2.5-14B-Instruct	—	—	—
unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit	—	—	—
unsloth/Qwen2.5-7B-Instruct-bnb-4bit	—	—	—
unsloth/Qwen3-Coder-Next-GGUF	—	—	—
unsloth/gpt-oss-20b-GGUF	—	—	—
unsloth/mistral-7b-v0.3-bnb-4bit	—	—	—
vcruz305/Hy3-GGUF	—	—	—
xlnet/xlnet-base-cased	—	—	—
yuxinlu1/gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF	—	—	—
yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF	—	—	—
zai-org/GLM-4.5-Air	—	—	—
zai-org/GLM-4.7-Flash	—	—	—
zai-org/GLM-5-FP8	—	—	—
zai-org/GLM-5.1-FP8	—	—	—
zai-org/GLM-5.2	—	—	—
zai-org/GLM-5.2-FP8	—	—	—

FAQ

Hugging Face was founded in 2016 by Clément Delangue, Julien Chaumond, and Thomas Wolf. It began as a chatbot app aimed at teenagers before pivoting to ML tooling.
Transformers is Hugging Face's open-source Python library. It provides a unified API to thousands of pre-trained models for NLP, computer vision, audio, and multimodal tasks.
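To give a sense of what the unified API looks like in practice, here is a minimal sketch using the Transformers `pipeline` helper for text generation. The wrapper function `generate` and its default of 20 new tokens are choices made for this example, and the code assumes the `transformers` package plus a backend such as PyTorch are installed.

```python
def generate(model_id: str, prompt: str, max_new_tokens: int = 20) -> str:
    """Generate a continuation of `prompt` with a model from the Hub.

    `generate` is a hypothetical wrapper for this sketch; the underlying
    call is the `pipeline` factory from the Transformers library.
    """
    # Imported lazily so this module still loads where transformers is absent.
    from transformers import pipeline

    # The same factory call works across model families; only the id changes.
    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # Text-generation pipelines return a list of dicts, one per sequence.
    return outputs[0]["generated_text"]
```

Calling `generate("sshleifer/tiny-gpt2", "Hello")` would download that small test model on first use. Switching to another causal language model from the table above should only require changing the model id, subject to hardware limits and any access restrictions on the repository.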