StepFun is a Shanghai-based AI startup that builds large-scale multimodal models under the Step brand and aims for top-tier results on reasoning benchmarks.
What They Do
StepFun develops frontier foundation models with a focus on multimodal capabilities (image understanding, long-video analysis) and strong Chinese-language performance. Its Step-2 model achieved competitive results on mathematics and reasoning benchmarks.
Mission
Build frontier multimodal AI that advances human-level understanding and reasoning.
Available Models
| Model | Family | Context | Input (per 1M) | Output (per 1M) |
|---|---|---|---|---|
| step-3.5-flash | — | — | — | — |
| step-3.5-flash-2603 | — | — | — | — |
| step-3.7-flash | — | — | — | — |
| step-image-edit-2 | — | — | — | — |
| step-tts-2 | — | — | — | — |
| stepaudio-2.5-asr | — | — | — | — |
| stepaudio-2.5-tts | — | — | — | — |
FAQ
StepFun was founded in 2023 by Jiang Daxin (CEO), a former Microsoft global vice president.