Kimi K2 Thinking Released
Narrative
First open-weights model to beat GPT-5 and Claude Sonnet 4.5 on key benchmarks. Interleaves reasoning with tool use natively and sustains 200-300 sequential tool calls. Runs in INT4 via quantization-aware training (QAT). Reported training cost of ~$4.6M.
Reality
HLE (with tools) 44.9% and BrowseComp 60.2% both exceeded GPT-5 and Claude Sonnet 4.5. SWE-Bench Verified 71.3% was competitive but trailed both. Artificial Analysis ranked it #2 overall (composite 67), behind only GPT-5 (68). Verified independently.
Implication
Historic moment for open-weights AI: the first open model genuinely competitive with top proprietary systems across reasoning and agentic tasks. The reported ~$4.6M training cost challenged the assumption that frontier models require billions in compute.