TII Falcon-H1R 7B Release
Narrative
Compact 7B reasoning model that outperforms 15B models: 88.1% on AIME-24, 68.6% on LiveCodeBench (LCB) v6. Hybrid Transformer-Mamba2 architecture with a 256K context window. Open-source.
Reality
Benchmarks verified. Efficiency gains are real: the 7B model matches 32B-50B-class performance and serves roughly 1,500 tokens/sec/GPU. Open weights under the Falcon LLM license. Validates hybrid architectures.
Implication
Demonstrates that small models with efficient architectures can match much larger ones. The hybrid Transformer-Mamba2 design shows a path beyond pure transformers. Test-time scaling via DeepConf is validated.
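The core idea behind DeepConf-style test-time scaling can be sketched briefly: sample several reasoning traces, score each by the mean log-probability of its tokens, discard the low-confidence traces, and majority-vote among the rest. This is a minimal illustrative sketch, not the actual DeepConf implementation; the function name, `keep_frac` parameter, and trace data below are all hypothetical.

```python
from collections import Counter

def deepconf_vote(traces, keep_frac=0.5):
    """Confidence-filtered majority vote (DeepConf-style sketch).

    traces: list of (answer, mean_token_logprob) pairs, one per
    sampled reasoning trace. Keeps only the top keep_frac most
    confident traces, then majority-votes over their answers.
    """
    ranked = sorted(traces, key=lambda t: t[1], reverse=True)
    kept = ranked[:max(1, int(len(ranked) * keep_frac))]
    votes = Counter(answer for answer, _ in kept)
    return votes.most_common(1)[0][0]

# Illustrative traces: (final answer, mean log-prob of its tokens).
traces = [
    ("42", -0.10),
    ("42", -0.15),
    ("17", -0.90),  # low-confidence outlier, filtered before voting
    ("42", -0.20),
    ("17", -1.20),  # low-confidence outlier, filtered before voting
]
print(deepconf_vote(traces))  # prints "42"
```

Filtering before voting is what separates this from plain self-consistency: low-confidence traces never get a vote, so a few confidently wrong samples cannot outvote the consistent majority.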