OpenAI Releases GPT-5.3-Codex-Spark with Cerebras

Date: February 12, 2026

Company: OpenAI

Category: Models & Research

Narrative

Research preview of GPT-5.3-Codex-Spark, a smaller real-time coding model optimized for ultra-fast inference. First OpenAI model to run on Cerebras Wafer Scale Engine 3 hardware. Delivers >1000 tokens/second for near-instant feedback while maintaining strong coding capability. Includes an 80% reduction in roundtrip overhead, a 30% reduction in per-token overhead, and 50% faster time-to-first-token through infrastructure improvements.

OpenAI Blog

Reality

Announced February 12, 2026. Available as a research preview to ChatGPT Pro users via the Codex app, CLI, and IDE extensions. First milestone in the $10B+ multi-year Cerebras partnership announced in January 2026. Text-only, with a 128k context window. Completes tasks in a fraction of the time taken by the full GPT-5.3-Codex while maintaining strong SWE-Bench Pro and Terminal-Bench performance.

Implication

Marks OpenAI's first major inference partnership beyond Nvidia, diversifying its hardware strategy. Enables two complementary Codex modes: real-time collaboration (Spark) and long-running tasks (full model). Demonstrates an industry shift toward latency-optimized models for interactive workflows. Sets an expectation of >1000 tokens/sec as the new baseline for real-time AI coding.
