Technology · AI · Chips · Innovation

OpenAI Partners with Cerebras for 'Near-Instant' Code Generation with GPT-5.3-Codex-Spark, Diversifying Beyond Nvidia

OpenAI has launched GPT-5.3-Codex-Spark, a new coding model designed for near-instantaneous response times, marking its first major inference partnership outside of its traditional Nvidia-dominated infrastructure. This model runs on hardware from Cerebras Systems, a chipmaker specializing in low-latency AI workloads. The move comes as OpenAI navigates a complex relationship with Nvidia, faces criticism over ChatGPT ads, secures a Pentagon contract, and experiences internal organizational changes. While an OpenAI spokesperson stated that GPUs remain foundational, Cerebras complements these by excelling in workflows requiring extremely low latency, enhancing real-time coding experiences. Codex-Spark is OpenAI's first model built for real-time coding collaboration, claiming over 1000 tokens per second on ultra-low latency hardware, though specific latency metrics were not provided.

VentureBeat

OpenAI on Thursday launched GPT-5.3-Codex-Spark, a stripped-down coding model engineered for near-instantaneous response times. This deployment signifies the company's first significant inference partnership outside its traditional Nvidia-dominated infrastructure. The model operates on hardware provided by Cerebras Systems, a Sunnyvale-based chipmaker renowned for its wafer-scale processors that specialize in low-latency AI workloads.

This partnership emerges at a critical juncture for OpenAI. The company is currently navigating a strained relationship with its long-standing chip supplier, Nvidia. Concurrently, it faces increasing criticism regarding its decision to introduce advertisements into ChatGPT, has recently announced a Pentagon contract, and is experiencing internal organizational upheaval, including the disbandment of a safety-focused team and the resignation of at least one researcher in protest.

An OpenAI spokesperson clarified the strategic importance of this new collaboration to VentureBeat, stating, "GPUs remain foundational across our training and inference pipelines and deliver the most cost effective tokens for broad usage." The spokesperson added, "Cerebras complements that foundation by excelling at workflows that demand extremely low latency, tightening the end-to-end loop so use cases such as real-time coding in Codex feel more responsive as you iterate." This careful articulation, emphasizing the foundational role of GPUs while positioning Cerebras as a complement, highlights OpenAI's delicate balancing act as it diversifies its chip suppliers without alienating Nvidia, which remains the dominant force in AI accelerators.

OpenAI acknowledges that these speed gains come with certain capability tradeoffs, which the company believes developers will accept. Codex-Spark is presented as OpenAI's inaugural model specifically designed for real-time coding collaboration. The company asserts that the model can deliver more than 1000 tokens per second when served on ultra-low latency hardware. However, OpenAI declined to provide specific latency metrics, such as time-to-first-token figures, only stating that "Codex-Spark is optimized to feel near-instant."
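Some rough arithmetic puts the throughput claim in perspective. The sketch below is illustrative only: the 1,000 tokens/s figure comes from OpenAI's claim, but the 200-token completion size and the 0.1 s time-to-first-token are assumptions, since OpenAI disclosed no TTFT numbers.

```python
def completion_time_s(tokens: int, tok_per_s: float, ttft_s: float) -> float:
    """End-to-end wall time for a streamed completion:
    time-to-first-token plus per-token decode time."""
    return ttft_s + tokens / tok_per_s

# At the claimed 1,000+ tokens/s, a 200-token code suggestion needs only
# ~0.2 s of decode time, so perceived latency would be dominated by TTFT
# (the 0.1 s used here is a hypothetical value, not a published figure).
print(f"{completion_time_s(200, 1000, 0.1):.2f} s")  # → 0.30 s
```

Under these assumptions, decode throughput this high shifts the responsiveness bottleneck to time-to-first-token, which may explain why OpenAI describes the model as "optimized to feel near-instant" rather than quoting a single speed number.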

Related News

Project N.O.M.A.D: A Self-Sufficient Offline Survival Computer with AI and Essential Tools for Anytime, Anywhere Access
Technology

Project N.O.M.A.D is introduced as a self-sufficient, offline survival computer designed to provide users with critical tools, knowledge, and AI capabilities. The system aims to ensure access to information regardless of location or connectivity, emphasizing self-reliance and preparedness through its integrated features.

MiroFish: A Concise and Universal Swarm Intelligence Engine for Predicting Everything
Technology

MiroFish, an innovative project by 666ghj, has emerged as a trending repository on GitHub. Described as a concise and universal swarm intelligence engine, MiroFish aims to predict a wide array of phenomena. The project's core concept revolves around leveraging collective intelligence to offer predictive capabilities across various domains. Further details regarding its specific applications or underlying technology are not provided in the initial description.

GitNexus: Zero-Server Code Smart Engine Transforms GitHub Repos and ZIP Files into Interactive Knowledge Graphs with Built-in Graph RAG Agent for Enhanced Code Exploration
Technology

GitNexus is a client-side knowledge graph creator that operates entirely within the browser, requiring no server-side code. Users can input GitHub repositories or ZIP files to generate an interactive knowledge graph, which includes a built-in Graph RAG agent. This tool is designed to significantly enhance code exploration by providing a visual and interactive way to understand codebases.