Technology · AI · Chips · Innovation

OpenAI Partners with Cerebras for 'Near-Instant' Code Generation with GPT-5.3-Codex-Spark, Diversifying Beyond Nvidia

OpenAI has launched GPT-5.3-Codex-Spark, a new coding model designed for near-instantaneous response times; the launch marks the company's first major inference partnership outside its traditionally Nvidia-dominated infrastructure. The model runs on hardware from Cerebras Systems, a chipmaker specializing in low-latency AI workloads. The move comes as OpenAI navigates a complex relationship with Nvidia, faces criticism over ads in ChatGPT, secures a Pentagon contract, and works through internal organizational changes. An OpenAI spokesperson said GPUs remain foundational, with Cerebras complementing them by excelling at workflows that demand extremely low latency, making real-time coding feel more responsive. OpenAI says Codex-Spark, its first model built for real-time coding collaboration, delivers more than 1,000 tokens per second on ultra-low-latency hardware, though it did not share specific latency metrics.

VentureBeat

OpenAI on Thursday launched GPT-5.3-Codex-Spark, a stripped-down coding model engineered for near-instantaneous response times. The deployment marks the company's first significant inference partnership outside its traditionally Nvidia-dominated infrastructure. The model runs on hardware from Cerebras Systems, the Sunnyvale-based chipmaker known for wafer-scale processors aimed at low-latency AI workloads.

This partnership emerges at a critical juncture for OpenAI. The company is currently navigating a strained relationship with its long-standing chip supplier, Nvidia. Concurrently, it faces increasing criticism regarding its decision to introduce advertisements into ChatGPT, has recently announced a Pentagon contract, and is experiencing internal organizational upheaval, including the disbandment of a safety-focused team and the resignation of at least one researcher in protest.

An OpenAI spokesperson clarified the strategic importance of this new collaboration to VentureBeat, stating, "GPUs remain foundational across our training and inference pipelines and deliver the most cost effective tokens for broad usage." The spokesperson added, "Cerebras complements that foundation by excelling at workflows that demand extremely low latency, tightening the end-to-end loop so use cases such as real-time coding in Codex feel more responsive as you iterate." This careful articulation, emphasizing the foundational role of GPUs while positioning Cerebras as a complement, highlights OpenAI's delicate balancing act as it diversifies its chip suppliers without alienating Nvidia, which remains the dominant force in AI accelerators.

OpenAI acknowledges that these speed gains come with capability tradeoffs it believes developers will accept. Codex-Spark is presented as OpenAI's first model specifically designed for real-time coding collaboration, and the company says it can deliver more than 1,000 tokens per second when served on ultra-low-latency hardware. OpenAI declined to provide specific latency metrics, such as time-to-first-token figures, saying only that "Codex-Spark is optimized to feel near-instant."
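OpenAI has not published time-to-first-token numbers, but the quoted throughput alone gives a rough sense of scale. The sketch below is a back-of-envelope estimate only, assuming the article's ~1,000 tokens-per-second figure; the completion lengths and the 150 ms time-to-first-token value are hypothetical placeholders, not figures OpenAI has disclosed.

```python
# Rough estimate of streamed-completion latency from a throughput figure.
# Only the ~1,000 tokens/sec number comes from the article; the TTFT and
# completion lengths below are assumed for illustration.

def completion_time_seconds(num_tokens: int,
                            tokens_per_second: float = 1000.0,
                            time_to_first_token: float = 0.150) -> float:
    """Estimate wall-clock time for a streamed completion:
    total ≈ time-to-first-token + tokens_to_generate / throughput."""
    return time_to_first_token + num_tokens / tokens_per_second

if __name__ == "__main__":
    for tokens in (50, 200, 1000):
        seconds = completion_time_seconds(tokens)
        print(f"{tokens:>5} tokens -> ~{seconds:.2f} s")
```

Under those assumptions, a 200-token suggestion would stream in roughly a third of a second, which is consistent with the "near-instant" framing; the term OpenAI declines to quantify, time-to-first-token, is what would dominate for short completions.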

Related News

Technology

Open-Mercato: AI-Powered CRM/ERP Framework for R&D, Operations, and Growth – Enterprise-Grade, Modular, and Highly Customizable

Open-Mercato is an AI-supported CRM/ERP foundation framework designed to support research and development, new processes, operations, and growth. It offers a modular, scalable architecture aimed at teams that want robust defaults alongside extensive customization options, and it positions itself as an enterprise-grade alternative to solutions such as Django and Retool.

Technology

Heretic: Fully Automated Censorship Removal for Language Models Trending on GitHub

Heretic, a new project by p-e-w, has recently gained traction on GitHub Trending. Published on February 21, 2026, the tool focuses on fully automated removal of censorship from language models, aiming to give users a way to bypass restrictions built into these AI systems.

Technology

Superpowers: A Comprehensive Software Development Workflow and Skill Framework for Coding Agents on GitHub Trending

Superpowers, recently featured on GitHub Trending, introduces an effective agent skill framework and a complete software development methodology. Designed for coding agents, this workflow is built upon a foundation of composable 'skills' and includes an initial set of these skills. It aims to streamline the development process for AI-driven coding agents by providing a structured and modular approach to their capabilities.