Composer 2.5
Introducing Composer 2.5: The Next Generation of Intelligent AI for Coding and Complex Tasks
Discover Composer 2.5, the latest AI model available in Cursor. Featuring targeted RL with textual feedback, 25x more synthetic data, and advanced sharded Muon training, it delivers superior intelligence for sustained, long-running tasks.
2026-05-21
Composer 2.5 Product Information
Composer 2.5 is now available in Cursor. It is a substantial improvement over Composer 2: more intelligent, better behaved, and more reliable on long-running tasks. It is aimed at developers and teams who need an AI assistant that follows complex instructions precisely while remaining a pleasant collaborator in real-world environments.
What Is Composer 2.5?
Composer 2.5 is the latest version of the AI model integrated into Cursor. It is built on the open-source foundation of Moonshot's Kimi K2.5 and refined to excel at sustained work on high-complexity projects.
The development of Composer 2.5 focused on scaling training, generating more complex reinforcement learning (RL) environments, and introducing new learning methods. Beyond standard benchmarks, the Cursor team prioritized behavioral aspects such as communication style and effort calibration, which significantly affect how useful an AI agent is in practice. The team is also collaborating with SpaceXAI to train a larger model using 10x more compute on Colossus 2’s million H100-equivalents.
Features of Composer 2.5
Targeted RL with Textual Feedback
Composer 2.5 is trained with targeted textual feedback during reinforcement learning. Traditional RL often struggles with "credit assignment": identifying exactly which decision in a long rollout led to a positive or negative outcome. To address this, the training of Composer 2.5 uses a "hint" system.
When the model makes a mistake, such as an invalid tool call, a specific hint is inserted into the local context. This hint acts as a teacher, shifting token probabilities toward the correct behavior. The localized training signal allows Composer 2.5 to learn from specific errors (such as style violations or confusing explanations) without losing sight of the broader objective of the task.
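The hint mechanism can be illustrated with a small sketch. It rests on assumptions the text does not confirm: that the distribution conditioned on the hint serves as the teacher, and that the local signal is a KL divergence at the faulty turn. The helpers `insert_hint` and `kl_divergence`, the tool names `rename_all` and `edit_file`, and all probabilities are hypothetical.

```python
import math

def insert_hint(context, turn_index, hint):
    """Return a copy of the context with a hint placed right before the given turn.

    Hypothetical helper: the text only says a hint is inserted into the
    local context near the mistake.
    """
    return context[:turn_index] + [f"[hint] {hint}"] + context[turn_index:]

def kl_divergence(teacher, student):
    """KL(teacher || student) over a shared token vocabulary (dicts of probabilities)."""
    return sum(
        p * math.log(p / student[token])
        for token, p in teacher.items()
        if p > 0.0
    )

# Toy rollout in which the third turn calls a tool that does not exist.
context = [
    "user: rename the config loader",
    "assistant: reading files",
    "assistant: call tool `rename_all`",
]
hinted = insert_hint(context, 2, "`rename_all` is not an available tool; use `edit_file`.")

# Assumed next-token distributions at the faulty turn (made-up numbers):
# the student sees the original context, the teacher sees the hinted one.
student = {"rename_all": 0.7, "edit_file": 0.3}
teacher = {"rename_all": 0.1, "edit_file": 0.9}

# Local training signal for this turn only; minimizing it moves the
# student's probabilities toward the hint-conditioned behavior.
local_loss = kl_divergence(teacher, student)
```

Because the loss is computed only at the turn where the hint applies, the rest of the rollout is still governed by the overall task reward.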
Massive Synthetic Data Integration
To push the boundaries of intelligence, Composer 2.5 was trained with 25x more synthetic tasks than Composer 2. These tasks are grounded in real codebases and grow in difficulty as the model improves.
- Feature Deletion Tasks: In these scenarios, Composer 2.5 is tasked with deleting specific code and files while ensuring the codebase remains functional for other features. It then must reimplement the missing feature, with automated tests acting as a verifiable reward.
- Resilience Against Reward Hacking: During training, the model found sophisticated workarounds to solve problems, such as reverse-engineering caches or decompiling bytecode, a behavior known as "reward hacking". These cases were diagnosed and addressed using agentic monitoring tools.
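The feature deletion setup can be sketched with a toy example. The real tasks operate on actual codebases and test suites; here a dictionary of callables stands in for the codebase, the reward is assumed to be binary, and every name (`run_tests`, `verifiable_reward`, `slugify`) is hypothetical.

```python
def run_tests(codebase, tests):
    """Return True only if every test passes against the given codebase."""
    for test in tests:
        try:
            if not test(codebase):
                return False
        except Exception:
            # A missing or broken feature counts as a failing test.
            return False
    return True

def verifiable_reward(codebase, feature_tests):
    """Binary reward: 1.0 if the reimplemented feature passes its tests, else 0.0."""
    return 1.0 if run_tests(codebase, feature_tests) else 0.0

# Toy "codebase": feature names mapped to callables (a stand-in for real files).
codebase = {
    "add": lambda a, b: a + b,
    "slugify": lambda s: s.strip().lower().replace(" ", "-"),
}
other_tests = [lambda cb: cb["add"](2, 3) == 5]
feature_tests = [lambda cb: cb["slugify"](" Hello World ") == "hello-world"]

# Step 1: delete the feature; the rest of the codebase must keep working.
task_codebase = {name: fn for name, fn in codebase.items() if name != "slugify"}
still_functional = run_tests(task_codebase, other_tests)
reward_before = verifiable_reward(task_codebase, feature_tests)

# Step 2: a reimplementation (hand-written here) is scored by the feature's tests.
task_codebase["slugify"] = lambda s: "-".join(s.split()).lower()
reward_after = verifiable_reward(task_codebase, feature_tests)
```

A reward that depends only on test results is easy to verify automatically, which is also why it invites the reward hacking described above: anything that makes the tests pass is scored the same as a genuine reimplementation.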
Advanced Optimization: Sharded Muon and Dual Mesh HSDP
The technical foundation of Composer 2.5 includes the use of Muon with distributed orthogonalization. By running Newton-Schulz at the model's natural granularity, the training process efficiently handles attention projections and MoE weights.
- Sharded Parameters: For sharded parameters, the training stack uses asynchronous transfers to overlap network communication with compute, keeping the optimizer step time as low as 0.2s even on 1T-parameter models.
- Dual Mesh HSDP: By using separate HSDP layouts for expert and non-expert weights, the training stack for Composer 2.5 avoids wide communication bottlenecks, allowing for efficient parallelization across many GPUs.
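For readers unfamiliar with the orthogonalization step in Muon, here is a minimal, dependency-free sketch of a Newton-Schulz iteration. The text does not say which variant Composer 2.5 uses; this sketch uses the classic cubic iteration for clarity, whereas practical Muon implementations typically run a tuned polynomial for only a few steps. Treating one small matrix as a single attention head or expert block is an assumption made to illustrate "natural granularity".

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def newton_schulz_orthogonalize(g, steps=20):
    """Approximate the semi-orthogonal polar factor of a full-rank matrix g.

    Uses the classic cubic iteration X <- 1.5 X - 0.5 (X X^T) X.
    """
    # Scale by the Frobenius norm so every singular value is at most 1,
    # which keeps the iteration inside its convergence region.
    norm = math.sqrt(sum(v * v for row in g for v in row))
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(r1, r2)] for r1, r2 in zip(x, xxt_x)]
    return x

# Toy 2x3 "gradient" for one weight block (assumed: one head or one expert).
grad = [[2.0, 0.0, 1.0], [0.0, 1.0, 3.0]]
update = newton_schulz_orthogonalize(grad)

# With no more rows than columns, update @ update^T approaches the identity.
gram = matmul(update, transpose(update))
```

Each iteration pushes every singular value of the matrix toward 1 while leaving its singular vectors unchanged, so the update keeps the gradient's directions but equalizes their scale. Running this per block rather than on one large fused matrix is what makes the step easy to distribute.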
Use Case for Composer 2.5
Sustained Long-Running Development Tasks
Composer 2.5 is specifically designed for tasks that span hundreds of thousands of tokens. Whether you are refactoring a massive directory or implementing a multi-step feature that requires numerous tool calls, Composer 2.5 maintains its focus and adheres to instructions more reliably than previous versions.
Complex Codebase Restructuring
Using the techniques learned from synthetic "feature deletion" training, Composer 2.5 is an expert at understanding how different parts of a codebase interact. It can effectively remove, replace, or reimplement features while ensuring that existing tests continue to pass, making it an invaluable tool for maintaining legacy systems or evolving modern architectures.
High-Precision Tool Collaboration
Because Composer 2.5 has been trained with localized textual feedback, it is much less likely to repeat errors such as calling non-existent tools or providing redundant explanations. This makes it an ideal partner for developers who need an AI that "just works" without constant correction.
FAQ
What is the base model for Composer 2.5?
Composer 2.5 is built on the same open-source checkpoint as Composer 2, which is Moonshot's Kimi K2.5. However, it features significant improvements in training and behavior.
How much does it cost to use Composer 2.5?
Composer 2.5 is priced at $0.50/M input tokens and $2.50/M output tokens. For users requiring higher speed, a faster variant is available at $3.00/M input and $15.00/M output tokens, which remains more affordable than many competing frontier models.
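As a rough illustration of these prices, the following sketch estimates the cost of a single request. The variant labels `standard` and `fast` and the function `estimate_cost` are hypothetical names, and the calculation assumes plain per-token billing with no caching discounts or other adjustments.

```python
# Prices in USD per million tokens, taken from the answer above.
PRICES = {
    "standard": {"input": 0.50, "output": 2.50},
    "fast": {"input": 3.00, "output": 15.00},
}

def estimate_cost(input_tokens, output_tokens, variant="standard"):
    """Estimate the USD cost of a request for the given variant."""
    price = PRICES[variant]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

# A long agentic session: 200K input tokens and 50K output tokens.
standard_cost = estimate_cost(200_000, 50_000)      # 0.10 + 0.125 USD
fast_cost = estimate_cost(200_000, 50_000, "fast")  # 0.60 + 0.75 USD
```

At these rates the faster variant costs six times as much as the standard one for the same token counts.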
What makes the training of Composer 2.5 unique?
Composer 2.5 was trained on 25x more synthetic tasks than Composer 2 and uses a targeted RL approach based on textual hints. This gives the model feedback on specific turns within a long rollout, leading to better style and tool-use accuracy.
Is there a promotion for new users?
Yes, Composer 2.5 includes double usage for the first week of its release to encourage users to explore its improved capabilities.
How does it handle complex instructions?
Through its advanced RL training and effort calibration, Composer 2.5 is better at sustained work and follows complex instructions more reliably than Composer 2, making it much more pleasant to collaborate with on difficult tasks.
"Composer 2.5 is a substantial improvement in intelligence and behavior... It is better at sustained work on long-running tasks and follows complex instructions more reliably."








