
Managing AI Coding with Agent Evaluation: Meituan's Practice in Refactoring 310,000 Lines of Code
Meituan's technical team has described its approach to managing AI-assisted development, drawn from the refactoring of 310,000 lines of code. With AI now generating over 90% of code in certain environments, the primary challenge has shifted from production speed to managing the quality of AI output. The team argues that without unified standards, AI amplifies technical debt and system chaos. To counter this, Meituan adopted an 'Agent evaluation' mindset built on four pillars: technical debt sorting, rule construction, a standardized Refactoring SOP, and a Pre-PR (Pull Request) mechanism. This strategy turns code refactoring from a high-cost, specialized project into a sustainable, daily iterative process, with the aim of keeping the system stable over the long term as AI comes to dominate coding.
Key Takeaways
- Shift in Focus: When AI generates more than 90% of a system's code, the bottleneck is no longer coding speed but the constraints and rules applied to the AI.
- Chaos Amplification: Without standardized management and unified protocols, AI-generated code tends to multiply existing technical debt and organizational chaos.
- The Four Pillars: Meituan's management strategy relies on technical debt sorting, rule construction, a Refactoring SOP, and a Pre-PR mechanism.
- Continuous Refactoring: The goal of this framework is to turn large-scale refactoring into a routine, iterative action rather than a costly, one-off technical project.
In-Depth Analysis
The Paradox of AI Speed and System Chaos
In Meituan's practice, the integration of AI into software engineering has reached a tipping point: more than 90% of code is generated by artificial intelligence. This surge in productivity introduces a paradox. Code is produced faster than ever, but the potential for system-wide disorder grows along with it. Meituan's account stresses that the determining factor for a system's health is no longer the speed of the developer (or the AI), but the robustness of the constraints placed on the AI's capabilities.
Without a unified set of specifications, AI acts as a force multiplier for technical debt. It does not inherently understand the long-term architectural goals of a complex system; instead, it follows patterns that may lead to fragmented and inconsistent codebases. Meituan's experience with 310,000 lines of code demonstrates that the management of AI coding requires a fundamental shift from 'writing' to 'governing.'
The Agent Evaluation Framework: A Strategic Approach to Refactoring
To manage the complexities of AI-generated code at scale, Meituan adopted an 'Agent evaluation' mindset. This approach treats the AI as an active participant in the development lifecycle that must be continuously assessed and guided. The framework is built upon four critical components:
- Technical Debt Sorting: Before refactoring can begin, there must be a clear understanding of existing issues. By systematically identifying technical debt, the team can prioritize which areas of the 310,000-line codebase require the most urgent AI intervention.
- Rule Construction: Rules serve as the guardrails for AI. By defining strict coding standards and architectural requirements, the team ensures that the AI-generated output aligns with the organization's technical vision.
- Refactoring SOP (Standard Operating Procedure): Standardizing the refactoring process allows for consistency across different modules and teams. This SOP ensures that the AI follows a predictable path when modifying existing code.
- Pre-PR Mechanism: This acts as a final gatekeeper. Before code is even submitted for a Pull Request, it undergoes a validation phase to ensure it meets the predefined rules and does not introduce new debt. This mechanism is essential for maintaining quality control in a high-velocity environment.
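As a rough illustration of the first pillar, technical debt sorting can be thought of as ranking a catalogue of known issues by some priority score. The sketch below is not Meituan's method. The `DebtItem` fields, the module names, and the scoring heuristic (severity weighted by how often the code changes) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtItem:
    """One catalogued piece of technical debt (all fields are illustrative)."""
    module: str
    severity: int           # 1 (cosmetic) .. 5 (blocks safe changes)
    changes_per_month: int  # how often the module is touched
    lines: int              # size of the affected code


def priority(item: DebtItem) -> int:
    # Hypothetical heuristic: debt in frequently changed code costs more,
    # so weight severity by change frequency.
    return item.severity * (1 + item.changes_per_month)


def sort_debt(items: list[DebtItem]) -> list[DebtItem]:
    # Highest priority first; ties go to the smaller (cheaper) item.
    return sorted(items, key=lambda i: (-priority(i), i.lines))


# Hypothetical backlog entries, invented for this example.
backlog = [
    DebtItem("utils/date_helpers", severity=2, changes_per_month=1, lines=300),
    DebtItem("order/legacy_pricing", severity=4, changes_per_month=12, lines=1800),
    DebtItem("payment/retry_logic", severity=5, changes_per_month=3, lines=900),
]
ranked = sort_debt(backlog)
for item in ranked:
    print(item.module, priority(item))
```

A real scoring function would presumably draw on signals the team actually tracks, such as defect history or review friction. The point is only that a sorted, explicit backlog gives the AI and its reviewers a shared order of work.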
From Specialized Projects to Daily Iterations
One of the most significant outcomes of Meituan's practice is the transformation of the refactoring process itself. Traditionally, refactoring hundreds of thousands of lines of code is viewed as a high-cost, high-risk 'special project' that requires dedicated time and resources. By applying Agent evaluation and automated mechanisms, Meituan has successfully integrated refactoring into the daily development workflow.
This shift means that code quality is maintained continuously as part of regular iterations. Instead of waiting for technical debt to become unmanageable, the system is constantly being refined. This 'continuous refactoring' model is likely the only sustainable way to manage large-scale systems where AI is the primary contributor to the codebase.
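One way to picture refactoring as a daily activity rather than a special project is a small, fixed budget per iteration. The sketch below is an assumption about how such planning could look, not a description of Meituan's tooling. The function name `plan_iteration`, the line budget, and the module names are hypothetical.

```python
def plan_iteration(ranked_debt: list[tuple[str, int]], line_budget: int) -> list[str]:
    """Pick debt items, in priority order, that fit a per-iteration line budget.

    ranked_debt: (module name, affected lines), already sorted by priority.
    """
    selected: list[str] = []
    used = 0
    for name, lines in ranked_debt:
        # Skip items that would exceed the budget; they wait for a later
        # iteration or need to be split into smaller pieces first.
        if used + lines <= line_budget:
            selected.append(name)
            used += lines
    return selected


# Hypothetical ranked backlog, invented for this example.
ranked_debt = [
    ("order/legacy_pricing", 1800),
    ("payment/retry_logic", 900),
    ("utils/date_helpers", 300),
]
todays_work = plan_iteration(ranked_debt, line_budget=1500)
print(todays_work)
```

With a budget of 1,500 lines, the largest item is deferred and the two smaller ones are scheduled, which keeps each iteration small enough to review.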
Industry Impact
Meituan's practice offers a useful precedent for the software industry as it moves toward an AI-native development paradigm. Its significance lies in the realization that AI tools, while powerful, require a new layer of 'meta-management.' As other organizations approach the point where AI generates around 90% of their code, the focus is likely to shift toward building similar evaluation frameworks and Pre-PR mechanisms. This move signals the end of the 'wild west' phase of AI coding and the beginning of a more disciplined, rule-based era of automated software engineering. The transition from 'manual refactoring' to 'AI-managed continuous improvement' will likely become the standard for maintaining enterprise-level software quality.
Frequently Asked Questions
Question: Why does AI-generated code require more constraints than human-written code?
According to the practice shared by Meituan, AI can amplify chaos and technical debt if left unguided. Because AI generates code at a volume and speed that humans cannot easily audit by hand, unified rules and constraints are necessary to keep the output consistent with the system's architecture and standards.
Question: What is the purpose of the Pre-PR mechanism in AI coding?
The Pre-PR mechanism serves as an automated validation layer that checks AI-generated code against established rules and standards before it enters the formal Pull Request stage. This ensures that only high-quality, compliant code is moved forward, reducing the burden on human reviewers and preventing the accumulation of technical debt.
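To make the idea of an automated validation layer concrete, here is a minimal sketch of a Pre-PR check. It uses Python's standard `ast` module to run two example rules over a source file. The rules themselves (a maximum function length and a ban on bare `except:`), the threshold, and the function names are assumptions for illustration. The source does not say which rules or tools Meituan uses.

```python
import ast

MAX_FUNCTION_LINES = 50  # hypothetical threshold, not from the source


def check_function_length(tree: ast.AST) -> list[str]:
    """Flag functions longer than MAX_FUNCTION_LINES."""
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNCTION_LINES:
                findings.append(
                    f"{node.name}: {length} lines exceeds {MAX_FUNCTION_LINES}"
                )
    return findings


def check_bare_except(tree: ast.AST) -> list[str]:
    """Flag 'except:' clauses that catch everything."""
    return [
        f"line {node.lineno}: bare 'except:' hides errors"
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


# The rule set is plain data, so adding a rule means adding a function here.
RULES = [check_function_length, check_bare_except]


def pre_pr_check(source: str) -> list[str]:
    """Return all rule violations; an empty list means the change may go to PR."""
    tree = ast.parse(source)
    findings: list[str] = []
    for rule in RULES:
        findings.extend(rule(tree))
    return findings


sample = "def load():\n    try:\n        return 1\n    except:\n        return None\n"
for finding in pre_pr_check(sample):
    print(finding)
```

In a setup like this, the AI agent's change is submitted for human review only when `pre_pr_check` returns no findings. Otherwise the findings go back to the agent as feedback for another attempt.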
Question: How does this approach change the cost of code refactoring?
By using an Agent evaluation mindset and a standardized SOP, refactoring is transformed from an expensive, specialized 'one-off' task into a routine part of daily development iterations. This lowers the overall cost and risk by addressing code quality issues incrementally rather than allowing them to build up into a massive, high-stakes project.


