
Managing AI Coding with Agent Evaluation Logic: Lessons from a 310,000-Line Code Refactoring Project
Meituan's technical team has described an approach to managing AI-driven development that applies Agent evaluation logic to a 310,000-line code refactoring initiative. With AI now capable of generating over 90% of code, by the team's account, the primary challenge has shifted from production speed to managing system complexity and chaos. Using a structured framework of technical debt sorting, rule construction, a standardized refactoring SOP, and a Pre-PR mechanism, the team turned refactoring from a high-cost, periodic task into a continuous, iterative daily action. The methodology constrains AI's capabilities with unified standards, which keeps technical debt from being amplified and supports long-term system stability in an AI-native development environment.
Key Takeaways
- Shift in Focus: When AI generates over 90% of code, the critical factor is no longer the speed of generation but the ability to constrain and manage AI outputs.
- Agent Evaluation Logic: Applying evaluation frameworks typically used for AI Agents to the coding process helps maintain system order and quality.
- Four-Pillar Framework: Successful large-scale refactoring (310,000 lines) relies on technical debt sorting, rule construction, refactoring SOPs, and a Pre-PR mechanism.
- Continuous Refactoring: Refactoring has evolved from a high-cost, specialized project into a sustainable, daily iterative action integrated into the development lifecycle.
In-Depth Analysis
The Challenge of AI-Generated Chaos
The Meituan technical team observes that AI is now capable of generating more than 90% of code. This efficiency is a double-edged sword. Without unified specifications and strict constraints, AI does not inherently improve system quality; it can instead amplify existing chaos and technical debt. The core problem the team identifies is that AI can produce code faster than humans can maintain architectural integrity. The focus of technical management must therefore shift from "how to write faster" to "how to constrain AI effectively."
The Framework of AI Coding Management
To address the complexities of a 310,000-line code refactoring project, the team developed a management strategy based on Agent evaluation logic. This approach is structured around four critical components:
- Technical Debt Sorting: This involves a systematic identification of existing issues within the codebase. By understanding the current state of technical debt, the team can provide the AI with a clear roadmap of what needs to be addressed.
- Rule Construction: To prevent the AI from creating further disorder, the team established a set of rigorous rules. These rules act as the boundaries within which the AI must operate, ensuring that all generated or refactored code adheres to specific standards.
- Refactoring SOP (Standard Operating Procedure): By standardizing the refactoring process, the team ensures consistency across the project. This SOP guides the AI through the necessary steps to transform legacy code into a modernized state without introducing new regressions.
- Pre-PR Mechanism: This serves as a critical gatekeeping stage. Before code is even submitted for a Pull Request, it undergoes a preliminary check to ensure it meets the predefined rules and quality standards. This mechanism is essential for catching AI-generated errors early in the cycle.
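The rule construction and Pre-PR components above can be pictured as a small automated check that runs before a Pull Request is opened. The following is a minimal sketch under stated assumptions, not Meituan's actual implementation: the two rules (a function-length limit and a ban on bare `except:`), the `Violation` type, and the names `pre_pr_check` and `gate_passes` are all hypothetical and chosen only for illustration. It uses Python's standard `ast` module to inspect the code.

```python
import ast
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Violation:
    rule: str      # hypothetical rule identifier
    line: int      # line where the problem starts
    message: str   # human-readable explanation


# Hypothetical threshold; a real team would set its own limit.
MAX_FUNC_LINES = 40


def rule_function_length(tree: ast.AST) -> List[Violation]:
    """Flag functions longer than MAX_FUNC_LINES lines."""
    found = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNC_LINES:
                found.append(Violation(
                    "function-length", node.lineno,
                    f"{node.name} is {length} lines (limit {MAX_FUNC_LINES})"))
    return found


def rule_no_bare_except(tree: ast.AST) -> List[Violation]:
    """Flag 'except:' handlers that catch everything."""
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            found.append(Violation(
                "no-bare-except", node.lineno, "bare 'except:' hides errors"))
    return found


# The rule set is plain data, so rules can be added without touching the gate.
RULES: List[Callable[[ast.AST], List[Violation]]] = [
    rule_function_length,
    rule_no_bare_except,
]


def pre_pr_check(source: str) -> List[Violation]:
    """Run every rule against the source and return violations by line."""
    tree = ast.parse(source)
    violations: List[Violation] = []
    for rule in RULES:
        violations.extend(rule(tree))
    return sorted(violations, key=lambda v: v.line)


def gate_passes(source: str) -> bool:
    """The code may proceed to a Pull Request only with zero violations."""
    return not pre_pr_check(source)
```

In this sketch, AI-generated code that fails `gate_passes` would be sent back for another refactoring pass instead of being submitted, and the list of violations gives that pass something concrete to fix.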
From Special Projects to Daily Iteration
One of the most significant outcomes of this practice is the transformation of the refactoring workflow. Traditionally, refactoring 310,000 lines of code would be viewed as a high-cost, one-time "special project" that consumes significant resources and time. By utilizing AI Agents and the described management framework, Meituan has successfully integrated refactoring into the daily development routine. This shift allows for continuous improvement of the codebase alongside regular feature iterations, making the maintenance of high-quality code a sustainable and low-friction activity.
Industry Impact
The methodology shared by Meituan marks a significant milestone in AI-native software engineering. As industry-wide AI adoption grows, the "Agent evaluation" mindset for code management provides a blueprint for other organizations facing similar scaling challenges. It highlights that the future of software development lies not just in the power of Large Language Models (LLMs) to write code, but in the engineering of robust systems that can evaluate, constrain, and direct that power. This approach reduces the long-term maintenance burden and sets a new standard for how large-scale legacy systems can be modernized in the age of AI.
Frequently Asked Questions
Question: Why is "Agent evaluation logic" used for managing AI coding?
Because AI-generated code can quickly lead to system chaos if left unmanaged. Using Agent evaluation logic allows teams to treat the AI as an autonomous entity that requires specific constraints, benchmarks, and feedback loops to ensure its output aligns with human-defined architectural standards.
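One way to picture "constraints, benchmarks, and feedback loops" is a loop that scores each agent output against named checks and feeds the failed check names into the next attempt. The sketch below is an illustration under assumptions and does not come from the source: the `agent` callable, the check names, the scoring formula (fraction of checks passed), and the round limit are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A check takes generated code and returns True when the code is acceptable.
Check = Callable[[str], bool]
# Hypothetical agent interface: (task, names of failed checks) -> generated code.
Agent = Callable[[str, List[str]], str]


def evaluate(output: str, checks: Dict[str, Check]) -> Tuple[float, List[str]]:
    """Return the fraction of checks passed and the names of failed checks."""
    if not checks:
        raise ValueError("at least one check is required")
    failed = [name for name, check in checks.items() if not check(output)]
    score = 1.0 - len(failed) / len(checks)
    return score, failed


def run_with_feedback(
    agent: Agent,
    task: str,
    checks: Dict[str, Check],
    threshold: float = 1.0,
    max_rounds: int = 3,
) -> Tuple[Optional[str], List[float]]:
    """Ask the agent for output, score it, and retry with feedback.

    Returns the first output whose score meets the threshold (or None if
    no round succeeds), plus the score recorded in each round.
    """
    feedback: List[str] = []
    history: List[float] = []
    for _ in range(max_rounds):
        output = agent(task, feedback)
        score, failed = evaluate(output, checks)
        history.append(score)
        if score >= threshold:
            return output, history
        feedback = failed  # the next round sees which checks failed
    return None, history
```

A stub agent that removes a `print` call only after being told the `no-print` check failed would succeed on the second round, with recorded scores of 0.5 and then 1.0. The score history is the part that makes this an evaluation and not just a retry: it records whether successive outputs are converging on the standard.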
Question: What is the role of the Pre-PR mechanism in this process?
The Pre-PR mechanism acts as an automated quality gate. It evaluates AI-generated refactoring results against established rules and standards before they reach the human review stage, ensuring that only high-quality, compliant code is integrated into the main repository.
Question: How does this approach change the cost of code refactoring?
It transforms refactoring from a high-cost, periodic "special project" into a continuous, low-cost daily action. By automating the sorting and refactoring process through AI Agents with strict SOPs, the effort required to maintain code quality is distributed across the entire development lifecycle.

