AI News on July 4, 2026

Meituan Launches LongCat-2.0: A 1.6 Trillion Parameter Model Trained on Domestic Computing Clusters
Industry News

Meituan's technology team has announced the release of LongCat-2.0, a 1.6-trillion-parameter model. The release is described as the industry's first model of this scale to complete its entire training and inference lifecycle on a domestic computing cluster of 50,000 cards. LongCat-2.0 was pre-trained from scratch and uses a dynamic activation architecture, with an average of 48B parameters active during operation. It has a native 1-million-token (1M) context window and is optimized for agentic coding tasks, with the stated goal of more stable and efficient code understanding, generation, and execution in modern software development environments.
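
For a sense of scale, the two parameter counts quoted above imply that only a small share of the model's weights is active at any one time. The following is a back-of-the-envelope check that takes the reported 1.6 trillion total and 48B average active parameters at face value; the variable names are illustrative and nothing here reflects Meituan's actual tooling.

```python
# Figures as quoted in the summary; names are illustrative only.
TOTAL_PARAMS = 1.6e12       # 1.6 trillion total parameters
AVG_ACTIVE_PARAMS = 48e9    # 48B parameters active on average

# Share of the model's weights that is active on average.
active_fraction = AVG_ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Average active share: {active_fraction:.1%}")
```

On these numbers, roughly 3% of the parameters are active on average.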

Meituan Technical Team
Meituan Technical Team Presents Selected Academic Research at ICML 2026
Industry News

The Meituan Technical Team has announced its participation in the International Conference on Machine Learning (ICML) 2026, showcasing a selection of academic papers. As one of the most influential international academic conferences in the field, ICML serves as a premier platform for discussing the critical challenges and core issues facing the future of machine learning. Meituan's involvement highlights its commitment to contributing to frontier research that possesses both significant theoretical value and practical impact. By engaging with this global community, the Meituan Technical Team aims to help drive the development of the field and influence future research directions through the evaluation and dissemination of high-impact research results.

Meituan Technical Team
LongCat Releases VitaBench 2.0: A Pioneering Benchmark for Long-Term Dynamic AI Agent Evaluation
Research Breakthrough

The LongCat team has officially released VitaBench 2.0, marking a significant milestone in the evaluation of artificial intelligence agents. As the first benchmark specifically designed for long-term dynamic user modeling in real-life scenarios, VitaBench 2.0 provides a systematic framework to assess Large Language Models (LLMs). The benchmark focuses on two critical dimensions: personalization and proactivity. By simulating authentic, evolving user interactions over extended periods, VitaBench 2.0 aims to bridge the gap between laboratory testing and real-world application, ensuring that AI agents can effectively adapt to individual user needs and take initiative in complex, dynamic environments.

Meituan Technical Team
Meituan Technical Team Showcases Cutting-Edge AI Research in Search and Recommendation at Top Global Conferences
Industry News

Meituan's Business R&D Platform/Search & Recommendation ASX (Agentic System X) team has recently shared insights from their latest research published at premier AI conferences. Focusing on the development of an Agent technology system powered by Large Language Models (LLMs), the team has made significant strides in LLM post-training, Agentic Reinforcement Learning, and multi-modal understanding. With dozens of papers accepted by prestigious venues such as ICLR, NeurIPS, CVPR, and AAAI, Meituan is positioning itself at the forefront of AI innovation. This special feature highlights six selected papers that demonstrate the team's commitment to advancing search and recommendation technologies through sophisticated agentic systems and multi-modal integration, providing valuable insights for the broader AI research community.

Meituan Technical Team
Meituan Open Sources Innovative AIGC Poster Generation Framework Featuring a Comprehensive Technical Closed Loop
Open Source

Meituan's intelligent creation team has announced the development and open-sourcing of a robust AIGC technical system designed for automated poster generation. This system is built upon a unique "Generation-Editing-Evaluation" closed loop, ensuring a streamlined workflow from initial content creation to final quality control. The technology has already seen successful implementation in high-traffic commercial scenarios, including Meituan Waimai (food delivery) and various brand IP developments. By open-sourcing this entire technical framework, Meituan provides the global developer community with a proven model for integrating generative AI into professional marketing and design workflows, marking a significant step in the democratization of intelligent design tools.

Meituan Technical Team
Meituan LongCat Team Open-Sources WBench: The First Systematic Multi-Round Benchmark for Interactive Video World Models
Research Breakthrough

The Meituan LongCat team has officially introduced and open-sourced WBench, a groundbreaking evaluation benchmark designed specifically for interactive video world models. As the first systematic framework of its kind, WBench focuses on multi-round interactions, moving beyond traditional passive video observation. Described by the developers as a "CT scanner" for AI, the tool is engineered to precisely diagnose the limitations of current world models as they attempt to transition from "passive viewing" to "active interaction." By testing the boundaries of these models in diverse scenarios—ranging from lunar environments to cybernetic cities—WBench provides a critical diagnostic layer for the industry. This open-source initiative aims to identify exactly where models fail in interactive sequences, offering a structured path forward for the development of more responsive and capable world models.

Meituan Technical Team
Meituan LongCat Launches General 365: New Reasoning Benchmark Reveals AI Performance Gaps
Industry News

Meituan's LongCat team has officially released General 365, a new evaluation benchmark specifically designed to measure the reasoning capabilities of large language models. In a comprehensive assessment of 26 mainstream AI models, the benchmark revealed a significant struggle across the industry to handle complex reasoning tasks. According to the results, Gemini 3 Pro emerged as the top performer but only managed an accuracy rate of 62.8%. Most notably, the vast majority of the models tested failed to reach the 60% accuracy threshold, which is considered the passing mark. This release by Meituan's technical team establishes a more rigorous standard for AI evaluation, highlighting that even the most advanced models currently available face substantial challenges in logical reasoning.

Meituan Technical Team
Meituan Fulfillment AI Team Showcases LLM-Based Agent Technology and ACL 2026 Research Breakthroughs
Industry News

Meituan's Fulfillment AI Algorithm Team is advancing the integration of Large Language Model (LLM) Agent systems into its core business operations. By focusing on a self-evolving Agent operating system, the team leverages cutting-edge techniques such as Continuous Pre-Training (CPT), Post-training, Agentic Reinforcement Learning (RL), and multi-modal understanding. Their research has gained significant international recognition, with dozens of papers published at top-tier AI conferences including ACL and EMNLP. This latest technical session highlights their contributions to ACL 2026, demonstrating how AI-driven agents are being utilized to optimize fulfillment services. The team's work represents a major step in applying theoretical AI research to solve real-world logistics and operational challenges through autonomous, evolving systems.

Meituan Technical Team
Meituan Showcases AI Innovation at ACL 2026: Six Papers Redefining LLM Evaluation, Reasoning, and Generative Systems
Industry News

Meituan's technical team has achieved significant recognition at ACL 2026, a premier international conference for computational linguistics and natural language processing. The team had six papers accepted, showcasing advancements across several critical AI domains. These research contributions span large model evaluation, complex process reasoning, and the optimization of competition-level mathematical thinking. Furthermore, the papers delve into reinforcement learning enhancements and the development of generative recommendation systems. By addressing these diverse technical challenges, Meituan aims to establish new paradigms for generative AI, focusing on both theoretical improvements and practical application optimizations within the NLP landscape. This selection highlights Meituan's commitment to pushing the boundaries of how Large Language Models (LLMs) are evaluated and utilized in real-world scenarios.

Meituan Technical Team
Meituan Open-Sources LongCat-Video-Avatar 1.5: A Major Leap Toward Commercial-Grade Digital Human Video Generation
Open Source

Meituan's technical team has officially open-sourced LongCat-Video-Avatar 1.5, marking a significant transition from experimental state-of-the-art (SOTA) research to practical, commercial-grade applications. This updated model introduces comprehensive improvements in five key areas: lip-sync accuracy, physical plausibility, long-form video stability, multi-person interaction, and inference efficiency. Designed to handle complex commercial scenarios, LongCat-Video-Avatar 1.5 moves digital human technology from controlled 'rehearsal' environments to the 'real stage' of diverse, high-quality content generation. By focusing on stability and natural movement, the model enables the creation of personalized digital humans that can interact naturally in various business contexts, providing a robust tool for the AI industry's move toward scalable, high-fidelity video production.

Meituan Technical Team
Chrome DevTools MCP: Empowering AI Programming Agents with Browser Debugging Capabilities
Product Launch

ChromeDevTools has officially released 'chrome-devtools-mcp', a specialized tool designed to integrate Chrome's powerful developer environment with programming agents. Hosted on GitHub and distributed via NPM, this project marks a significant step in making web debugging and inspection tools accessible to autonomous AI entities. By leveraging the Model Context Protocol (MCP), the tool allows agents to interact directly with the browser's internal state, facilitating a more seamless workflow for AI-driven web development and automated troubleshooting. This release highlights the growing trend of adapting traditional developer tools for the era of artificial intelligence, ensuring that agents have the necessary context to perform complex programming tasks within the browser.

GitHub Trending
Superpowers: A Comprehensive Methodology and Skill Framework for AI Programming Agents
Open Source

Superpowers is an innovative framework designed to provide a structured software development methodology for AI programming agents. Created by developer 'obra' and featured on GitHub Trending, the project offers a proven approach to agent-led development by utilizing a system of composable skills and foundational instructions. This framework aims to standardize how agents approach programming tasks, ensuring a more reliable and efficient development lifecycle. By focusing on modularity and clear initial guidance, Superpowers enables developers to build more capable and predictable AI agents for complex software engineering projects. The framework represents a shift toward more disciplined and architectural approaches in the field of autonomous AI development, providing the necessary tools to transform raw AI capabilities into effective programming assistants.

GitHub Trending
Browser-use Launches video-use: A New Paradigm for Editing Videos via Programming Agents
Open Source

The GitHub repository "video-use," developed by the browser-use organization, has emerged as a significant trending project in the open-source community. The project introduces a specialized approach to multimedia manipulation by utilizing programming agents to perform video editing tasks. By shifting the focus from manual graphical interfaces to agentic, code-driven workflows, video-use aims to automate the complexities of video post-production. This development highlights a growing trend in the AI industry where autonomous agents are being tasked with high-level creative and technical execution. As an open-source tool, it provides a foundation for developers to integrate intelligent automation into video processing pipelines, marking a transition from simple generative AI to functional, action-oriented agentic systems.

GitHub Trending
Agency-Agents: Revolutionizing Workflow Automation with Specialized AI Expert Teams
Open Source

Agency-Agents, a new open-source project by developer msitarzewski, introduces a comprehensive framework designed to function as a complete AI agency. The project moves beyond general-purpose AI by offering a suite of specialized agents, including frontend development experts, Reddit community managers, creative injectors, and reality checkers. Each agent is designed with a specific personality, professional workflow, and mature delivery capabilities. By structuring AI as a ready-to-use team of experts, Agency-Agents aims to provide businesses and developers with a plug-and-play solution for complex project execution. This approach highlights a significant shift in the AI industry toward specialized, agentic workflows where multiple autonomous entities collaborate to achieve professional-grade results across various domains such as development, marketing, and creative strategy.

GitHub Trending
Strix: The Open-Source AI Penetration Testing Tool Revolutionizing Vulnerability Discovery and Remediation
Open Source

Strix has emerged as a significant open-source project on GitHub, offering an AI-powered approach to penetration testing. The tool is specifically designed to help developers and security teams discover and fix application vulnerabilities through automated processes. By combining artificial intelligence with traditional security testing methodologies, Strix aims to provide a comprehensive solution for maintaining robust application security. This analysis explores the core functionality of Strix, its role in the open-source community, and the broader implications of AI-driven security tools in the modern software development lifecycle. As an open-source initiative, it emphasizes transparency and collaborative improvement in the fight against evolving cyber threats.

GitHub Trending
Career-Ops: An AI-Driven Job Search System Leveraging Claude Code and Go Dashboards
Open Source

Career-Ops is a newly trending open-source project developed by santifer that introduces an AI-driven approach to career management and job searching. Built on Claude Code, the system offers 14 specialized skill modes, a high-performance dashboard written in Go, and automated PDF generation. To streamline the often-tedious process of job hunting, Career-Ops also supports batch processing so that multiple tasks can be handled simultaneously. This analysis explores the technical components of the project, its reliance on Anthropic's Claude Code for intelligent automation, and how its multi-mode skill approach aims to change the way professionals interact with the modern job market.

GitHub Trending
Comprehensive Fitness Training Dataset Featuring 433 Exercises Released on GitHub for AI and App Development
Open Source

A significant new resource for the health and fitness technology sector has emerged on GitHub. Titled 'exercises-dataset' and authored by hasaneyldrm, this comprehensive repository provides a structured collection of 433 distinct fitness training entries. Each exercise in the dataset is meticulously documented with essential metadata, including its name, category, target muscle groups, and required equipment. Beyond text-based instructions, the dataset distinguishes itself by including visual components such as thumbnails and animated videos for every entry. This multi-modal approach offers a robust foundation for developers looking to build AI-driven workout planners, fitness tracking applications, or educational platforms. By providing high-quality, structured data openly, the project aims to streamline the development of digital fitness solutions and enhance the accuracy of exercise recognition and guidance systems.

GitHub Trending
Caveman Prompting: Reducing Claude Code Token Consumption by 65% Through Simplified Communication
Open Source

A new GitHub project titled 'caveman,' developed by JuliusBrussee, introduces a specialized skill for Claude Code designed to drastically optimize token usage. By adopting a 'primitive' or 'caveman-like' communication style, the tool claims to reduce token consumption by up to 65%. This approach challenges the standard practice of using verbose natural language in AI interactions, focusing instead on extreme brevity and structural simplicity. The project highlights a significant trend in prompt engineering where efficiency and cost-effectiveness are prioritized. By stripping away linguistic redundancies, 'caveman' allows developers to maximize the utility of Large Language Models (LLMs) while minimizing the overhead associated with token-based billing and context window limitations.
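
To make the claimed saving concrete, the sketch below estimates how many tokens would remain after a given reduction. It assumes the project's best-case figure of 65%; the function name and the 10,000-token baseline are hypothetical and are not taken from the 'caveman' repository.

```python
def tokens_after_reduction(baseline_tokens: int, reduction: float = 0.65) -> int:
    """Estimate the token count left after cutting `reduction` of the baseline.

    `reduction` defaults to the project's claimed upper bound of 65%;
    real savings may be lower.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


# Hypothetical baseline: a session that would otherwise use 10,000 tokens.
remaining = tokens_after_reduction(10_000)
print(f"Tokens remaining at the claimed best case: {remaining}")
```

Because the 65% figure is an upper bound ("up to"), the result is a floor on remaining tokens rather than a typical outcome.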

GitHub Trending
Open-Source Steam Controller Auto-Charge Project Uses Computer Vision and Haptic Pulses for Autonomous Magnetic Docking
Open Source

The Steam Controller Auto-Charge is an innovative open-source web application designed to enable a Steam Controller to autonomously navigate to its magnetic charging puck. By leveraging OpenCV.js for optical flow tracking via an overhead camera and WebHID for telemetry, the system guides the controller using asymmetric haptic pulses generated by its internal Linear Resonant Actuators (LRAs). The project features a specialized 'Proximity Creep Mode' that reduces haptic frequency for gentle docking and provides real-time battery monitoring by intercepting specific HID reports. Built with the Nix package manager for cross-platform compatibility, this tool demonstrates a unique intersection of computer vision, web-based hardware communication, and creative haptic engineering to solve a practical hardware charging challenge.
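
The idea behind 'Proximity Creep Mode' (lowering haptic frequency as the controller nears the puck) can be illustrated with a small sketch. This is not the project's code: the function name, the pixel-based coordinates, the frequencies, and the creep radius are all assumptions chosen for illustration.

```python
import math


def pulse_frequency_hz(
    controller_xy: tuple[float, float],
    dock_xy: tuple[float, float],
    cruise_hz: float = 200.0,       # hypothetical normal pulse frequency
    creep_hz: float = 60.0,         # hypothetical reduced frequency near the dock
    creep_radius_px: float = 80.0,  # hypothetical distance at which creep mode starts
) -> float:
    """Pick a haptic pulse frequency from the tracked distance to the dock."""
    distance = math.dist(controller_xy, dock_xy)
    # Inside the creep radius, slow the pulses for a gentler approach.
    return creep_hz if distance <= creep_radius_px else cruise_hz


# Far from the dock (distance 500 px) versus close to it (distance 50 px).
print(pulse_frequency_hz((0, 0), (300, 400)))
print(pulse_frequency_hz((0, 0), (30, 40)))
```

In the real system the positions would come from the overhead camera's optical flow tracking; here they are passed in directly to keep the sketch self-contained.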

Hacker News
Mistral AI Unveils Leanstral 1.5: A New Era of Open Source Formal Verification and Proof Engineering
Product Launch

Mistral AI has announced the release of Leanstral 1.5, a specialized open-source model designed to advance formal verification in the Lean 4 programming language. Released under the Apache-2.0 license, the model features 6 billion active parameters out of a total 119 billion, balancing computational efficiency with high-level reasoning. Leanstral 1.5 has demonstrated exceptional performance, saturating the miniF2F benchmark and solving 587 out of 672 PutnamBench problems. Beyond theoretical benchmarks, the model has proven its practical utility in agentic proof engineering by identifying five previously unknown bugs in real-world open-source repositories. Trained through a rigorous three-stage process including reinforcement learning with CISPO, Leanstral 1.5 is now available via Hugging Face and a free API, aiming to democratize access to rigorous formal methods for developers and researchers.
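
The reported figures can be turned into ratios with a quick calculation. The sketch below only restates the numbers quoted above (587 of 672 PutnamBench problems, 6 billion active out of 119 billion total parameters); the variable names are illustrative.

```python
# Figures as quoted in the summary; names are illustrative only.
SOLVED, TOTAL_PROBLEMS = 587, 672          # PutnamBench problems solved / total
ACTIVE_B, TOTAL_B = 6, 119                 # active / total parameters, in billions

solve_rate = SOLVED / TOTAL_PROBLEMS
active_share = ACTIVE_B / TOTAL_B
print(f"PutnamBench solve rate: {solve_rate:.1%}")
print(f"Active parameter share: {active_share:.1%}")
```

On these numbers the model solves roughly 87% of PutnamBench while activating about 5% of its parameters.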

Hacker News