AI News on June 28, 2026

Meituan Open Sources Innovative AIGC Poster Generation System Featuring a Technical Closed Loop
Open Source

The Meituan Intelligent Creation Team has developed and open-sourced a complete technical system for AIGC poster generation. The framework is built on a "Generation-Editing-Evaluation" closed loop that covers the creative workflow from initial asset creation to final quality assessment. The technology is already deployed in Meituan's core businesses, including Meituan Waimai (food delivery) and various brand IP scenarios. By open-sourcing the full architecture, Meituan aims to give the broader AI community a foundation for automated design and intelligent content creation, and to help move AIGC from experimentation to practical, high-efficiency industrial use.

Meituan Technical Team
Meituan LongCat Team Open-Sources WBench: The First Systematic Multi-Round Benchmark for Interactive Video World Models
Research Breakthrough

The Meituan LongCat team has released and open-sourced WBench, an evaluation framework for measuring the capabilities of interactive video world models. Described as the first systematic multi-round benchmark of its kind, WBench is positioned as a diagnostic "CT scanner" that pinpoints the specific technical hurdles models face when moving from passive video generation to active, multi-round interaction. It evaluates performance across diverse scenarios, ranging from lunar exploration to complex cybernetic urban environments, to assess how world models understand and react to interactive prompts. The open-source release is meant to give researchers the tools to identify where current models fail and how to improve them.

Meituan Technical Team
Meituan Showcases AI Innovations at ACL 2026: Advancing Large Model Evaluation, Reasoning, and Generative Recommendation Systems
Industry News

Meituan's technical team has announced that six of its research papers were accepted at ACL 2026, a leading conference in computational linguistics and natural language processing. The papers cover large-model evaluation, process reasoning, competition-level mathematical reasoning, reinforcement learning optimization, and generative recommendation systems. Taken together, the work moves from assessing model capabilities to optimizing inference and reasoning in practice, and reflects the company's focus on applying NLP research to real-world computational problems.

Meituan Technical Team
Meituan Technical Team Open-Sources LongCat-Video-Avatar 1.5 for Commercial-Grade Digital Human Video Generation
Open Source

Meituan's technical team has open-sourced LongCat-Video-Avatar 1.5, an updated digital human video model. Rather than targeting state-of-the-art (SOTA) benchmark scores alone, this version is engineered for commercial-grade use. The update improves lip synchronization, physical plausibility, and long-form video stability, enhances multi-person interaction, and optimizes inference efficiency. Designed to perform reliably in complex commercial environments, LongCat-Video-Avatar 1.5 is intended to help digital human technology move from controlled laboratory settings to diverse real-world scenarios, and to support the generation of high-quality, natural digital human content at scale.

Meituan Technical Team
LARYBench Released: Defining the ImageNet for Embodied Action Representations via Large-Scale Human Video Learning
Research Breakthrough

The Meituan Technical Team has introduced LARYBench (Latent Action Representation Yielding Benchmark), a systematic evaluation framework designed to guide the learning of general latent action representations from large visual datasets. Positioned as the 'ImageNet' for embodied AI, LARYBench provides a standardized method for measuring how models understand and execute physical actions. The reported experiments indicate that general vision models outperform specialized action expert models in action generalization and control precision. The results also suggest that embodied action representations can emerge from large-scale human video data, so specialized robotic data may not be the only path to high-level embodied intelligence.

Meituan Technical Team
Meituan LongCat Team Unveils LongCat-AudioDiT: Redefining Zero-Shot Voice Cloning via Waveform Latent Space
Research Breakthrough

Meituan's LongCat team has announced a significant advancement in speech synthesis with the release of LongCat-AudioDiT. This new model aims to overcome the limitations of traditional zero-shot Text-to-Speech (TTS) systems by eliminating intermediate representations like Mel-spectrograms. Instead, it utilizes a diffusion-based approach operating directly within the waveform latent space. This method is designed to prevent the accumulation of cascade errors that often occur during multi-stage data conversion. By allowing the AI to learn the inherent patterns of sound directly, LongCat-AudioDiT pushes the boundaries of high-fidelity voice cloning and streamlined audio generation, marking a technical shift in how AI models interpret and replicate human vocal characteristics.

Meituan Technical Team
Meituan Technical Team Unveils LongCat-Flash-Prover: A New Frontier in Rigorous AI Mathematical Theorem Proving
Product Launch

The Meituan technical team has announced the open-source release of LongCat-Flash-Prover, a specialized model designed to bridge the gap between simple mathematical calculation and rigorous theorem proving. Unlike traditional AI models that focus on reaching a final numerical answer, LongCat-Flash-Prover emphasizes the strict logical chains required for formal mathematical verification. By addressing the limitations of natural language ambiguity—which often leads to the total collapse of a proof—this model aims to transition AI capabilities from speculative "answer guessing" to executing "rigorous proofs." This release marks a significant step in addressing the challenges of complex reasoning and mathematical formalization, providing the global research community with a dedicated tool for high-precision logical tasks.
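
To make the contrast between reaching an answer and producing a machine-checked proof concrete, here is a small Lean 4 illustration. It is not output from LongCat-Flash-Prover, and the summary above does not say which proof language the model targets; Lean is assumed here purely for illustration, and the theorem name is hypothetical.

```lean
-- Checking a single instance only settles that one instance.
example : 2 + 3 = 3 + 2 := rfl

-- A formal proof must hold for every pair of natural numbers, and the
-- proof checker rejects it if any step in the logical chain is missing.
theorem add_comm_illustration (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The first statement is the kind of result a calculation can confirm; the second is a general claim that is only accepted once every step is justified.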

Meituan Technical Team
Meituan Releases LongCat-Next: A Native Multimodal Model Designed to Perceive and Interact with the Physical World
Open Source

Meituan's technical team has officially announced the release and open-sourcing of LongCat-Next, a native multimodal model that represents a major step toward physical-world AI. By integrating vision and speech as native modalities—essentially the AI's "mother tongue"—LongCat-Next is designed to bridge the gap between digital processing and real-world interaction. Alongside the model, Meituan has open-sourced its discrete tokenizer, providing the developer community with the core tools needed to build systems that can perceive, understand, and act within the physical environment. This initiative underscores Meituan's commitment to advancing AI capabilities beyond text-based interfaces, focusing on the practical application of intelligence in complex, real-world scenarios through an open-source research philosophy.

Meituan Technical Team
Zhang Xuefeng Skill: An AI-Generated Cognitive Operating System for Career and Education Planning
Open Source

The GitHub repository 'zhangxuefeng-skill,' developed by user alchaincyf, has introduced a specialized 'cognitive operating system' designed to streamline decision-making in the educational and professional sectors. Based on the methodologies of renowned education consultant Zhang Xuefeng, the project provides a practical thinking framework for college entrance exams, postgraduate applications, and career planning. Generated through the 'Nuwa.skill' platform, this repository represents a significant step in the digitization of expert knowledge into structured, AI-driven frameworks. By offering a systematic approach to high-stakes academic transitions, the project aims to equip users with a repeatable logic for navigating complex educational landscapes and professional development paths.

GitHub Trending
Openpilot: The Robotics Operating System Revolutionizing Driver Assistance for Over 300 Vehicles
Industry News

Openpilot, an open-source robotics operating system developed by comma.ai (GitHub organization: commaai), upgrades the driver assistance capabilities of existing cars. According to the latest project updates, openpilot now supports more than 300 vehicle models, giving them a common platform for advanced driving features. As a robotics OS, it sits between traditional automotive hardware and modern automated-driving software. The growing list of supported models reflects the broader trend toward software-defined vehicle enhancements and wider access to driver assistance systems across automotive platforms.

GitHub Trending
Google Labs Introduces DESIGN.md: A New Format Specification for Describing Visual Identities to AI Coding Agents
Open Source

Google Labs has unveiled DESIGN.md, a specialized format specification designed to bridge the gap between design systems and AI-driven development. The specification provides a standardized way to describe visual identities to coding agents, ensuring they maintain a persistent and structured understanding of design requirements. By formalizing how design information is communicated to machines, DESIGN.md aims to improve the accuracy and consistency of UI/UX implementation in automated coding workflows. This initiative, hosted on GitHub, represents a significant step toward making design systems machine-readable and actionable for the next generation of AI software engineering tools, allowing agents to move beyond simple prompts toward a deeper, more durable comprehension of brand and interface guidelines.

GitHub Trending
MinerU: Transforming Complex PDF and Office Documents into LLM-Ready Data for Agentic Workflows
Open Source

MinerU, a specialized tool developed by OpenDataLab, addresses a critical bottleneck in the AI development lifecycle: the conversion of unstructured, complex documents into machine-readable formats. By transforming PDF and Microsoft Office files into structured Markdown and JSON, MinerU provides the essential data foundation required for modern Large Language Model (LLM) applications. Specifically designed to support Agentic workflows, the tool ensures that AI agents can consume and process information with high fidelity. This release marks a significant step forward in streamlining data ingestion pipelines, allowing developers to move beyond the challenges of legacy document parsing and focus on building sophisticated, autonomous AI systems that rely on accurate, structured data inputs.

GitHub Trending
SoftBank CEO and Industry Experts Question the Hype Surrounding Elon Musk’s Orbital Data Center Vision
Industry News

Recent reports describe growing skepticism about Elon Musk’s vision for orbital data centers. SoftBank’s CEO is among the prominent industry figures questioning the project's feasibility and the hype surrounding it. While space-based data infrastructure has captured public imagination, it has not won over all major technology investors and experts. Their doubts point to a possible gap between the visionary claims and the practical realities of operating data centers in orbit, and stakeholders are now asking for more substantive evidence of the concept's long-term viability.

TechCrunch AI
Adrafinil: A New macOS Utility Designed to Keep Laptops Awake Exclusively During AI Agent Activity
Product Launch

Adrafinil is an innovative macOS menu bar application that introduces a "eugeroic" approach to machine power management. Unlike traditional utilities that keep a computer awake indefinitely, Adrafinil prevents a Mac from sleeping—including in clamshell (lid-closed) mode—only while an AI coding agent is actively performing a task. Supporting popular agents such as Claude Code, Codex, and Cursor, the tool ensures that long-running AI sessions are not interrupted when the user closes the laptop lid. Once the agent completes its work and releases the session, Adrafinil allows the system to return to its normal sleep behavior immediately. By utilizing a secure, audited helper for privileged sleep control and standard system assertions, Adrafinil offers a specialized solution for developers and AI users who require automated, task-aware system wakefulness.
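
The task-scoped wakefulness described above can be approximated with the `caffeinate` tool that ships with macOS, whose `-w` flag holds a power assertion until a given process exits. The following is a minimal sketch and not Adrafinil's implementation: the helper names `caffeinate_command` and `keep_awake_while` are hypothetical, and a plain `caffeinate` assertion is generally not enough to keep a laptop awake with the lid closed, which the summary says Adrafinil handles through a privileged helper.

```python
import shutil
import subprocess
from typing import Optional


def caffeinate_command(pid: int) -> list[str]:
    """Build a caffeinate invocation tied to the lifetime of `pid`.

    -i  prevents idle sleep while the assertion is held
    -w  makes caffeinate exit, releasing the assertion, when `pid` exits
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return ["caffeinate", "-i", "-w", str(pid)]


def keep_awake_while(pid: int) -> Optional[subprocess.Popen]:
    """Start caffeinate for `pid`; return None where caffeinate is unavailable."""
    if shutil.which("caffeinate") is None:  # not macOS, or the tool is missing
        return None
    return subprocess.Popen(caffeinate_command(pid))
```

Calling `keep_awake_while(agent_pid)` with the process ID of a running agent would hold the assertion only for as long as that process lives, which mirrors the "awake only while the agent works" behavior in spirit.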

Hacker News
Margaret Atwood Critiques AI Development: The 'Garbage In, Garbage Out' Challenge for Generative Models
Industry News

Acclaimed author Margaret Atwood, known for 'The Handmaid's Tale' and 'The Blind Assassin,' recently shared her critical perspective on artificial intelligence at the Babell Literary and Cultural Festival in Porto, Portugal. Atwood characterized the fundamental flaw of current AI systems using the classic computing adage 'garbage in, garbage out' (GIGO). Having personally experimented with AI tools, the author expressed skepticism regarding the technology's ability to produce high-quality literary work when the underlying training data is flawed or derivative. Her comments highlight a growing concern among creative professionals about the data sources powering large language models and the resulting impact on the quality of machine-generated prose. This critique serves as a significant intervention in the ongoing debate over AI's role in the arts and the necessity of data integrity.

The Verge
Apple Vision Pro Vice President Paul Meade Reportedly Departs to Join OpenAI Hardware Team
Industry News

Paul Meade, the Apple vice president who has led the Vision Pro headset division, is reportedly leaving the company to join OpenAI’s hardware team. The departure is a notable change in Apple’s executive leadership, particularly within its headset group. Meade’s move also highlights OpenAI's ongoing efforts to build out its hardware capabilities by recruiting established talent from industry leaders, as the company works at the intersection of physical devices and advanced artificial intelligence.

TechCrunch AI