
Meituan LongCat Team Unveils WBench: The First Systematic Multi-Round Benchmark for Interactive Video World Models
The Meituan LongCat team has announced the open-sourcing of WBench, an evaluation framework for measuring the performance of interactive video world models. As the first systematic multi-round benchmark in this field, WBench serves as a diagnostic tool, likened to a 'CT scanner', for identifying the technical bottlenecks that arise when AI moves from passive video generation to active, multi-turn interaction. By testing models across diverse scenarios ranging from lunar environments to futuristic urban settings, WBench aims to map the current limits of world models and provide a clear roadmap for future development in interactive artificial intelligence.









