MiniCPM-o: A Gemini 2.5 Flash-Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Mobile Devices
OpenBMB has introduced MiniCPM-o, a multimodal large language model (MLLM) designed to run on mobile devices. The project positions the model as Gemini 2.5 Flash-level, handling vision, speech, and full-duplex multimodal live streaming directly on the device. The project surfaced via GitHub Trending, which highlights its potential for mobile-centric AI applications.