AI News on November 18, 2025

Technology

ElevenLabs Unveils "Image & Video Platform": A Super AI Content Factory for Integrated Visuals, Audio, and Music Generation, Revolutionizing Content Creation Workflow

ElevenLabs, a leader in multimodal AI, has launched its new "Image & Video Platform," transforming from a voice-only tool into a comprehensive AI content factory. This platform integrates image generation, video generation, voice synthesis, music creation, and sound effect design, enabling creators and marketers to produce commercial-grade videos from script to final product within a single interface. It eliminates the need for switching between multiple platforms by seamlessly combining visual generation with ElevenLabs' audio capabilities. The platform incorporates top multimodal models like Google Veo, OpenAI Sora, and Kling, alongside ElevenLabs' proprietary AI voice and music generation. Designed for commercial use, it supports various aspect ratios, includes a commercial-safe audio library, offers multi-language narration replacement, and features a timeline editor for precise synchronization. Official demonstrations show a 30-second brand advertisement can be created in just five minutes, significantly boosting content production efficiency.

AI News - AI Base
Technology

Google DeepMind Unveils SIMA 2: A General-Purpose AI Agent Powered by Gemini, Achieving Near-Human Performance in Complex 3D Virtual Worlds with Enhanced Reasoning and Self-Improvement

Google DeepMind has launched SIMA 2, an upgraded general-purpose AI agent designed to navigate and perform tasks in complex 3D game environments. Building on its predecessor, SIMA 1 (released in 2024), SIMA 2 integrates the Gemini 2.5 Flash Lite model as its core reasoning engine, enabling it to better understand goals, interpret plans, and continuously learn through self-improvement. While SIMA 1 achieved a 31% task completion rate across more than 600 language instructions, SIMA 2 doubles this to 62%, nearing the 71% completion rate of human players. SIMA 2 keeps the same interface but changes from a mere instruction executor into an interactive game partner that can explain its intentions and answer questions about its goals. It also expands its instruction channels to include voice, graphics, and emojis, and can interpret abstract requests. Furthermore, SIMA 2 features a self-improvement mechanism in which it learns from its own experience in new games: the Gemini model generates and scores new tasks, and the agent goes on to succeed in previously failed scenarios without additional human demonstrations. DeepMind also showcased SIMA 2 working with Genie 3, which generates interactive 3D environments from a single image or text prompt for the agent to act in, marking a significant step towards advanced real-world robotics.
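To put the quoted completion rates in perspective, the short Python sketch below works through the arithmetic. It uses only the three percentages reported above. The variable names are illustrative and do not come from DeepMind.

```python
# Task completion rates quoted in the summary above (percent).
sima1, sima2, human = 31, 62, 71

relative_gain = sima2 / sima1    # how many times SIMA 1's rate SIMA 2 reaches
gap_to_human = human - sima2     # remaining gap, in percentage points
share_of_human = sima2 / human   # SIMA 2's rate as a fraction of the human rate

print(f"SIMA 2 vs SIMA 1: {relative_gain:.1f}x")
print(f"Gap to human players: {gap_to_human} percentage points")
print(f"SIMA 2 as share of human rate: {share_of_human:.0%}")
```

On these figures, SIMA 2 doubles SIMA 1's rate and sits 9 percentage points below human players.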

AI News - AI Base
Product

ElevenLabs Unveils Image & Video (Beta): An All-in-One AI Content Creation Platform for Visuals, Audio, and Music Generation

ElevenLabs has officially launched Image & Video (Beta), a comprehensive AI content creation platform designed for creators and marketers. This integrated platform combines image, video, sound, music, and sound effect generation capabilities. It leverages leading multimodal generative models such as Veo, Kling, and Sora to enable rapid visual content creation. Users can directly synthesize voices, overlay narrations, and edit soundtracks within the ElevenLabs platform, producing commercial and creative video content. The platform supports a streamlined workflow, including image/video generation, audio/voiceover addition with lip-sync, background music and sound effect editing, multi-segment synthesis, and ultra-resolution enhancement via Topaz integration. It aims to provide a unified creative environment, eliminating the need for multiple tools and catering to content creators, marketing teams, educators, and game developers.

Xiaohu.AI Daily
Technology

Poe AI Launches Group Chat Feature: Up to 200 Users Collaborate with 200+ AI Models for Enhanced Interactive Experiences

Poe, a prominent AI platform, has officially introduced a new 'Group Chat' feature, integrating multi-model AI with real-time multi-person interaction. This innovation allows up to 200 users to join a single chat and collaborate with hundreds of AI models for diverse scenarios like travel planning and creative brainstorming. The feature supports seamless collaboration with any AI model, eliminating the need to switch tools and enabling simultaneous use of various AI types within one chatroom. It is compatible with over 200 AI models, including text, image, video, and audio, and allows custom bot integration. Users can mix and match top models like GPT-5.1, Claude 4.5, Gemini 2.5, Sora 2, and Veo 3.1 for comprehensive content creation. The group chat also offers cross-device synchronization for desktop and mobile, ensuring uninterrupted collaboration. This feature is poised to transform enterprise meetings, online education, and virtual communities, making advanced AI resources more accessible.

AI News - AI Base
Technology

xAI Unveils Grok 4.1: A Leap in Emotional Intelligence and Personality Coherence for AI Models, Outperforming Rivals in Key Benchmarks

xAI has officially launched Grok 4.1, aiming for a more natural and credible AI experience beyond a mere question-answering machine. The update significantly enhances creativity, emotional intelligence, and collaborative capabilities, focusing on nuanced intent understanding and consistent personality. Grok 4.1 utilizes a large-scale reinforcement learning (RL) infrastructure and uses agentic reasoning models as reward models for self-improvement. Key advancements include novel reward modeling, where high-order reasoning models automatically review and refine responses, reducing reliance on manual annotation. It also introduces 'Personality Alignment' as an optimization goal, adding emotional expression rewards and personality coherence metrics to its training. This is intended to keep the model's identity stable and its tone consistent across conversations, fostering a sense of continuity for users. Performance assessments show Grok 4.1 leading LMArena's Text Arena leaderboard with 1483 Elo, outperforming Gemini 2.5 Pro, Claude, and GPT-4.5. It also achieved the highest scores in EQ-Bench3 for emotional empathy and human-like responses and ranked second only to the GPT-5 series in the Creative Writing v3 benchmark. Furthermore, Grok 4.1 reduced its information error rate by approximately 65% and cut hallucination occurrences by a factor of three.

Xiaohu.AI Daily