
Google Gemini Update: New Multi-Reference Image Feature Empowers Users with Enhanced Control Over AI Video and Audio Generation

Google has updated its Gemini app with a new way to control AI video generation. Users can now upload multiple reference images in a single video prompt, and the system generates video and audio from those images and the accompanying text. This gives users more direct control over how the finished video looks and sounds. Google previously tested the feature in Flow, its more extensive AI video platform, which offers higher video quotas than the Gemini app and supports extending existing clips and stitching multiple scenes together. According to Google, Veo 3.1, released in mid-October, improves on Veo 3.0 in texture realism, input fidelity, and audio quality. The update is intended to give creators more flexibility and customization in AI-powered content creation.

AI News - AI Base

Google has recently updated its Gemini app with a feature that gives users more control over AI video generation. Users can now upload multiple reference images in a single video prompt. The system then generates both video and audio from the uploaded images and any accompanying text, giving users more direct influence over how the finished video looks and sounds.

Google previously tested this functionality in Flow, its more extensive AI video platform. Flow supports extending existing video clips and stitching multiple scenes together, and it offers higher video quotas than the Gemini app.

According to Google, Veo 3.1, released in mid-October, improves notably on its predecessor, Veo 3.0, particularly in texture realism, input fidelity, and audio quality. The update gives users more flexibility to create content that matches their specific requirements.

With multiple reference images, creators can bring more personalized elements into their videos and offer audiences a richer visual and auditory experience. The move underscores Google's continued investment in video generation. As user demands grow more diverse, flexibility and customizability are becoming central to AI tools, and Gemini's new feature is expected to draw considerable attention and adoption from creators.
