Technology, AI, Innovation, Video Editing

Google Gemini Update: New Multi-Reference Image Feature Empowers Users with Enhanced Control Over AI Video and Audio Generation

Google has rolled out an update to its Gemini application that adds a new way to control AI video generation. Users can now upload multiple reference images within a single video prompt, and the system generates video and audio based on these images and the accompanying text. This gives users more direct control over how the finished video looks and sounds. Google had previously tested the feature in Flow, its more extensive AI video platform, which offers higher video quotas and supports extending existing video clips and stitching multiple scenes together. According to Google, Veo 3.1, released in mid-October, improves on Veo 3.0 in texture realism, input fidelity, and audio quality. The update aims to give creators greater flexibility and customization in AI-powered content creation.

AI News - AI Base

Google has recently updated its Gemini application, introducing a significant new feature that enhances user control over AI video generation. Users can now upload multiple reference images within a single video prompt. The system will then generate both video and audio content based on these uploaded images and any accompanying text, giving users more direct influence over the final appearance and sound of their videos.

Google previously tested this functionality in Flow, its more extensive AI video platform. Flow supports extending existing video clips and stitching multiple scenes together, and it also provides higher video quotas than the Gemini application.

According to Google, Veo 3.1, which was released in mid-October, shows notable improvements over its predecessor, Veo 3.0, particularly in texture realism, input fidelity, and overall audio quality. The update lets users work with AI tools more flexibly and create content that more closely matches their specific requirements.

Because creators can upload multiple reference images, they can bring more personalized elements into their videos and give audiences a richer visual and auditory experience. The move underscores Google's continued investment in video generation. As user demands grow more diverse, the flexibility and customizability of AI tools matter more, and Gemini's new feature is expected to draw considerable attention and adoption from creators.
