
VideoPoet by Google

VideoPoet – Google Research


VideoPoet is a language-model-based system from Google Research for zero-shot video generation. It turns text prompts into high-quality videos.



What is VideoPoet?

VideoPoet is a modeling method developed by Google Research that turns an autoregressive large language model (LLM) into a high-quality video generator. It uses a pre-trained MAGVIT-v2 video tokenizer and a SoundStream audio tokenizer to convert images, video, and audio clips into sequences of discrete codes. These codes are compatible with text-based language models, which is what makes multimodal generation possible.


  • Text-to-Video: Generate high-motion, variable-length videos from text prompts.
  • Image-to-Video: Create videos from any image input, guided by text prompts.
  • Video Editing: Edit videos by specifying different motions and styles.
  • Stylization: Apply stylistic transformations to videos based on text prompts.
  • Inpainting: Fill in masked or missing regions of a video.
  • Zero-shot Capabilities: Perform tasks such as text-to-audio without task-specific training.
  • Video-to-Audio: Generate matching audio for videos without text guidance.

Use Cases

VideoPoet can be used for various creative and professional applications, such as:

  • Content Creation: Generate engaging videos for social media or marketing campaigns.
  • Entertainment: Produce short films or animated stories from simple text prompts.
  • Education: Create educational videos and visual content dynamically.
  • Research: Explore new ways of integrating multimodal data for research purposes.


How does VideoPoet work?

VideoPoet uses pre-trained video and audio tokenizers to convert multimedia inputs into a unified code vocabulary. These codes are then processed by an autoregressive language model to predict the next video or audio token, generating new content based on the input prompts.
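
The idea of a unified code vocabulary followed by next-token prediction can be pictured with a toy sketch. Everything here is an assumption for illustration: the codebook sizes, the offset scheme, and the names `to_unified`, `from_unified`, and `toy_next_token` are hypothetical, and the "model" is a random sampler standing in for a trained language model. It is not VideoPoet's actual implementation.

```python
import random

# Hypothetical codebook sizes, chosen only for illustration.
VOCAB_SIZES = {"text": 1000, "video": 4096, "audio": 1024}

# Give each modality its own contiguous id range in one shared vocabulary.
OFFSETS = {}
_next_id = 0
for _name, _size in VOCAB_SIZES.items():
    OFFSETS[_name] = _next_id
    _next_id += _size
TOTAL_VOCAB = _next_id


def to_unified(modality, codes):
    """Map modality-local codes to ids in the shared vocabulary."""
    size = VOCAB_SIZES[modality]
    for code in codes:
        if not 0 <= code < size:
            raise ValueError(f"code {code} out of range for {modality}")
    return [OFFSETS[modality] + code for code in codes]


def from_unified(token):
    """Recover (modality, local code) from a shared-vocabulary id."""
    if not 0 <= token < TOTAL_VOCAB:
        raise ValueError(f"token {token} outside the shared vocabulary")
    # Check the highest offset first so each id maps to exactly one modality.
    for name in reversed(list(VOCAB_SIZES)):
        if token >= OFFSETS[name]:
            return name, token - OFFSETS[name]


def toy_next_token(context, rng, modality):
    """Stand-in for a trained model: samples uniformly within one modality."""
    return OFFSETS[modality] + rng.randrange(VOCAB_SIZES[modality])


def generate(prompt, n_new, modality="video", seed=0):
    """Autoregressive loop: append one predicted token at a time."""
    rng = random.Random(seed)
    sequence = list(prompt)
    for _ in range(n_new):
        sequence.append(toy_next_token(sequence, rng, modality))
    return sequence
```

For example, `generate(to_unified("text", [5, 42]), 8)` returns the two prompt tokens followed by eight ids that all fall in the video range. In a real system those ids would then be passed to the video tokenizer's decoder to produce frames.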

What are the benefits of using VideoPoet?

VideoPoet produces high-quality, temporally consistent video and covers text-to-video, image-to-video, editing, stylization, inpainting, and audio generation in a single model. That breadth makes it a versatile tool for different applications.

Can VideoPoet generate long videos?

Yes. VideoPoet can extend a video by conditioning on the end of the existing clip and predicting the frames that follow, then repeating the process. Objects keep their identity across these extensions, and the result stays temporally consistent.
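
One way to picture sequential extension is a loop that repeatedly conditions on the tail of the sequence generated so far. This is a minimal sketch under stated assumptions: `extend_video`, `toy_predict_next`, and the `context_len` parameter are hypothetical names, tokens are plain integers, and the predictor is a trivial stand-in for the model.

```python
def toy_predict_next(context):
    """Stand-in for the model: returns one new token per context token."""
    return [token + 1 for token in context]


def extend_video(clip, predict_next, n_rounds, context_len):
    """Grow a token sequence by repeatedly conditioning on its tail."""
    extended = list(clip)
    for _ in range(n_rounds):
        # Only the most recent tokens are visible to the predictor.
        context = extended[-context_len:]
        extended.extend(predict_next(context))
    return extended
```

Because each round sees only the last `context_len` tokens, the cost per round stays fixed no matter how long the video becomes. The trade-off is that consistency with earlier content depends on what that tail carries forward.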

Where can I see examples of VideoPoet's capabilities?

Visit the VideoPoet website to see various examples of text-to-video, image-to-video, video editing, stylization, and more.

For more information and resources, check out the Research Blog and the VideoPoet Paper.
