marks a turning point in the development of creative technology: we are introducing a range of generative media models and tools designed to change how musicians, filmmakers, artists, and other creators realize their ideas. Building on important advances in artificial intelligence, these tools let anyone create striking images, engaging films, and compelling music.
At the center of this announcement are our newest video and image generation models, Veo 3 and Imagen 4, which are designed to deliver higher quality and new capabilities. We’re also making Lyria 2 more widely available, giving musicians access to more advanced composition and exploration capabilities. Finally, we’re introducing Flow, an easy-to-use AI filmmaking tool built on our most advanced models.
Our journey has been deeply collaborative, shaped in partnership with the creative industries, including filmmakers, musicians, artists, and YouTube creators. Their feedback has been instrumental in shaping these models and products responsibly, so that they help creators harness the full potential of AI in their art.
Veo 3: Where Video Meets Audio in Unprecedented Ways
The most significant leap in video generation is here with Veo 3, our new state-of-the-art model. Beyond dramatically improving on the visual quality of its predecessor, Veo 3 introduces a revolutionary capability: the generation of videos with integrated, contextually relevant audio. Imagine a bustling city street clip complete with the authentic sounds of traffic, a serene park scene echoing with birdsong, or even characters engaging in natural, lip-synced dialogue – Veo 3 makes this a reality.
Veo 3 excels across the board, from understanding nuanced text and image prompts to accurately simulating real-world physics and achieving precise lip-syncing. It is also strong at grasping narrative: tell a short story in your prompt, and Veo 3 will craft a clip that brings it to life. Veo 3 is available today to Google AI Ultra subscribers in the United States through the Gemini app and within Flow, and to enterprise users on Vertex AI.
Veo 2 Updates: Precision Control, Built With and For Filmmakers
As Veo 3 pushes new boundaries, we’re also enhancing our popular Veo 2 model with new capabilities directly informed by our ongoing collaboration with creators and filmmakers. We’re excited to launch several of these highly anticipated features:
State-of-the-Art Reference-Powered Video: Gain unprecedented creative control and consistency. Provide Veo with reference images of characters, scenes, objects, or even specific artistic styles, and watch as your video maintains visual fidelity to your vision.
Precision Camera Controls: Define exact camera movements, including rotations, dollies, and zooms, to achieve the perfect shot and cinematic flow.
Seamless Outpainting: Broaden your video’s frame, for example from portrait to landscape, while Veo intelligently extends the scene to fit any screen size.
Intelligent Object Add and Remove: Easily add or erase objects from your videos. Veo understands scale, interactions, and shadows, ensuring the modified scene looks natural and realistic.
Reference-powered video and camera controls are available now within Flow. We will also bring all of these new capabilities to the Vertex AI API in the coming weeks and expand access to more products over the next few months.
Flow: The AI Filmmaking Command Center Designed for Veo
The centerpiece of this new creative ecosystem is Flow, an intuitive AI filmmaking tool built with and for creatives. Flow allows you to seamlessly create cinematic clips, intricate scenes, and compelling stories by intelligently weaving together Google DeepMind’s most advanced models: Veo, Imagen, and Gemini.
Using natural language, you can describe your desired shots to Flow and manage all the “ingredients” for your story (your cast, locations, objects, and visual styles) in a single, convenient place. Flow then takes these elements and transforms your narrative into rendered scenes.
Flow is available today for Google AI Pro and Ultra plan subscribers in the U.S., with plans to expand to more countries soon.
Imagen 4: Visuals Redefined, Typography Mastered
Setting a new benchmark for visual fidelity, our latest Imagen 4 model combines speed with precision. It renders fine details such as intricate fabrics, delicate water droplets, and realistic animal fur with exceptional clarity, and it excels in both photorealistic and abstract styles.
Imagen 4 can create images in a wide range of aspect ratios and at impressive resolutions of up to 2K, making them ideal for high-quality printing or professional presentations. Crucially, it’s also significantly better at spelling and typography, simplifying the creation of custom greeting cards, posters, and even intricate comic book panels.
Imagen 4 is available today in the Gemini app, Whisk, and Vertex AI, and is integrated across Workspace applications such as Slides, Vids, and Docs. Soon, we’ll also launch a fast variant of Imagen 4 that’s up to 10 times faster than Imagen 3, allowing for even quicker exploration of creative ideas.
Lyria 2: Composing the Future of Music
For the world of sound and music, we continue to advance Lyria 2, a model designed for powerful composition and endless exploration. In April, we expanded access to the Music AI Sandbox, powered by Lyria 2, offering musicians, producers, and songwriters a suite of experimental tools that spark new creative possibilities and help artists explore unique musical ideas. The expertise and feedback of the music industry remain crucial to ensuring that our tools genuinely empower creators to realize the possibilities of AI in their art.
Lyria 2 is now available for creators through YouTube Shorts and for enterprises in Vertex AI. Additionally, we’ve made Lyria RealTime, our interactive music generation model, which powers MusicFX DJ, available via an API and in AI Studio. Lyria RealTime allows anyone to interactively create, control, and perform generative music in real time, opening up new avenues for musical expression.
Responsible Creation and Deep Collaboration with the Creative Community
At Google, innovation goes hand-in-hand with responsibility. Since its launch in 2023, SynthID has watermarked over 10 billion images, videos, audio files, and texts, serving as a critical tool to identify AI-generated content and mitigate the risks of misinformation and misattribution. We are unwavering in our commitment to transparency and ethical AI. As such, all outputs generated by Veo 3, Imagen 4, and Lyria 2 will continue to be protected with SynthID watermarks.
Today, we’re further bolstering this commitment by launching SynthID Detector, a new verification portal designed to help people identify AI-generated content. Simply upload a piece of content, and SynthID Detector will analyze it to determine whether the entire file, or a specific part of it, contains a SynthID watermark.
With all our generative AI models, our core aim is to unleash human creativity and enable artists and creators to bring their ideas to life faster and more easily than ever before. This new suite of tools represents our dedication to building a future where technology amplifies human ingenuity, opening up limitless possibilities for artistic expression and storytelling.
We invite creators, artists, and storytellers to explore these new frontiers and experience the transformative power of Veo 3, Imagen 4, Flow, and Lyria 2. The next chapter of creative media generation has begun.