Veo 3.1
Google DeepMind's full-quality video model with multi-image reference support. Supply multiple reference images to guide subject, lighting, and environment in a single generation.
What is Veo 3.1?
Veo 3.1 is Google DeepMind's flagship video model in the Veo 3 line. Unlike its speed-optimised sibling, Veo 3.1 Fast, the standard Veo 3.1 is engineered for maximum output quality and creative control, particularly through multi-image reference support that lets you supply several images in a single generation request.
Multi-image references keep subject, environment, and lighting consistent within a clip. You can provide a product shot, a brand environment image, and a lighting reference at the same time. The model synthesises them into one coherent, cinematic video clip, so you spend less time on post-production corrections and more on creative direction.
Veo 3.1 also supports start and end frames plus optional audio generation, making it a complete solution for teams that need both visual precision and accompanying sound without switching to a separate tool.
Multi-image reference
Guide subject, scene, lighting together
Up to 1080p
720p default, 1080p available
Start + End frame
Anchor beginning and end visually
Optional audio
AI audio in the same generation pass
How to generate video with Veo 3.1 on project.video
Open the composer
Go to your dashboard and open the generation composer. All models are available in a single workspace, so you can switch between them without re-entering your prompt.
Select Veo 3.1
Click the model selector and choose Veo 3.1. If you were on Veo 3.1 Fast, your prompt transfers automatically for easy comparison.
Upload reference images
Drag in multiple reference images: start frame, end frame, subject references, environment shots. Veo 3.1 synthesises them into one coherent output.
Set specs and generate
Choose an aspect ratio (16:9 or 9:16), a resolution (720p or 1080p), and a duration (4, 6, or 8 seconds), toggle audio on or off, then generate. Results appear in your gallery.
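The choices in the steps above can be sketched as a small settings check. This is an illustration only: this page does not describe a project.video API, so the `GenerationSettings` class, its field names, and the reference role labels are all hypothetical. Only the allowed values (16:9 or 9:16, 720p or 1080p, 4/6/8 seconds, optional audio) come from this page.

```python
from dataclasses import dataclass, field

# Values listed on this page.
ASPECT_RATIOS = {"16:9", "9:16"}
RESOLUTIONS = {"720p", "1080p"}
DURATIONS_S = {4, 6, 8}

# Hypothetical role labels for uploaded images; not an official list.
REFERENCE_ROLES = {"start_frame", "end_frame", "subject", "environment", "lighting"}


@dataclass
class GenerationSettings:
    """Hypothetical description of one Veo 3.1 generation request."""

    prompt: str
    aspect_ratio: str
    duration_s: int
    resolution: str = "720p"  # the page lists 720p as the default
    audio: bool = False       # audio is optional; "off by default" is an assumption
    # Each reference is a (role, file_path) pair.
    references: list = field(default_factory=list)

    def validate(self) -> list:
        """Return a list of problems; an empty list means the settings look consistent."""
        problems = []
        if not self.prompt.strip():
            problems.append("prompt is empty")
        if self.aspect_ratio not in ASPECT_RATIOS:
            problems.append(f"unsupported aspect ratio: {self.aspect_ratio}")
        if self.resolution not in RESOLUTIONS:
            problems.append(f"unsupported resolution: {self.resolution}")
        if self.duration_s not in DURATIONS_S:
            problems.append(f"unsupported duration: {self.duration_s}s")
        roles = [role for role, _path in self.references]
        for role in roles:
            if role not in REFERENCE_ROLES:
                problems.append(f"unknown reference role: {role}")
        # Assumption: a clip has at most one start frame and one end frame.
        for anchor in ("start_frame", "end_frame"):
            if roles.count(anchor) > 1:
                problems.append(f"more than one {anchor}")
        return problems


settings = GenerationSettings(
    prompt="Luxury skincare serum on a marble surface, golden hour light",
    aspect_ratio="16:9",
    duration_s=8,
    resolution="1080p",
    audio=True,
    references=[("subject", "serum.png"), ("lighting", "golden_hour.jpg")],
)
print(settings.validate())  # prints [] because every value is in the allowed sets
```

Writing the settings down this way before opening the composer makes it easy to keep the same configuration when comparing Veo 3.1 against Veo 3.1 Fast.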
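Technical specs
Resolution
720p default, 1080p available
Aspect ratio
16:9 or 9:16
Duration
4, 6, or 8 seconds
Audio
Optional, generated in the same pass
Inputs
Text prompt, multiple reference images, start and end frames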
Best use cases
Brand-consistent product video
Supply multiple reference images — product, environment, lighting — in a single generation request. Veo 3.1 synthesises a coherent clip that matches your brand identity without manual compositing.
Complex character or subject shots
When you need a specific person, object, or environment to look a particular way, multi-image reference lets several angles and contextual images guide the final output, giving the model more to work from than a single reference.
Premium ad creative
For final-quality ad deliverables, Veo 3.1 offers the highest output quality in the Veo 3 line. When you need to move beyond prototyping to production-ready assets, Veo 3.1 is the right choice.
Reference-directed transitions
Combine multi-image reference with start and end frames for precisely directed transitions. Anchor the visual context at both ends and let Veo 3.1 fill in the motion with full quality.
Example prompts
Copy any of these to get started on project.video.
"Luxury skincare serum on a marble surface, golden hour light from the left, close-up pull-back reveal, mist particles in the air, cinematic depth of field, 16:9"
"A chef's hands plating a gourmet dish in a high-end restaurant kitchen, ambient overhead lighting, smoke wisps, tight mid-shot to wide pull-out, 16:9"
"Fashion model walks through an art gallery corridor, backlit glass panels, slow-motion, hair and fabric movement, vertical social format, 9:16"
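The example prompts above loosely share a shape: a subject or scene, a lighting note, a camera move, a few atmospheric details, and the aspect ratio at the end. The sketch below composes a prompt from those parts. The `build_prompt` helper and its parameter names are hypothetical, and this breakdown is one reading of the examples, not a documented prompt format.

```python
def build_prompt(subject, lighting, camera, details=(), aspect_ratio="16:9"):
    """Join prompt parts into one comma-separated string, skipping empty parts."""
    parts = [subject, lighting, camera, *details, aspect_ratio]
    cleaned = [part.strip() for part in parts if part and part.strip()]
    return ", ".join(cleaned)


prompt = build_prompt(
    subject="Luxury skincare serum on a marble surface",
    lighting="golden hour light from the left",
    camera="close-up pull-back reveal",
    details=["mist particles in the air", "cinematic depth of field"],
    aspect_ratio="16:9",
)
print(prompt)
```

With these inputs the helper reproduces the first example prompt, which makes it easy to vary one element at a time, such as the lighting or the camera move, while keeping the rest fixed.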