Veo 2

by Google

Veo 2 is a text-to-video and image-to-video model that generates high-quality eight-second video clips with extensive camera controls and cinematic understanding. It interprets filmmaking language, so users can specify lens types, depth-of-field effects, genres, and camera movements such as zoom, pan, and dolly. The model shows improved understanding of real-world physics and human movement, which reduces the hallucinations common in earlier video generators, such as extra fingers or unexpected objects. It handles both simple and complex instructions and captures diverse visual and cinematic styles with temporal consistency. In text-to-video mode it turns detailed descriptions into dynamic scenes; in image-to-video mode it animates static images, with optional text guidance for style and motion. Output resolution reaches up to 4K. The model also offers person generation controls, and its outputs carry invisible SynthID watermarks for transparency.
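Because the model responds to filmmaking language, it can help to assemble prompts from explicit cinematic fields rather than free-form text. The snippet below is a minimal sketch of that idea only: `ShotSpec` and `build_prompt` are hypothetical names invented for illustration, the phrasing of each field is an assumption, and nothing here calls Veo 2 or any real API.

```python
from dataclasses import dataclass


@dataclass
class ShotSpec:
    """Hypothetical container for the cinematic controls described above."""
    subject: str
    genre: str = ""
    lens: str = ""
    depth_of_field: str = ""
    camera_movement: str = ""


def build_prompt(spec: ShotSpec) -> str:
    """Join the non-empty fields into one comma-separated prompt string."""
    parts = [
        spec.subject,
        f"{spec.genre} style" if spec.genre else "",
        f"shot on a {spec.lens} lens" if spec.lens else "",
        f"{spec.depth_of_field} depth of field" if spec.depth_of_field else "",
        f"camera: {spec.camera_movement}" if spec.camera_movement else "",
    ]
    # Drop empty fields so optional controls can simply be omitted.
    return ", ".join(p for p in parts if p)


prompt = build_prompt(ShotSpec(
    subject="a lighthouse at dusk",
    genre="film noir",
    lens="35mm",
    depth_of_field="shallow",
    camera_movement="slow dolly in",
))
print(prompt)
```

The resulting string would then be passed as the text prompt to whichever client is used to reach the model; for image-to-video, the same text could serve as the optional style and motion guidance.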

Key info

Input
Output
Features
Context window: 480
Max output: 8K

Available routes

No routes currently available — Veo 2 isn't routed through the Opper gateway right now. It may return.


