Gemini Omni is Google’s new multimodal video model, unveiled at Google I/O 2026 on May 19, 2026. The first version, Gemini Omni Flash, started rolling out the same day to subscribers worldwide through the Gemini app and Google Flow.
The headline is bigger than a single product launch. Google folded its standalone Veo video pipeline into the core Gemini system, then put a frontier-grade video model in front of every Google AI Plus, Pro, and Ultra subscriber on day one. YouTube Shorts and YouTube Create users get the same model at no extra cost.
Most early coverage focuses on the clips, and the clips are good. The more important shift is positioning: Google is calling Omni a “world model,” not a text-to-video tool.
This article breaks down:
- What that means in practice
- How Omni compares to Sora and other video models
- How to access it right now
- What it signals for the next twelve months of AI media
What Is Gemini Omni?
Gemini Omni is a family of multimodal AI models from Google DeepMind, built to generate and edit video from any combination of:
- Text
- Image
- Audio
- Video inputs
The first model in the family, Gemini Omni Flash, started shipping on May 19, 2026.
Google positions Omni as more than a text-to-video generator. The model is designed to understand:
- Physics
- Cause and effect
- Cultural context
- Historical accuracy
It then produces video that respects those rules.
What Is a “World Model”?
A world model is an AI system that simulates how the physical world behaves and predicts what happens next based on a user’s actions.
For Gemini Omni, that means generating clips where gravity, fluid dynamics, and object permanence stay consistent across the scene, not just visually plausible at first glance.
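To make the definition concrete, here is a toy sketch of the world-model idea in Python: a function that takes the current state of a scene plus a user action and predicts the next state under a fixed rule (gravity). It is purely illustrative and says nothing about how Omni works internally. The `BallState` type, the `step` and `rollout` functions, and the 0.1-second time step are all assumptions made up for this example.

```python
from dataclasses import dataclass

GRAVITY = -9.81  # m/s^2, applied on every step


@dataclass(frozen=True)
class BallState:
    """Hypothetical scene state: one ball's height and vertical velocity."""
    height: float    # metres above the ground
    velocity: float  # metres per second, positive is up


def step(state: BallState, push: float = 0.0, dt: float = 0.1) -> BallState:
    """Predict the next state from the current state and a user action.

    `push` is an instantaneous change in upward velocity (the "action").
    """
    velocity = state.velocity + push + GRAVITY * dt
    height = state.height + velocity * dt
    if height <= 0.0:
        # The ball comes to rest on the ground rather than falling through it.
        return BallState(height=0.0, velocity=0.0)
    return BallState(height=height, velocity=velocity)


def rollout(state: BallState, actions: list[float], dt: float = 0.1) -> list[BallState]:
    """Apply a sequence of actions and return every predicted state."""
    states = [state]
    for push in actions:
        state = step(state, push, dt)
        states.append(state)
    return states
```

Calling `rollout(BallState(1.0, 0.0), [0.0] * 20)` drops the ball from one metre and leaves it resting on the ground. A video world model is meant to do the same kind of prediction, but over full scenes and with learned rules in place of a hand-written one.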
What Gemini Omni Can Do
The shortest description:
Any input in, video out, edited by talking to it.
Core Capabilities
- Generate video from text, images, audio, video, or any combination of these inputs in a single prompt
- Edit existing footage conversationally, including changes to:
  - Characters
  - Backgrounds
  - Lighting
  - Camera angles
  - On-screen action
- Maintain visual consistency for the same characters, objects, and backgrounds across edits
- Apply physics-aware behavior to scenes, covering:
  - Gravity
  - Kinetic energy
  - Fluid dynamics
- Anchor outputs in Gemini’s broader knowledge of:
  - History
  - Science
  - Culture
- Power new tooling inside Google Flow, including a Flow Agent that brainstorms scenes and batch-edits projects
- Run inside:
  - The Gemini app
  - Google Flow
  - YouTube Shorts
  - The YouTube Create app
A separate avatar mode that lets users build a likeness from a handful of photos was demonstrated on stage but held back at launch over misuse concerns.
Gemini Omni vs. Sora and Other Video AI Models
Gemini Omni’s clearest differentiation is the world-model framing.
Where Sora and Runway focus on prompt-to-clip fidelity, Omni leans on Gemini 3.5’s reasoning to interpret instructions and keep scenes internally consistent.
| Feature | Gemini Omni Flash | OpenAI Sora | Runway Gen-4 | ByteDance Seedance 2.0 |
|---|---|---|---|---|
| Max clip length at launch | 10 seconds | 60 seconds | ~10 seconds | ~10 seconds |
| Inputs supported | Text, image, audio, video | Text, image | Text, image | Text, image |
| Conversational editing | Yes, in chat | Limited | Limited | No |
| Distribution surface | Gemini app, Flow, YouTube | Sora app, ChatGPT | Runway web | Doubao app |
| Physics reasoning | Yes, world-model design | Spatiotemporal patches | Implicit | Implicit |
| Free public tier | YouTube Shorts, YouTube Create | None | Limited | Limited |
Google has not published:
- Per-clip cost
- Compute footprint
- Head-to-head benchmarks against Sora
Independent tests of Omni’s predecessor pipeline, Veo 3.1, had it trailing Seedance 2.0 on raw output quality. Whether Omni closes that gap is not yet known, so real-world rankings will likely shift as independent tests arrive in the coming weeks.
How to Access Gemini Omni Today
Gemini Omni Flash is live as of May 19, 2026.
Access depends on which Google plan you have.
1. Google AI Ultra Subscribers
- $100 per month after Google’s I/O 2026 price drop
- Full access through the Gemini app and Google Flow
- Highest usage limits
2. Google AI Pro Subscribers
- Full access through the Gemini app and Google Flow
- Raised but capped limits
3. Google AI Plus Subscribers
- Access through the Gemini app and Google Flow
- Standard usage tiers
4. YouTube Shorts and YouTube Create Users
- Free access to Omni’s generation and editing features
- No Gemini subscription required
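The four access routes above can be summarized as a small lookup table. The sketch below encodes only what this article states. The dictionary layout, the field names, and the two helper functions are hypothetical and are not part of any Google product or API.

```python
# Access to Gemini Omni Flash by plan, transcribed from the list above.
# Keys and field names are illustrative, not an official Google schema.
ACCESS = {
    "Google AI Ultra": {
        "surfaces": ["Gemini app", "Google Flow"],
        "limits": "highest",
        "needs_subscription": True,
    },
    "Google AI Pro": {
        "surfaces": ["Gemini app", "Google Flow"],
        "limits": "raised but capped",
        "needs_subscription": True,
    },
    "Google AI Plus": {
        "surfaces": ["Gemini app", "Google Flow"],
        "limits": "standard",
        "needs_subscription": True,
    },
    "YouTube Shorts / YouTube Create": {
        "surfaces": ["YouTube Shorts", "YouTube Create"],
        "limits": "not stated",  # the article gives no limits for this route
        "needs_subscription": False,
    },
}


def free_options(access: dict) -> list[str]:
    """Return the routes that do not require a Gemini subscription."""
    return [plan for plan, info in access.items() if not info["needs_subscription"]]


def plans_with_surface(access: dict, surface: str) -> list[str]:
    """Return the plans that include a given surface, in table order."""
    return [plan for plan, info in access.items() if surface in info["surfaces"]]
```

For example, `plans_with_surface(ACCESS, "Google Flow")` lists the three paid plans, while `free_options(ACCESS)` returns only the YouTube route.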
API access for developers and enterprise customers is set to follow in the coming weeks, according to Google.
Pricing for the API tier has not been published.
Heads Up on the Avatar Feature
Google demonstrated a personal-avatar capability on stage, then chose not to ship it at launch.
The company cited misuse risk and said access would be reviewed before any future rollout. Any tool claiming to provide Omni avatar generation today is therefore not running on official Google infrastructure.
Why Gemini Omni Matters for Creators and Developers
The launch sits inside a wider I/O 2026 message:
Google wants Gemini to move from chat to agency.
Gemini 3.5 Flash, Gemini Spark, and Antigravity 2.0 were all announced the same day, each pushing toward AI that plans, acts, and creates with minimal hand-holding.
For Creators
The practical change is the editing loop.
Conversational edits inside the Gemini app cut a step out of a workflow that used to require pulling clips into a timeline tool.
Pair that with free distribution through YouTube Shorts, and the floor for short-form video production drops noticeably.
For Developers
The bigger question is the API.
Until access opens up in the coming weeks, production builds will keep running on Veo 3.1, Sora, or Seedance, with Omni as a planned upgrade rather than a deployable one today.
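One way to treat Omni as a planned upgrade is to route video generation through a small selection layer, so that switching later is a configuration change and not a rewrite. The sketch below is hypothetical throughout: the backend labels are placeholders, not real model identifiers, and the `Backend` type, the `available` flag, and `pick_backend` do not correspond to any real SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Hypothetical description of one video-generation backend."""
    name: str        # placeholder label, not an official model ID
    available: bool  # whether an API can be called today


# Preference order: Omni first once its API opens, then the interim options.
BACKENDS = [
    Backend("gemini-omni-flash", available=False),  # no API at launch
    Backend("veo-3.1", available=True),
    Backend("sora", available=True),
    Backend("seedance", available=True),
]


def pick_backend(backends: list[Backend]) -> Backend:
    """Return the first available backend in preference order."""
    for backend in backends:
        if backend.available:
            return backend
    raise RuntimeError("no video backend is currently available")
```

With the flags as written, `pick_backend(BACKENDS)` returns the Veo 3.1 entry. Flipping the first entry to `available=True` once API access opens would make Omni the default without touching calling code.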
If you want to see the model handle your own footage before the API arrives, the free tier inside YouTube Shorts is the fastest way to try Gemini Omni Flash directly.
Frequently Asked Questions
When will Gemini Omni Pro be released?
Google confirmed a higher-tier Gemini Omni Pro is planned but has not given a release date.
Product management director Nicole Brichtova said it will ship when the team sees a step change above Flash.
Expect updates over the following quarters rather than within weeks.
Does Gemini Omni replace Veo 3?
Yes, in practice.
Google moved its generative video effort out of the standalone Veo product line and into the core Gemini system with Omni.
Veo 3.1 still powers parts of Flow today, but new development is centered on the Omni family.
Can Gemini Omni generate audio along with video?
Google has stated Omni accepts audio as an input alongside text, image, and video.
Native audio generation paired with video output has been demonstrated but has not been confirmed as a launch feature on the Flash tier.
Expect richer audio-video pairing on higher tiers and in the API release.
Is there an API for Gemini Omni yet?
Not at launch.
Google said developer and enterprise API access will roll out in the coming weeks after the May 19, 2026 announcement, with no firm date and no published pricing.
Builders looking to ship in the meantime can:
- Keep using Veo 3.1 via Vertex AI
- Evaluate Sora and Seedance as interim options