Models

All models

Every model listed below is called through the unified asynchronous endpoint POST /v1/videos. Open a model's page to see its supported sizes, durations, reference-image rules, and billing.

General image generation (text-to-image and image-to-image), 27 fixed sizes.

Any-size image generation; the requested size automatically snaps to the nearest ratio.

nano-banana family, any-size image generation.

Veo 3.1 text-to-video, 8 seconds.

Veo 3.1 image-to-video (first/last frame).

Veo 3.1 image-to-video (reference images).

Sora 2 text/image-to-video, 12 seconds.

Grok Imagine text/image-to-video, discrete durations.

Grok Imagine 1.5, image-to-video only, continuous duration.

Omni video (8s): text · image · first/last frame.

Omni video (10s): text · image · first/last frame.

Omni video editing, billed per resolution and per call.
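Since every model above shares the POST /v1/videos endpoint, a request can be sketched once and reused across models. The snippet below is a minimal illustration only: the host `api.example.com`, the `model` and `prompt` body fields, the bearer-token header, and the helper name `build_video_request` are all assumptions made for the example and are not taken from this page. Check each model's page for the actual parameters. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_video_request(model, prompt, api_key):
    """Build (but do not send) a POST /v1/videos request.

    Hypothetical helper: the body fields and auth scheme are illustrative.
    """
    # Assumption: "model" and "prompt" are example field names only.
    payload = {"model": model, "prompt": prompt}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/videos",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_video_request("example-model", "a red kite over a beach", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would submit the job. Because the endpoint is asynchronous, the response is presumably a task handle rather than the finished output; how to retrieve the result is not covered here.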