Grok Imagine Video

Per second: $0.04
Generate videos from text prompts, animate still images, or edit existing videos with natural language. The API supports configurable duration, aspect ratio, and resolution for generated videos — with the SDK handling the asynchronous polling automatically.
New
Commercial Use

📘 Technical Specifications of Grok Imagine Video

| Specification | Details |
| --- | --- |
| Model ID | grok-imagine-video |
| Provider | xAI |
| Type | Video generation & editing AI |
| Input Types | Text prompts (natural language); optional image input (image→video); optional video_url for editing existing clips. The maximum input video duration for editing differs by endpoint (about 8.7 s is reported for some editing flows). |
| Output Types | .mp4 video via temporary URL |
| Duration Range (generate) | 1–15 seconds |
| Resolution | 480p, 720p (configurable) |
| Aspect Ratios | 1:1, 16:9, 9:16 |
| Edit Support | Yes; animates and modifies videos up to 8.7 s |
| Moderation | Content moderation included |
| Pricing | Charged per second, varies by resolution |
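
The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the `VideoParams` class and its method names are hypothetical, and the cost estimate assumes the flat $0.04 per second headline price, even though the actual rate varies by resolution.

```python
from dataclasses import dataclass

# Limits copied from the specification table.
MIN_DURATION_S = 1
MAX_DURATION_S = 15
RESOLUTIONS = {"480p", "720p"}
ASPECT_RATIOS = {"1:1", "16:9", "9:16"}

# ASSUMPTION: flat headline price; the real rate varies by resolution.
ASSUMED_PRICE_PER_SECOND_USD = 0.04


@dataclass(frozen=True)
class VideoParams:
    """Hypothetical container for generation settings (not an SDK type)."""

    duration: int
    resolution: str
    aspect_ratio: str

    def validate(self) -> None:
        """Raise ValueError if any setting falls outside the documented limits."""
        if not MIN_DURATION_S <= self.duration <= MAX_DURATION_S:
            raise ValueError(
                f"duration must be {MIN_DURATION_S}-{MAX_DURATION_S} s, got {self.duration}"
            )
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")

    def estimated_cost_usd(self) -> float:
        """Rough cost estimate under the flat-rate assumption above."""
        return round(self.duration * ASSUMED_PRICE_PER_SECOND_USD, 4)


if __name__ == "__main__":
    params = VideoParams(duration=10, resolution="720p", aspect_ratio="16:9")
    params.validate()
    print(f"Estimated cost: ${params.estimated_cost_usd():.2f}")
```

Validating locally avoids paying for a round trip that the API would reject anyway; treat the printed cost as an estimate and confirm the per-resolution rate on the pricing page.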

🚀 What is Grok Imagine Video?

Grok Imagine Video is xAI’s advanced video generation and editing AI model exposed through CometAPI. It lets developers generate short, custom videos from natural language prompts and optionally animate still images or edit existing clips. The model supports configurable output length, resolution, and aspect ratio, with built-in content moderation to ensure policy compliance.

🧠 Main features (what differentiates Grok Imagine)

  • Native audio + lip-sync: Generates synchronized ambient audio, effects, and short speech / narration with approximate lip synchronization.
  • Image→Video / prompt editing: Animate a still or edit existing footage via text prompts (remove/replace objects, retime, restyle).
  • Fast iteration & low latency: Designed for quick feedback loops suitable for creative workflows and product prototyping.
  • Production API: Imagine API exposes programmatic endpoints for batch generation, integration into editing pipelines, and enterprise controls.
  • Multiple “modes” / styles: User-facing modes (reported examples: Normal / Fun / Spicy or similar presets) to bias outputs for style or permissiveness (note: “Spicy” mode historically enabled NSFW).

| Model (company) | Max res (public) | Max clip len (public) | Native audio? | Strengths | Caveats |
| --- | --- | --- | --- | --- | --- |
| Grok Imagine (xAI) | 720p | 6–15s | Yes | Fast iteration, strong cost/latency, integrated editing, native audio | 720p cap; moderation concerns; varying real-world fidelity |
| Sora (OpenAI) | 720p–1080p (depends on tier) | short (6–15s) | Yes | High visual fidelity; strong integration with OpenAI stack | More expensive; constrained moderation/controls |
| Veo (Google DeepMind) | Up to 1080p+ | short (varies) | Yes | Strong photorealism, stable motion | Higher cost; less public experimentation |
| Runway Gen-4.5 | 1080p+ | short (varies) | Yes | Industry adoption for creative workflows, high fidelity | Costlier; focused on creative tooling |
| Vidu / Kling / Pika (various specialists) | up to 1080p | short (varies) | Mixed | Some offer niche features (Smart Cuts, multi-shot chaining) | Varied audio support; differing API maturity |

⚠️ Limitations

  • Maximum video length is capped at 15 seconds.
  • Editing retains input video length (≤ 8.7s).
  • Generated URLs are ephemeral — download promptly.
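
Because the result URL expires, it is reasonable to copy the file to local storage as soon as the task finishes. The sketch below uses only the Python standard library; `download_video` is a hypothetical helper name, and the URL in the usage example is a placeholder rather than a real result link.

```python
import shutil
import urllib.request
from pathlib import Path


def download_video(url: str, dest: Path, timeout: float = 60.0) -> Path:
    """Stream the file at `url` to `dest` and return the destination path."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so the whole clip is never held in memory at once.
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


if __name__ == "__main__":
    # Placeholder URL: substitute the temporary URL returned by the API.
    saved = download_video("https://example.com/result.mp4", Path("videos/clip.mp4"))
    print(f"Saved to {saved}")
```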

How to access and integrate Grok Imagine Video

Step 1: Sign Up for API Key

Log in to cometapi.com; if you do not have an account yet, register first. In the CometAPI console, open the personal center, click “Add Token” under API token, and submit to create a token key of the form sk-xxxxx. This key is the access credential for the API.

Step 2: Send Requests to Grok Imagine Video API

Select the “grok-imagine-video” endpoint and set the request body. The request method and the request body format are described in the API doc on our website, which also provides an Apifox test page for your convenience. Replace <YOUR_API_KEY> with the actual CometAPI key from your account. Where to call it: GROK Video Generation and Video Edit.
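
As an illustration of the shape such a request might take, the sketch below builds (but does not send) a POST request with the standard library. Only the model ID comes from this page. The endpoint URL is a placeholder, and the field names `prompt`, `duration`, `aspect_ratio`, and `resolution`, as well as the Bearer authorization scheme, are assumptions; take the real values from the API doc.

```python
import json
import urllib.request

# ASSUMPTION: placeholder URL; copy the real endpoint from the CometAPI API doc.
ASSUMED_ENDPOINT = "https://api.cometapi.example/v1/videos/generations"


def build_generation_request(
    api_key: str,
    prompt: str,
    duration: int = 6,
    aspect_ratio: str = "16:9",
    resolution: str = "720p",
    endpoint: str = ASSUMED_ENDPOINT,
) -> urllib.request.Request:
    """Build a POST request for a text-to-video job without sending it."""
    payload = {
        "model": "grok-imagine-video",  # model ID from the specification table
        # ASSUMPTION: the field names below are illustrative, not confirmed.
        "prompt": prompt,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # ASSUMPTION: Bearer scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_generation_request("<YOUR_API_KEY>", "A paper boat drifting down a stream")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the body against the API doc before spending credits; pass the result to `urllib.request.urlopen` once the endpoint and fields are confirmed.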

Step 3: Provide a Prompt and Optional Source Image

Enter a text prompt and, optionally, provide a source image to animate. The Grok Imagine API analyzes your input and prepares the output, which is delivered as a temporary URL. Both text-to-video and image-to-video conversion are supported.

The source image can be provided as:

  • A public URL pointing to an image
  • A base64-encoded data URI (e.g., data:image/jpeg;base64,<YOUR_BASE64_IMAGE>)
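
For the second option, a local file can be turned into a data URI with the standard library. This is a minimal sketch; `image_to_data_uri` is a hypothetical helper name, and the MIME type is guessed from the file extension rather than from the file contents.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_uri(path: Path) -> str:
    """Encode a local image file as a base64 data URI."""
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {path.name}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


if __name__ == "__main__":
    # Placeholder path: point this at a real image file.
    uri = image_to_data_uri(Path("frame.jpg"))
    print(uri[:60] + "...")
```

Base64 inflates the payload by roughly a third, so a public URL is usually the lighter choice for large images.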

Step 4: Retrieve and Verify Results

Process the API response to get the generated video. The API returns a request_id immediately upon submission; use the GET endpoint with that ID to check the task status and retrieve the output once it is ready. Processing is asynchronous, so you may need to poll this endpoint several times until the task is complete. The result URL is temporary, so download the video promptly.
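
The polling step can be sketched as a small loop. The version below takes the status lookup as a function argument, so it makes no claim about the actual GET endpoint or response schema: `fetch_status` is a hypothetical callable you supply, and the status strings in `DONE_STATES` and `FAILED_STATES` are assumptions to be replaced with the values from the API doc.

```python
import time
from typing import Any, Callable, Mapping

# ASSUMPTION: illustrative status strings; use the values from the API doc.
DONE_STATES = {"done", "completed", "succeeded"}
FAILED_STATES = {"failed", "error", "expired"}


def wait_for_video(
    fetch_status: Callable[[str], Mapping[str, Any]],
    request_id: str,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, Any]:
    """Poll `fetch_status(request_id)` until the task finishes, fails, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_status(request_id)
        status = str(result.get("status", "")).lower()
        if status in DONE_STATES:
            return result
        if status in FAILED_STATES:
            raise RuntimeError(f"task {request_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {request_id} not finished after {timeout_s} s")
        sleep(interval_s)


if __name__ == "__main__":
    # Fake lookup that reports "pending" twice before finishing.
    responses = iter([{"status": "pending"}, {"status": "pending"}, {"status": "done"}])
    final = wait_for_video(lambda _id: next(responses), "demo-id", interval_s=0.1)
    print(final)
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access; in real use, `fetch_status` would issue the GET request and return the parsed JSON body.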
