Model family: Sora 2 (base) and Sora 2 Pro (high-quality variant).
Input modalities: text prompts, image reference, and short recorded cameo-video/audio for likeness.
Output modalities: encoded video (with audio) — parameters exposed through /v1/videos endpoints (model selection via model: "sora-2-pro"). API surface follows OpenAI’s videos endpoint family for create/retrieve/list/delete operations.
Training & architecture (public summary): OpenAI describes Sora 2 as trained on large-scale video data with post-training to improve world simulation; specifics (model size, exact datasets, and tokenization) are not publicly enumerated in line-by-line detail. Expect heavy compute, specialized video tokenizers/architectures and multi-modal alignment components.
API endpoints & workflow: show a job-based workflow: submit a POST creation request (model="sora-2-pro"), receive a job id or location, then poll or wait for completion and download the resulting file(s). Common parameters in published examples include prompt, seconds/duration, size/resolution, and input_reference for image-guided starts.
Typical parameters :
model: "sora-2-pro"prompt: natural language scene description, optionally with dialogue cuesseconds / duration: target clip length ( Pro supports the highest quality in available durations)size / resolution: community reports indicate Pro supports up to 1080p in many use cases.Content inputs: image files (JPEG/PNG/WEBP) can be supplied as a frame or reference; when used, the image should match the target resolution and act as a composition anchor.
Rendering behavior: Pro is tuned to prioritize frame-to-frame coherence and realistic physics; this typically implies longer compute time and higher cost per clip than non-Pro variants.
Qualitative strengths: OpenAI improved realism, physics consistency, and synchronized audio** versus prior video models. Other VBench results indicate Sora-2 and derivatives sit at or near the top of contemporary closed-source and temporal coherence.
Independent timing/throughput (example bench): Sora-2-Pro averaged ~2.1 minutes for 20-second 1080p clips in one comparison, while a competitor (Runway Gen-3 Alpha Turbo) was faster (~1.7 minutes) on the same task — tradeoffs are quality vs render latency and platform optimization.
Use cases:
When not to
| Model Name | Tags | Orientation | Resolution | Price |
|---|---|---|---|---|
| sora-2-pro | videos | Portrait | 720x1280 | $0.24 / sec |
| sora-2-pro | videos | Landscape | 1280x720 | $0.24 / sec |
| sora-2-pro | videos | Portrait (High Res) | 1024x1792 | $0.40 / sec |
| sora-2-pro | videos | Landscape (High Res) | 1792x1024 | $0.40 / sec |
| sora-2-pro-all | - | Universal / All | - | $0.80000 |
# Create a video with sora-2-pro
# Step 1: Submit the video generation request
echo "Submitting video generation request..."
response=$(curl -s https://api.cometapi.com/v1/videos \
-H "Authorization: Bearer $COMETAPI_KEY" \
-F "model=sora-2-pro" \
-F "prompt=A calico cat playing a piano on stage")
echo "Response: $response"
# Extract video_id from response (handle JSON with spaces like "id": "xxx")
video_id=$(echo "$response" | tr -d '
' | sed 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/')
echo "Video ID: $video_id"
# Step 2: Poll for progress until 100%
echo ""
echo "Checking video generation progress..."
while true; do
status_response=$(curl -s "https://api.cometapi.com/v1/videos/$video_id" \
-H "Authorization: Bearer $COMETAPI_KEY")
# Parse progress from "progress": "0%" format
progress=$(echo "$status_response" | grep -o '"progress":"[^"]*"' | head -1 | sed 's/"progress":"//;s/"$//')
# Parse status from the outer level
status=$(echo "$status_response" | grep -o '"status":"[^"]*"' | head -1 | sed 's/"status":"//;s/"$//')
echo "Progress: $progress, Status: $status"
if [ "$progress" = "100%" ]; then
echo "Video generation completed!"
break
fi
if [ "$status" = "FAILURE" ] || [ "$status" = "failed" ]; then
echo "Video generation failed!"
echo "$status_response"
exit 1
fi
sleep 10
done
# Step 3: Download the video to output directory
echo ""
echo "Downloading video to ./output/$video_id.mp4..."
mkdir -p ./output
curl -s "https://api.cometapi.com/v1/videos/$video_id/content" \
-H "Authorization: Bearer $COMETAPI_KEY" \
-o "./output/$video_id.mp4"
if [ -f "./output/$video_id.mp4" ]; then
echo "Video saved to ./output/$video_id.mp4"
ls -la "./output/$video_id.mp4"
else
echo "Failed to download video"
exit 1
fi