OpenAI Models

Start with GPT-5.6 Sol for complex reasoning and coding, choose GPT-5.6 Terra to balance intelligence and cost, or use GPT-5.6 Luna for cost-sensitive, high-volume workloads.

GPT Image 2

Input:$4/M

GPT Image 2 is openai state-of-the-art image generation model for fast, high-quality image generation and editing. It supports flexible image sizes and high-fidelity image inputs.

GPT 5.5 Pro

Input:$24/M

Output:$144/M

GPT-5.5 Pro combines state-of-the-art intelligence, precision, and efficiency to tackle sophisticated challenges. From software development and data analysis to research and decision support, it delivers expert-level assistance with speed and consistency.

GPT 5.5

Input:$4/M

Model 5.5 is a next-generation AI model designed for stronger reasoning, faster responses, and improved accuracy across a wide range of tasks. It excels at understanding complex instructions, generating high-quality content, and assisting with coding, analysis, and problem-solving.

GPT-5.4 nano

Input:$0.16/M

Output:$1/M

GPT-5.4 Nano is an ultra-lightweight AI model built for maximum speed and efficiency. It is optimized for simple tasks, real-time interactions, and large-scale deployments where low latency and minimal resource consumption are essential.

GPT-5.4 mini

Input:$0.6/M

Output:$3.6/M

GPT-5.4 Mini is a lightweight and efficient AI model optimized for speed and everyday productivity. It provides reliable conversational capabilities, content generation, and task assistance while maintaining low latency and resource usage.

GPT-5.4 pro

Context:1,050,000

Input:$24/M

Output:$144/M

GPT-5.4 Pro is a high-performance AI model designed for professional and business applications. It offers strong reasoning, reliable accuracy, and efficient execution across tasks such as content creation, coding, research, and data analysis.

GPT Image 2 ALL

Per Request:$0.04

GPT Image 2 ALL is a comprehensive image generation model designed to handle a wide range of creative and professional visual tasks. It combines high-quality image creation, advanced prompt understanding, and versatile style support to deliver exceptional results across diverse use cases.

GPT 5.5 ALL

Input:$2.4/M

Output:$14.4/M

GPT-5.5 excels in code writing, online research, data analysis, and cross-tool operations. The model not only improves its autonomy in handling complex multi-step tasks but also significantly improves reasoning capabilities and execution efficiency while maintaining the same latency as its predecessor, marking an important step towards automated office automation in AI.

Sora 2 Pro

Per Second:$0.24

Sora 2 Pro is our most advanced and powerful media generation model, capable of generating videos with synchronized Audio. It can create detailed, dynamic video clips from natural language or images.

Sora 2

Per Second:$0.08

Super powerful video generation model, with sound effects, supports chat format.

GPT-5.4

Context:1,050,000

Output:$12/M

GPT-5.4 is the frontier model for complex professional work. Reasoning.effort supports: none (default), low, medium, high and xhigh.

GPT-5.3 Chat

Input:$1.4/M

Output:$11.2/M

GPT-5.3 Instant model used in ChatGPT

gpt-audio-1.5

The best voice model for audio in, audio out with Chat Completions.

gpt-realtime-1.5

Context:32,000

Input:$3.2/M

Output:$12.8/M

The best voice model for audio in, audio out.

GPT 5.3 Codex

Input:$1.4/M

Output:$11.2/M

GPT-5.3-Codex is optimized for agentic coding tasks in Codex or similar environments. GPT-5.3-Codex supports low, medium, high, and xhigh reasoning effort settings.

GPT Image 1.5

Input:$6.4/M

Output:$25.6/M

GPT-Image-1.5 is OpenAI’s image model in the GPT Image family . It is a natively multimodal GPT model designed to generate images from text prompts and to perform high-fidelity edits of input images while following user instructions closely.

GPT-5.2 Pro

Input:$16.8/M

Output:$134.4/M

gpt-5.2-pro is the highest-capability, production-oriented member of OpenAI’s GPT-5.2 family, exposed through the Responses API for workloads that demand maximal fidelity, multi-step reasoning, extensive tool use and the largest context/throughput budgets OpenAI offers.

GPT-5.1 Chat

Context:400.0k

Input:$1/M

GPT-5.1 Chat is an instruction-tuned conversational language model for general-purpose chat, reasoning, and writing. It supports multi-turn dialogue, summarization, drafting, knowledge-base QA, and lightweight code assistance for in-app assistants, support automation, and workflow copilots. Technical highlights include chat-optimized alignment, controllable and structured outputs, and integration paths for tool invocation and retrieval workflows when available.

GPT-5.1

Input:$1/M

GPT-5.1 is a general-purpose instruction-tuned language model focused on text generation and reasoning across product workflows. It supports multi-turn dialogue, structured output formatting, and code-oriented tasks such as drafting, refactoring, and explanation. Typical uses include chat assistants, retrieval-augmented QA, data transformation, and agent-style automation with tools or APIs when supported. Technical highlights include text-centric modality, instruction following, JSON-style outputs, and compatibility with function calling in common orchestration frameworks.

GPT Image 1 mini

Context:2M

Input:$6.4/M

Output:$25.6/M

Cost-optimized version of GPT Image 1. It is a native Multimodal language model that accepts both text and image input and generates image output.

GPT-5

Context:400K

Input:$1/M

GPT-5 is OpenAI's most powerful coding model to date. It shows significant improvements in complex front-end generation and debugging large codebases. It can transform ideas into reality with intuitive and aesthetically pleasing results, creating beautiful and responsive websites, applications, and games with a keen sense of aesthetics, all from a single prompt. Early testers have also noted its design choices, with a deeper understanding of elements like spacing, typography, and white space.

GPT-5 nano

Context:400K

Input:$0.04/M

Output:$0.32/M

GPT-5 Nano is an artificial intelligence model provided by OpenAI.

GPT-5 mini

Context:400K

Input:$0.2/M

Output:$1.6/M

GPT-5 mini is OpenAI’s cost- and latency-optimized member of the GPT-5 family, intended to deliver much of GPT-5’s multimodal and instruction-following strengths at substantially lower cost for large-scale production use. It targets environments where throughput, predictable per-token pricing, and fast responses are the primary constraints while still providing strong general-purpose capabilities.

GPT 5.3

Coming soon

Output:$480/M

coming soon

GPT 4o Image

Per Request:$0.04

gpt-4o-image generate images as output, optionally using images as input

GPT-5.2

Input:$1.4/M

Output:$11.2/M

GPT-5.2 is a multi-flavored model suite (Instant, Thinking, Pro) engineered for better long-context understanding, stronger coding and tool use, and materially higher performance on professional “knowledge-work” benchmarks.

O3 Pro

Context:200K

Input:$16/M

Output:$64/M

OpenAI o3‑pro is a “pro” variant of the o3 reasoning model engineered to think longer and deliver the most dependable responses by employing private chain‑of‑thought reinforcement learning and setting new state‑of‑the‑art benchmarks across domains like science, programming, and business—while autonomously integrating tools such as web search, file analysis, Python execution, and visual reasoning within API.

TTS

Output:$12/M

OpenAI Text-to-Speech

Whisper-1

Input:$24/M

Speech to text, creating translations

tts-1

Output:$12/M

GPT-4o mini TTS

Input:$9.6/M

Output:$9.6/M

GPT-4o mini TTS is a neural text-to-speech model designed for natural, low-latency voice generation in user-facing applications. It converts text to natural-sounding speech with selectable voices, multi-format output, and streaming synthesis for responsive experiences. Typical uses include voice assistants, IVR and contact flows, product read-aloud, and media narration. Technical highlights include API-based streaming and export to common audio formats such as MP3 and WAV.

o1-pro-2025-03-19

Input:$120/M

Output:$480/M

GPT-4o Transcribe

GPT-4o Transcribe is an audio-to-text model for multilingual, low-latency speech recognition. It supports real-time streaming and batch transcription from common audio formats with punctuation and sentence segmentation. Typical uses include live captions, voice assistant input, meeting notes, and media or call recording transcription. Technical highlights include audio modality support, long-form processing, and APIs suited for interactive and server-side workflows.

GPT-4o mini Search Preview

Output:$240/M

GPT-4o mini Search Preview is a compact multimodal model in the GPT-4o family geared toward search-oriented interactions and retrieval workflows. It interprets and reformulates queries, synthesizes concise answers, and can ground responses via external search when integrated through tool/function calling. Typical uses include in-product search assistants, knowledge-base QA, e-commerce discovery, and query understanding for ranking and routing. Technical highlights include text-and-image inputs, instruction following, structured output formats, and tool use integration for RAG pipelines.

GPT-4o mini Audio Preview

Output:$240/M

GPT-4o mini Audio Preview is a compact multimodal model for building conversational audio applications. It supports speech input and output alongside text, enabling speech recognition, speech synthesis, and mixed text-audio dialogs with tool/function calling for structured actions. Typical uses include voice assistants, streaming transcription with summarization, IVR and call-bot workflows, and audio-enabled in-app helpers. Technical highlights include audio I/O, streaming responses, instruction following, and integration via chat and tools APIs.

GPT-4o mini Realtime Preview

Output:$240/M

GPT-4o mini Realtime Preview is a real-time multimodal model for interactive voice and visual experiences. It handles speech, text, and images with streaming input and output, plus tool/function calling for grounded actions. Typical uses include voice assistants, live call handling, real-time captioning, and visual question answering over camera or screen content. Technical highlights include bidirectional audio, vision understanding, streaming responses, and structured outputs via functions.

o1-2024-12-17

Output:$48/M

GPT-4o mini Audio

Input:$0.12/M

Output:$0.48/M

GPT-4o mini Audio is a multimodal model for speech and text interactions. It performs speech recognition, translation, and text-to-speech, follows instructions, and can call tools for structured actions with streaming responses. Typical uses include real-time voice assistants, live captioning and translation, call summarization, and voice-controlled applications. Technical highlights include audio input and output, streaming responses, function calling, and structured JSON output.

An Ada-based text embedding model optimized for various NLP tasks.

text-embedding-3-small

Input:$0.016/M

Output:$0.016/M

A small text embedding model for efficient processing.

text-embedding-3-large

Input:$0.104/M

Output:$0.104/M

A large text embedding model for a wide range of natural language processing tasks.

GPT Image 1

Input:$8/M

Output:$32/M

An advanced AI model for generating images from text descriptions.

dall-e-3

Per Request:$0.016

New version of DALL-E for image generation.

o4-mini

Input:$0.88/M

Output:$3.52/M

O4-mini is an artificial intelligence model provided by OpenAI.

o3-mini

Input:$0.88/M

Output:$3.52/M

O3-mini is an artificial intelligence model provided by OpenAI.

o3

Input:$1.6/M

Output:$6.4/M

O3 is an artificial intelligence model provided by OpenAI.

o1-pro

Input:$120/M

Output:$480/M

O1-pro is an artificial intelligence model provided by OpenAI.

o1

Output:$48/M

O1 is an artificial intelligence model provided by OpenAI.

gpt-oss-20b

Input:$0.08/M

Output:$0.32/M

gpt-oss-20b is an artificial intelligence model provided by cloudflare-workers-ai.

gpt-oss-120b

Input:$0.16/M

Output:$0.8/M

gpt-oss-120b is an artificial intelligence model provided by cloudflare-workers-ai.

GPT-4o mini

Input:$0.12/M

Output:$0.48/M

GPT-4o mini is an artificial intelligence model provided by OpenAI.

GPT-4o

GPT-4o is OpenAI's most advanced Multimodal model, faster and cheaper than GPT-4 Turbo, with stronger visual capabilities. This model has a 128K context and a knowledge cutoff of October 2023. Models in the 1106 series and above support tool_calls and function_call. This model supports a maximum context length of 128,000 tokens.

GPT-4.1 nano

Context:1.0M

Input:$0.08/M

Output:$0.32/M

GPT-4.1 nano is an artificial intelligence model provided by OpenAI. gpt-4.1-nano: Features a larger context window—supporting up to 1 million context tokens and capable of better utilizing that context through improved long-context understanding. Has an updated knowledge cutoff time of June 2024. This model supports a maximum context length of 1,047,576 tokens.

GPT 4.1 mini

Context:1.0M

Input:$0.32/M

Output:$1.28/M

GPT-4.1 mini is an artificial intelligence model provided by OpenAI. gpt-4.1-mini: A significant leap in small model performance, even beating GPT-4o in many benchmarks. It meets or exceeds GPT-4o in intelligence evaluation while reducing latency by nearly half and cost by 83%. This model supports a maximum context length of 1,047,576 tokens.

GPT-4.1

Context:1.0M

Input:$1.6/M

Output:$6.4/M

GPT-4.1 is an artificial intelligence model provided by OpenAI. gpt-4.1-nano: Features a larger context window—supporting up to 1 million context tokens and capable of better utilizing that context through improved long-context understanding. Has an updated knowledge cutoff time of June 2024. This model supports a maximum context length of 1,047,576 tokens.

GPT 6

Coming soon