O

GPT-5.4 nano

Context:400,000
Input:$0.16/M
Output:$1/M
GPT-5.4 nano is designed for tasks where speed and cost matter most, such as classification, data extraction, ranking, and sub-agents.
O

GPT-5.4 mini

Context:400,000
Input:$0.6/M
Output:$3.6/M
GPT-5.4 mini brings the strengths of GPT-5.4 to a faster, more efficient model designed for high-volume workloads.
O

GPT-5.4 pro

Context:1,050,000
Input:$24/M
Output:$144/M
A version of GPT-5.4 that produces smarter and more precise responses.
O

GPT-5.4

Context:1,050,000
Input:$2/M
Output:$12/M
GPT-5.4 is the frontier model for complex professional work. The reasoning.effort setting supports none (default), low, medium, high, and xhigh.
O

GPT-5.3 Chat

Input:$1.4/M
Output:$11.2/M
The GPT-5.3 Instant model used in ChatGPT.
O

Sora 2 Pro

Per Second:$0.24
Sora 2 Pro is our most advanced and powerful media generation model, capable of generating videos with synchronized audio. It can create detailed, dynamic video clips from natural language or images.
O

Sora 2

Per Second:$0.08
A powerful video generation model with sound effects. It supports the chat format.
O

gpt-realtime-1.5

Context:32,000
Input:$3.2/M
Output:$12.8/M
The best voice model for audio in, audio out.
O

gpt-audio-1.5

Input:$2/M
Output:$8/M
The best voice model for audio in, audio out with Chat Completions.
O

GPT 5.3 Codex

Context:400,000
Input:$1.4/M
Output:$11.2/M
GPT-5.3-Codex is optimized for agentic coding tasks in Codex or similar environments. GPT-5.3-Codex supports low, medium, high, and xhigh reasoning effort settings.
O

GPT-5.2 Codex

Context:400,000
Input:$1.4/M
Output:$11.2/M
GPT-5.2-Codex is an upgraded version of GPT-5.2 optimized for agentic coding tasks in Codex or similar environments. GPT-5.2-Codex supports low, medium, high, and xhigh reasoning effort settings.
O

GPT Image 1.5

Input:$6.4/M
Output:$25.6/M
GPT-Image-1.5 is OpenAI’s image model in the GPT Image family. It is a natively multimodal GPT model designed to generate images from text prompts and to perform high-fidelity edits of input images while following user instructions closely.
O

GPT-5.2 Pro

Context:400,000
Input:$16.8/M
Output:$134.4/M
gpt-5.2-pro is the highest-capability, production-oriented member of OpenAI’s GPT-5.2 family, exposed through the Responses API for workloads that demand maximal fidelity, multi-step reasoning, extensive tool use and the largest context/throughput budgets OpenAI offers.
O

GPT-5.2 Chat

Context:128,000
Input:$1.4/M
Output:$11.2/M
gpt-5.2-chat-latest is the Chat-optimized snapshot of OpenAI’s GPT-5.2 family (branded in ChatGPT as GPT-5.2 Instant). It is the model for interactive/chat use cases that need a blend of speed, long-context handling, multimodal inputs and reliable conversational behaviour.
O

GPT 5.1 Codex Max

Context:400K
Input:$1/M
Output:$8/M
GPT-5.1-Codex-Max is OpenAI’s purpose-built agentic coding model in the GPT-5.1 family, optimized to execute long-running software engineering workflows (refactors, multi-hour agent loops, terminal automation, test runs and code review) with higher reliability and token efficiency than its predecessors.
O

GPT 5.1 Codex

Context:400K
Input:$1/M
Output:$8/M
GPT-5.1-Codex is a high-performance large language model focused on code generation and understanding, with enhanced capabilities for complex programming tasks, code reasoning, and production-level applications.
O

GPT-5.1 Chat

Context:400K
Input:$1/M
Output:$8/M
GPT-5.1 Chat is an instruction-tuned conversational language model for general-purpose chat, reasoning, and writing. It supports multi-turn dialogue, summarization, drafting, knowledge-base QA, and lightweight code assistance for in-app assistants, support automation, and workflow copilots. Technical highlights include chat-optimized alignment, controllable and structured outputs, and integration paths for tool invocation and retrieval workflows when available.
O

GPT-5.1

Input:$1/M
Output:$8/M
GPT-5.1 is a general-purpose instruction-tuned language model focused on text generation and reasoning across product workflows. It supports multi-turn dialogue, structured output formatting, and code-oriented tasks such as drafting, refactoring, and explanation. Typical uses include chat assistants, retrieval-augmented QA, data transformation, and agent-style automation with tools or APIs when supported. Technical highlights include text-centric modality, instruction following, JSON-style outputs, and compatibility with function calling in common orchestration frameworks.
X

GPT Image 1 mini

Context:2M
Input:$6.4/M
Output:$25.6/M
A cost-optimized version of GPT Image 1. It is a natively multimodal language model that accepts both text and image input and generates image output.
O

GPT-5 nano

Context:400K
Input:$0.04/M
Output:$0.32/M
GPT-5 Nano is an artificial intelligence model provided by OpenAI.
O

GPT-5 mini

Context:400K
Input:$0.2/M
Output:$1.6/M
GPT-5 mini is OpenAI’s cost- and latency-optimized member of the GPT-5 family, intended to deliver much of GPT-5’s multimodal and instruction-following strengths at substantially lower cost for large-scale production use. It targets environments where throughput, predictable per-token pricing, and fast responses are the primary constraints while still providing strong general-purpose capabilities.
O

GPT 5 Codex

Context:400K
Input:$1/M
Output:$8/M
GPT-5-Codex is a high-performance large language model focused on code generation and understanding, with enhanced capabilities for complex programming tasks, code reasoning, and production-level applications.
O

GPT 5 Chat

Context:400K
Input:$1/M
Output:$8/M
GPT-5 Chat (latest) is an artificial intelligence model provided by OpenAI.
O

GPT-5

Context:400K
Input:$1/M
Output:$8/M
GPT-5 is OpenAI's most powerful coding model to date. It shows significant improvements in complex front-end generation and debugging large codebases. It can transform ideas into reality with intuitive and aesthetically pleasing results, creating beautiful and responsive websites, applications, and games with a keen sense of aesthetics, all from a single prompt. Early testers have also noted its design choices, with a deeper understanding of elements like spacing, typography, and white space.
O

Whisper-1

Input:$24/M
Output:$24/M
Speech-to-text transcription and translation.
O

tts-1-hd-1106

Input:$24/M
Output:$24/M
O

tts-1-hd

Input:$24/M
Output:$24/M
O

tts-1-1106

Input:$12/M
Output:$12/M
O

tts-1

Input:$12/M
Output:$12/M
O

TTS

Input:$12/M
Output:$12/M
OpenAI Text-to-Speech
O

text-embedding-ada-002

Input:$0.08/M
Output:$0.08/M
An Ada-based text embedding model optimized for various NLP tasks.
O

text-embedding-3-small

Input:$0.016/M
Output:$0.016/M
A small text embedding model for efficient processing.
O

text-embedding-3-large

Input:$0.104/M
Output:$0.104/M
A large text embedding model for a wide range of natural language processing tasks.
O

omni-moderation-latest

Per Request:$0.0016
O

omni-moderation-2024-09-26

Per Request:$0.0016
O

o4-mini-deep-research

Context:200K
Input:$1.6/M
Output:$6.4/M
O4-Mini-Deep-Research is OpenAI’s latest agentic reasoning model, combining the lightweight o4-mini backbone with the advanced Deep Research framework. Designed to deliver fast, cost-efficient deep information synthesis, it enables developers and researchers to perform automated web searches, data analysis, and chain-of-thought reasoning within a single API call.
O

o4-mini

Input:$0.88/M
Output:$3.52/M
O4-mini is an artificial intelligence model provided by OpenAI.
O

O3 Pro

Context:200K
Input:$16/M
Output:$64/M
OpenAI o3-pro is a "pro" variant of the o3 reasoning model, engineered to think longer and deliver the most dependable responses. It employs private chain-of-thought reinforcement learning, sets new state-of-the-art benchmarks across domains such as science, programming, and business, and autonomously integrates tools such as web search, file analysis, Python execution, and visual reasoning within the API.
O

o3-mini

Input:$0.88/M
Output:$3.52/M
O3-mini is an artificial intelligence model provided by OpenAI.
O

o3

Input:$1.6/M
Output:$6.4/M
O3 is an artificial intelligence model provided by OpenAI.
O

o1-pro-all

Input:$120/M
Output:$480/M
O

o1-pro-2025-03-19

Input:$120/M
Output:$480/M
O

o1-pro

Input:$120/M
Output:$480/M
O1-pro is an artificial intelligence model provided by OpenAI.
O

o1-preview-all

Per Request:$0.16
O

o1-preview-2024-09-12

Input:$12/M
Output:$48/M
O

o1-preview

Input:$12/M
Output:$48/M
O1-preview is an artificial intelligence model provided by OpenAI.
O

o1-mini-all

Per Request:$0.08
O

o1-mini-2024-09-12

Input:$0.88/M
Output:$3.52/M
O

o1-mini

Input:$0.88/M
Output:$3.52/M
O1-mini is an artificial intelligence model provided by OpenAI.
O

o1-all

Per Request:$0.16
O

o1-2024-12-17

Input:$12/M
Output:$48/M
O

o1

Input:$12/M
Output:$48/M
O1 is an artificial intelligence model provided by OpenAI.
O

gpt-realtime-mini

Input:$0.48/M
Output:$0.96/M
An economical version of the realtime GPT model, capable of responding to audio and text input in real time via WebRTC, WebSocket, or SIP connections.
C

gpt-oss-20b

Input:$0.08/M
Output:$0.32/M
gpt-oss-20b is an artificial intelligence model provided by cloudflare-workers-ai.
C

gpt-oss-120b

Input:$0.16/M
Output:$0.8/M
gpt-oss-120b is an artificial intelligence model provided by cloudflare-workers-ai.
O

GPT Image 1

Input:$8/M
Output:$32/M
An advanced AI model for generating images from text descriptions.
O

GPT 5.3

Coming soon
Input:$60/M
Output:$480/M
coming soon
O

GPT-5.2

Context:400,000
Input:$1.4/M
Output:$11.2/M
GPT-5.2 is a multi-flavored model suite (Instant, Thinking, Pro) engineered for better long-context understanding, stronger coding and tool use, and materially higher performance on professional “knowledge-work” benchmarks.
O

GPT-4o Transcribe

Input:$60/M
Output:$240/M
GPT-4o Transcribe is an audio-to-text model for multilingual, low-latency speech recognition. It supports real-time streaming and batch transcription from common audio formats with punctuation and sentence segmentation. Typical uses include live captions, voice assistant input, meeting notes, and media or call recording transcription. Technical highlights include audio modality support, long-form processing, and APIs suited for interactive and server-side workflows.
O

GPT-4o Search

Input:$60/M
Output:$240/M
GPT-4o Search is a GPT-4o-based multimodal model configured for search-augmented reasoning and grounded, current answers. It follows instructions and uses web search tools to retrieve, evaluate, and synthesize external information, with source context when available. Typical uses include research assistance, fact-checking, news and trend monitoring, and answering time-sensitive queries. Technical highlights include tool/function calling for browsing and retrieval, long-context handling, and structured outputs suitable for citations and links.
O

GPT-4o Realtime

Input:$60/M
Output:$240/M
The Realtime API allows developers to build low-latency, multimodal experiences, including speech-to-speech functionality. Text and audio processed by the Realtime API are priced separately. This model supports a maximum context length of 128,000 tokens.
O

GPT-4o mini TTS

Input:$9.6/M
Output:$9.6/M
GPT-4o mini TTS is a neural text-to-speech model designed for natural, low-latency voice generation in user-facing applications. It converts text to natural-sounding speech with selectable voices, multi-format output, and streaming synthesis for responsive experiences. Typical uses include voice assistants, IVR and contact flows, product read-aloud, and media narration. Technical highlights include API-based streaming and export to common audio formats such as MP3 and WAV.
O

GPT-4o mini Search Preview

Input:$60/M
Output:$240/M
GPT-4o mini Search Preview is a compact multimodal model in the GPT-4o family geared toward search-oriented interactions and retrieval workflows. It interprets and reformulates queries, synthesizes concise answers, and can ground responses via external search when integrated through tool/function calling. Typical uses include in-product search assistants, knowledge-base QA, e-commerce discovery, and query understanding for ranking and routing. Technical highlights include text-and-image inputs, instruction following, structured output formats, and tool use integration for RAG pipelines.
O

GPT-4o mini Realtime Preview

Input:$60/M
Output:$240/M
GPT-4o mini Realtime Preview is a real-time multimodal model for interactive voice and visual experiences. It handles speech, text, and images with streaming input and output, plus tool/function calling for grounded actions. Typical uses include voice assistants, live call handling, real-time captioning, and visual question answering over camera or screen content. Technical highlights include bidirectional audio, vision understanding, streaming responses, and structured outputs via functions.
O

GPT-4o mini Audio

Input:$0.12/M
Output:$0.48/M
GPT-4o mini Audio is a multimodal model for speech and text interactions. It performs speech recognition, translation, and text-to-speech, follows instructions, and can call tools for structured actions with streaming responses. Typical uses include real-time voice assistants, live captioning and translation, call summarization, and voice-controlled applications. Technical highlights include audio input and output, streaming responses, function calling, and structured JSON output.
O

GPT-4o mini Audio Preview

Input:$60/M
Output:$240/M
GPT-4o mini Audio Preview is a compact multimodal model for building conversational audio applications. It supports speech input and output alongside text, enabling speech recognition, speech synthesis, and mixed text-audio dialogs with tool/function calling for structured actions. Typical uses include voice assistants, streaming transcription with summarization, IVR and call-bot workflows, and audio-enabled in-app helpers. Technical highlights include audio I/O, streaming responses, instruction following, and integration via chat and tools APIs.
O

GPT-4o mini

Input:$0.12/M
Output:$0.48/M
GPT-4o mini is an artificial intelligence model provided by OpenAI.
O

GPT 4o Image

Per Request:$0.04
gpt-4o-image generates images as output, optionally using images as input.
O

GPT-4o Audio Preview

Input:$60/M
Output:$240/M
This model supports a maximum context length of 128,000 tokens.
O

gpt-4o-all

Input:$2/M
Output:$8/M
GPT-4o is OpenAI's most advanced multimodal model, faster and cheaper than GPT-4 Turbo, with stronger visual capabilities. This model has a knowledge cutoff of October 2023. Models in the 1106 series and above support tool_calls and function_call. This model supports a maximum context length of 128,000 tokens.
O

GPT-4o

Input:$2/M
Output:$8/M
GPT-4o is OpenAI's most advanced multimodal model, faster and cheaper than GPT-4 Turbo, with stronger visual capabilities. This model has a knowledge cutoff of October 2023. Models in the 1106 series and above support tool_calls and function_call. This model supports a maximum context length of 128,000 tokens.
O

GPT-4.1 nano

Context:1.0M
Input:$0.08/M
Output:$0.32/M
GPT-4.1 nano is an artificial intelligence model provided by OpenAI. It features a larger context window, supporting up to 1 million context tokens, and makes better use of that context through improved long-context understanding. Its knowledge cutoff is June 2024. This model supports a maximum context length of 1,047,576 tokens.
O

GPT 4.1 mini

Context:1.0M
Input:$0.32/M
Output:$1.28/M
GPT-4.1 mini is an artificial intelligence model provided by OpenAI. It is a significant leap in small-model performance, even beating GPT-4o in many benchmarks. It meets or exceeds GPT-4o in intelligence evaluations while reducing latency by nearly half and cost by 83%. This model supports a maximum context length of 1,047,576 tokens.
O

GPT-4.1

Context:1.0M
Input:$1.6/M
Output:$6.4/M
GPT-4.1 is an artificial intelligence model provided by OpenAI. It features a larger context window, supporting up to 1 million context tokens, and makes better use of that context through improved long-context understanding. Its knowledge cutoff is June 2024. This model supports a maximum context length of 1,047,576 tokens.
O

gpt-4-vision-preview

Input:$8/M
Output:$32/M
This model supports a maximum context length of 128,000 tokens.
O

gpt-4-vision

Input:$8/M
Output:$24/M
This model supports a maximum context length of 128,000 tokens.
O

gpt-4-v

Per Request:$0.04
O

gpt-4-turbo-preview

Input:$8/M
Output:$24/M
gpt-4-turbo-preview is an upgraded version with stronger code generation capabilities, reduced model "laziness", and fixes for non-English UTF-8 generation issues. This model supports a maximum context length of 128,000 tokens.
O

gpt-4-turbo-2024-04-09

Input:$8/M
Output:$24/M
gpt-4-turbo-2024-04-09 is an upgraded version with stronger code generation capabilities, reduced model "laziness", and fixes for non-English UTF-8 generation issues. This model supports a maximum context length of 128,000 tokens.
O

gpt-4-turbo

Input:$8/M
Output:$24/M
GPT-4 Turbo is an artificial intelligence model provided by OpenAI.
O

gpt-4-search

Per Request:$0.04
O

gpt-4-gizmo-*

Input:$24/M
Output:$48/M
O

gpt-4-gizmo

Input:$24/M
Output:$48/M
O

gpt-4-dalle

Per Request:$0.04
O

gpt-4-all

Input:$24/M
Output:$48/M
A

gpt-4-32k

Input:$48/M
Output:$96/M
GPT-4 32K is an artificial intelligence model provided by Azure.
O

gpt-4-1106-preview

Input:$8/M
Output:$16/M
O

gpt-4-0613

Input:$24/M
Output:$48/M
O

gpt-4-0314

Input:$24/M
Output:$48/M
O

gpt-4-0125-preview

Input:$8/M
Output:$16/M
O

gpt-4

Input:$24/M
Output:$48/M
GPT-4 is an artificial intelligence model provided by OpenAI.
O

gpt-3.5-turbo-0125

Input:$0.4/M
Output:$1.2/M
GPT-3.5 Turbo 0125 is an artificial intelligence model provided by OpenAI. It is an official high-speed GPT-3.5 series model that supports tool_calls. This model supports a maximum context length of 4,096 tokens.
O

gpt-3.5-turbo

Input:$0.4/M
Output:$1.2/M
GPT-3.5 Turbo is an artificial intelligence model provided by OpenAI. It is an official high-speed GPT-3.5 series model that supports tool_calls. This model supports a maximum context length of 4,096 tokens.
O

dall-e-3

Per Request:$0.016
New version of DALL-E for image generation.
O

dall-e-2

Input:$8/M
Output:$32/M
An AI model that generates images from text descriptions.
O

Codex Mini

Input:$1.2/M
Output:$4.8/M
O

ChatGPT-4o

Input:$4/M
Output:$12/M
Based on the latest iteration of GPT-4o, a multimodal large language model (LLM) that supports text, image, audio, and video input/output.
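
The Input and Output figures in this listing are quoted in US dollars per million tokens, and a few models are priced per second or per request instead. As a rough illustration of how a per-token price turns into a request cost, here is a minimal Python sketch. The table keys, the helper names estimate_cost and per_second_cost, and the token counts are assumptions made for this example. They are not part of any official SDK, and the keys are display names from the listing rather than confirmed API model IDs.

```python
from decimal import Decimal

# (input, output) prices in USD per million tokens, copied from the listing.
# Keys are the display names used above, not guaranteed API model IDs.
PRICES_PER_MILLION = {
    "GPT-5.4 nano": (Decimal("0.16"), Decimal("1")),
    "GPT-5.4 mini": (Decimal("0.6"), Decimal("3.6")),
    "GPT-5.4": (Decimal("2"), Decimal("12")),
}

MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_price * input_tokens + output_price * output_tokens) / MILLION


def per_second_cost(price_per_second: Decimal, seconds: int) -> Decimal:
    """Estimate the USD cost of a model that is priced per second of output."""
    return price_per_second * seconds


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    print(estimate_cost("GPT-5.4 nano", 200_000, 50_000))
    # A 10-second clip at the listed Sora 2 rate of $0.08 per second.
    print(per_second_cost(Decimal("0.08"), 10))
```

Decimal is used instead of float so that small per-token amounts add up without binary rounding noise. Actual billing may differ, for example where text and audio tokens are priced separately.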