OpenAI 模型 - CometAPI

GPT-5.4 nano

上下文:400,000

GPT-5.4 nano 专为速度和成本最为关键的任务而设计，例如分类、数据提取、排序以及子智能体。

GPT-5.4 mini

上下文:400,000

GPT-5.4 mini 将 GPT-5.4 的优势融入到一款更快速、更高效、专为大规模工作负载设计的模型中。

GPT-5.4 pro

上下文:1,050,000

可生成更智能、更精准回复的 GPT-5.4 版本。

GPT-5.4

上下文:1,050,000

GPT-5.4 是面向复杂专业工作的前沿模型。Reasoning.effort 支持：none（默认）、low、medium、high 和 xhigh。

GPT-5.3 Chat

ChatGPT 使用的 GPT-5.3 Instant 模型

Sora 2 Pro

Sora 2 Pro 是我们最先进、最强大的媒体生成模型，可生成带有同步音频的视频。它可以根据自然语言或图像创建细致、动态的视频片段。

Sora 2

超级强大的视频生成模型，带有音效，支持聊天格式。

gpt-realtime-1.5

上下文:32,000

用于音频输入与音频输出的最佳语音模型。

gpt-audio-1.5

用于在 Chat Completions 中实现音频输入与音频输出的最佳语音模型。

GPT 5.3 Codex

上下文:400,000

GPT-5.3-Codex 专为在 Codex 或类似环境中的代理式编码任务进行优化。GPT-5.3-Codex 支持 low、medium、high 和 xhigh 的推理强度设置。

GPT-5.2 Codex

上下文:400,000

GPT-5.2-Codex 是 GPT-5.2 的升级版，针对 Codex 或类似环境中的代理型编码任务进行了优化。GPT-5.2-Codex 支持 low、medium、high 和 xhigh 的推理力度设置。

GPT Image 1.5

GPT-Image-1.5 是 GPT Image 系列中的 OpenAI 图像模型。它是一个原生多模态的 GPT 模型，旨在根据文本提示生成图像，并对输入图像进行高保真编辑，同时严格遵循用户指令。

GPT-5.2 Pro

上下文:400,000

输出:$134.4/M

gpt-5.2-pro 是 OpenAI 的 GPT-5.2 系列中能力最强的生产级成员，通过 Responses API 对外提供，适用于需要最高保真度、多步推理、广泛工具使用，以及 OpenAI 所提供的最大上下文/吞吐量预算的工作负载。

GPT-5.2 Chat

上下文:128,000

gpt-5.2-chat-latest 是 OpenAI 的 GPT-5.2 系列中针对聊天优化的快照（在 ChatGPT 中的品牌名为 GPT-5.2 Instant）。它适用于需要兼顾速度、长上下文处理、多模态输入以及可靠对话行为的交互/聊天用例。

GPT 5.1 Codex Max

GPT 5.1 Codex Max

GPT-5.1-Codex-Max 是 OpenAI 在 GPT-5.1 系列中专为智能体编程打造的模型，优化用于以比其前代更高的可靠性与令牌效率执行长时间运行的软件工程工作流（重构、持续数小时的智能体循环、终端自动化、测试运行与代码审查）。

GPT 5.1 Codex

GPT 5.1 Codex

GPT-5.1-Codex 是一款专注于代码生成与理解的高性能大型语言模型，在复杂编程任务、代码推理与生产级应用方面具备更强能力。

GPT-5.1 Chat

GPT-5.1 Chat

上下文:400.0k

GPT-5.1 Chat 是一种经指令微调的对话式语言模型，适用于通用聊天、推理与写作。它支持多轮对话、摘要、起草、知识库问答，并为应用内助理、客服自动化和工作流副驾提供轻量级代码辅助。技术亮点包括面向聊天优化的对齐、可控且结构化的输出，以及在可用时用于工具调用和检索工作流的集成路径。

GPT-5.1

GPT-5.1

GPT-5.1 是一款通用型指令微调语言模型，专注于跨产品工作流程的文本生成与推理。它支持多轮对话、结构化输出格式，以及面向代码的任务，如起草、重构和解释。典型用例包括聊天助手、检索增强的问答、数据转换，以及在支持的情况下基于工具或 API 的代理式自动化。技术亮点包括以文本为中心的模态、指令遵循、JSON 风格的输出，以及与常见编排框架中的函数调用兼容。

GPT Image 1 mini

GPT Image 1 mini

Cost-optimized version of GPT Image 1. It is a native Multimodal language model that accepts both text and image input and generates image output.

GPT-5 nano

GPT-5 nano

GPT-5 Nano 是由 OpenAI 提供的人工智能模型。

GPT-5 mini

GPT-5 mini

GPT-5 mini is OpenAI’s cost- and latency-optimized member of the GPT-5 family, intended to deliver much of GPT-5’s multimodal and instruction-following strengths at substantially lower cost for large-scale production use. It targets environments where throughput, predictable per-token pricing, and fast responses are the primary constraints while still providing strong general-purpose capabilities.

GPT 5 Codex

GPT 5 Codex

GPT-5-Codex 是一款高性能的大型语言模型，专注于代码生成与理解，并在复杂编程任务、代码推理和生产级应用方面具备增强的能力。

GPT 5 Chat

GPT 5 Chat

GPT-5 Chat (latest) is an artificial intelligence model provided by OpenAI.

GPT-5

GPT-5

GPT-5 是 OpenAI 迄今为止最强大的代码模型。它在复杂前端生成和大型代码库调试方面表现出显著改进。它能以直观且悦目的结果将想法变为现实，凭借敏锐的审美，创建精美且响应迅速的网站、应用和游戏，而这一切只需一条提示即可完成。早期测试者也注意到其在设计选择上的表现，对间距、排版和留白等元素有更深入的理解。

Whisper-1

Speech to text, creating translations

tts-1-hd-1106

tts-1-hd-1106

tts-1-hd

tts-1-hd

tts-1-1106

tts-1-1106

tts-1

tts-1

TTS

OpenAI Text-to-Speech

text-embedding-ada-002

text-embedding-ada-002

An Ada-based text embedding model optimized for various NLP tasks.

text-embedding-3-small

text-embedding-3-small

输入:$0.016/M

输出:$0.016/M

A small text embedding model for efficient processing.

text-embedding-3-large

text-embedding-3-large

输入:$0.104/M

输出:$0.104/M

A large text embedding model for a wide range of natural language processing tasks.

omni-moderation-latest

omni-moderation-latest

每次请求:$0.0016

omni-moderation-2024-09-26

omni-moderation-2024-09-26

每次请求:$0.0016

o4-mini-deep-research

o4-mini-deep-research

O4-Mini-Deep-Research is OpenAI’s latest agentic reasoning model, combining the lightweight o4-mini backbone with the advanced Deep Research framework. Designed to deliver fast, cost-efficient deep information synthesis, it enables developers and researchers to perform automated web searches, data analysis, and chain-of-thought reasoning within a single API call.

o4-mini

o4-mini

O4-mini is an artificial intelligence model provided by OpenAI.

O3 Pro

O3 Pro

OpenAI o3‑pro is a “pro” variant of the o3 reasoning model engineered to think longer and deliver the most dependable responses by employing private chain‑of‑thought reinforcement learning and setting new state‑of‑the‑art benchmarks across domains like science, programming, and business—while autonomously integrating tools such as web search, file analysis, Python execution, and visual reasoning within API.

o3-mini

o3-mini

O3-mini is an artificial intelligence model provided by OpenAI.

o3

o3

O3 is an artificial intelligence model provided by OpenAI.

o1-pro-all

o1-pro-all

o1-pro-2025-03-19

o1-pro-2025-03-19

o1-pro

o1-pro

O1-pro is an artificial intelligence model provided by OpenAI.

o1-preview-all

o1-preview-all

每次请求:$0.16

o1-preview-2024-09-12

o1-preview-2024-09-12

o1-preview

o1-preview

O1-preview is an artificial intelligence model provided by OpenAI.

o1-mini-all

o1-mini-all

每次请求:$0.08

o1-mini-2024-09-12

o1-mini-2024-09-12

o1-mini

o1-mini

O1-mini is an artificial intelligence model provided by OpenAI.

o1-all

o1-all

每次请求:$0.16

o1-2024-12-17

o1-2024-12-17

o1

o1

O1 is an artificial intelligence model provided by OpenAI.

gpt-realtime-mini

gpt-realtime-mini

An economical version of the real-time GPT—capable of responding to Audio and text input in real-time via WebRTC, WebSocket, or SIP connections.

gpt-oss-20b

gpt-oss-20b

gpt-oss-20b 是由 cloudflare-workers-ai 提供的人工智能模型。

gpt-oss-120b

gpt-oss-120b

gpt-oss-120b 是由 cloudflare-workers-ai 提供的人工智能模型。

GPT Image 1

GPT Image 1

一种先进的 AI 模型，用于根据文本描述生成图像。

GPT 5.3

GPT 5.3

GPT-5.2

上下文:400,000

GPT-5.2 是一款多变体的模型套件（Instant、Thinking、Pro），专为更好的长上下文理解、更强的编码与工具使用，以及在专业“知识工作”基准测试上实现实质性更高的性能而设计。

GPT-4o Transcribe

GPT-4o Transcribe

GPT-4o Transcribe is an audio-to-text model for multilingual, low-latency speech recognition. It supports real-time streaming and batch transcription from common audio formats with punctuation and sentence segmentation. Typical uses include live captions, voice assistant input, meeting notes, and media or call recording transcription. Technical highlights include audio modality support, long-form processing, and APIs suited for interactive and server-side workflows.

GPT-4o Search

GPT-4o Search

GPT-4o Search is a GPT-4o-based multimodal model configured for search-augmented reasoning and grounded, current answers. It follows instructions and uses web search tools to retrieve, evaluate, and synthesize external information, with source context when available. Typical uses include research assistance, fact-checking, news and trend monitoring, and answering time-sensitive queries. Technical highlights include tool/function calling for browsing and retrieval, long-context handling, and structured outputs suitable for citations and links.

GPT-4o Realtime

GPT-4o Realtime

The Realtime API allows developers to build low-latency, Multimodal experiences, including speech-to-speech functionality. Text and Audio processed by the Realtime API are priced separately. This model supports a maximum context length of 128,000 tokens.

GPT-4o mini TTS

GPT-4o mini TTS

GPT-4o mini TTS is a neural text-to-speech model designed for natural, low-latency voice generation in user-facing applications. It converts text to natural-sounding speech with selectable voices, multi-format output, and streaming synthesis for responsive experiences. Typical uses include voice assistants, IVR and contact flows, product read-aloud, and media narration. Technical highlights include API-based streaming and export to common audio formats such as MP3 and WAV.

GPT-4o mini Search Preview

GPT-4o mini Search Preview

GPT-4o mini Search Preview is a compact multimodal model in the GPT-4o family geared toward search-oriented interactions and retrieval workflows. It interprets and reformulates queries, synthesizes concise answers, and can ground responses via external search when integrated through tool/function calling. Typical uses include in-product search assistants, knowledge-base QA, e-commerce discovery, and query understanding for ranking and routing. Technical highlights include text-and-image inputs, instruction following, structured output formats, and tool use integration for RAG pipelines.

GPT-4o mini Realtime Preview

GPT-4o mini Realtime Preview

GPT-4o mini Realtime Preview is a real-time multimodal model for interactive voice and visual experiences. It handles speech, text, and images with streaming input and output, plus tool/function calling for grounded actions. Typical uses include voice assistants, live call handling, real-time captioning, and visual question answering over camera or screen content. Technical highlights include bidirectional audio, vision understanding, streaming responses, and structured outputs via functions.

GPT-4o mini Audio

GPT-4o mini Audio is a multimodal model for speech and text interactions. It performs speech recognition, translation, and text-to-speech, follows instructions, and can call tools for structured actions with streaming responses. Typical uses include real-time voice assistants, live captioning and translation, call summarization, and voice-controlled applications. Technical highlights include audio input and output, streaming responses, function calling, and structured JSON output.

GPT-4o mini Audio Preview

GPT-4o mini Audio Preview

GPT-4o mini Audio Preview is a compact multimodal model for building conversational audio applications. It supports speech input and output alongside text, enabling speech recognition, speech synthesis, and mixed text-audio dialogs with tool/function calling for structured actions. Typical uses include voice assistants, streaming transcription with summarization, IVR and call-bot workflows, and audio-enabled in-app helpers. Technical highlights include audio I/O, streaming responses, instruction following, and integration via chat and tools APIs.

GPT-4o mini

GPT-4o mini

GPT-4o mini is an artificial intelligence model provided by OpenAI.

GPT 4o Image

GPT 4o Image

每次请求:$0.04

gpt-4o-image 可生成图像作为输出，并可选择使用图像作为输入

GPT-4o Audio Preview

GPT-4o Audio Preview

This model supports a maximum context length of 128,000 tokens.

gpt-4o-all

gpt-4o-all

<div>GPT-4o is OpenAI's most advanced Multimodal model, faster and cheaper than GPT-4 Turbo, with stronger visual capabilities. This model has a 128K context and a knowledge cutoff of October 2023. Models in the 1106 series and above support tool_calls and function_call.</div> This model supports a maximum context length of 128,000 tokens.

GPT-4o

GPT-4o

GPT-4o 是 OpenAI 最先进的多模态模型，比 GPT-4 Turbo 更快且成本更低，并具备更强的视觉能力。该模型拥有 128K 上下文窗口，知识截止日期为 2023 年 10 月。1106 系列及以上的模型支持 tool_calls 和 function_call。该模型支持的最大上下文长度为 128,000 令牌。

GPT-4.1 nano

GPT-4.1 nano

GPT-4.1 nano is an artificial intelligence model provided by OpenAI. gpt-4.1-nano: Features a larger context window—supporting up to 1 million context tokens and capable of better utilizing that context through improved long-context understanding. Has an updated knowledge cutoff time of June 2024. This model supports a maximum context length of 1,047,576 tokens.

GPT 4.1 mini

GPT 4.1 mini

GPT-4.1 mini is an artificial intelligence model provided by OpenAI. gpt-4.1-mini: A significant leap in small model performance, even beating GPT-4o in many benchmarks. It meets or exceeds GPT-4o in intelligence evaluation while reducing latency by nearly half and cost by 83%. This model supports a maximum context length of 1,047,576 tokens.

GPT-4.1

GPT-4.1

GPT-4.1 is an artificial intelligence model provided by OpenAI. gpt-4.1-nano: Features a larger context window—supporting up to 1 million context tokens and capable of better utilizing that context through improved long-context understanding. Has an updated knowledge cutoff time of June 2024. This model supports a maximum context length of 1,047,576 tokens.

gpt-4-vision-preview

gpt-4-vision-preview

This model supports a maximum context length of 128,000 tokens.

gpt-4-vision

gpt-4-vision

This model supports a maximum context length of 128,000 tokens.

gpt-4-v

gpt-4-v

每次请求:$0.04

gpt-4-turbo-preview

gpt-4-turbo-preview

<div>gpt-4-turbo-preview Upgraded version, stronger code generation capabilities, reduced model "laziness", fixed non-English UTF-8 generation issues.</div> This model supports a maximum context length of 128,000 tokens.

gpt-4-turbo-2024-04-09

gpt-4-turbo-2024-04-09

<div>gpt-4-turbo-2024-04-09 Upgraded version, stronger code generation capabilities, reduced model "laziness", fixed non-English UTF-8 generation issues.</div> This model supports a maximum context length of 128,000 tokens.

gpt-4-turbo

gpt-4-turbo

GPT-4 Turbo 是 OpenAI 提供的人工智能模型。

gpt-4-search

gpt-4-search

每次请求:$0.04

gpt-4-gizmo-*

gpt-4-gizmo-*

gpt-4-gizmo

gpt-4-gizmo

gpt-4-dalle

gpt-4-dalle

每次请求:$0.04

gpt-4-all

gpt-4-all

gpt-4-32k

gpt-4-32k

GPT-4 32K is an artificial intelligence model provided by Azure.

gpt-4-1106-preview

gpt-4-1106-preview

gpt-4-0613

gpt-4-0613

gpt-4-0314

gpt-4-0314

gpt-4-0125-preview

gpt-4-0125-preview

gpt-4

gpt-4

GPT-4 is an artificial intelligence model provided by OpenAI.

gpt-3.5-turbo-0125

gpt-3.5-turbo-0125

GPT-3.5 Turbo 0125 is an artificial intelligence model provided by OpenAI. A pure official high-speed GPT-3.5 series, supporting tools_call. This model supports a maximum context length of 4096 tokens.

gpt-3.5-turbo

gpt-3.5-turbo

GPT-3.5 Turbo is an artificial intelligence model provided by OpenAI. A pure official high-speed GPT-3.5 series, supporting tools_call. This model supports a maximum context length of 4096 tokens.

dall-e-3

dall-e-3

每次请求:$0.016

New version of DALL-E for image generation.

dall-e-2

dall-e-2

An AI model that generates images from text descriptions.

Codex Mini

Codex Mini

ChatGPT-4o

ChatGPT-4o

基于最新迭代的 GPT-4o，这是一款支持文本、图像、音频和视频输入/输出的多模态大语言模型（LLM）。