
GPT 4o Image

Per request: $0.04
gpt-4o-image can generate images as output and can optionally take images as input.
New
Commercial use

Technical Specifications of gpt-4o-image

Specification | Details
Model ID | gpt-4o-image
Model Type | Multimodal image generation model
Input Modalities | Text, image
Output Modalities | Image
Primary Use Cases | Text-to-image generation, image-to-image generation, visual editing, creative asset production
Context Support | Text prompts with optional image inputs
Streaming | Not typically required for image output workflows
Tool / Function Calling | Not applicable for core image generation
Response Format | Generated image output, typically returned through API response payload or referenced asset data
Best For | Applications that need generated images from prompts, optionally guided by input images

What is gpt-4o-image?

gpt-4o-image is a multimodal image generation model exposed through CometAPI that is designed to generate images as output, with support for optional image inputs alongside text prompts. It is well suited for products that need to create visual content from natural language descriptions, transform existing images, or build image-driven creative workflows.

Because it can work from prompt-only input or combine prompt instructions with reference imagery, gpt-4o-image fits a wide range of use cases such as concept art generation, marketing creatives, product mockups, design exploration, and iterative visual editing. Through CometAPI, developers can access gpt-4o-image using a consistent API integration pattern across providers and models.

Main features of gpt-4o-image

  • Text-to-image generation: Create original images from natural language prompts for creative, design, and production workflows.
  • Image-conditioned generation: Use one or more input images to guide composition, style, subject matter, or transformations.
  • Visual iteration: Refine outputs across repeated requests by adjusting prompt details and image references.
  • Creative flexibility: Support a broad range of visual use cases, including illustrations, marketing assets, mockups, and conceptual design.
  • Multimodal prompting: Combine descriptive text with image inputs to achieve more controlled and context-aware results.
  • Developer-friendly access: Integrate gpt-4o-image through CometAPI’s unified model access layer and standardized API workflow.

How to access and integrate gpt-4o-image

Step 1: Sign Up for an API Key

Sign up on CometAPI and create an API key from the dashboard. After generating your key, store it securely and use it to authenticate requests to the CometAPI endpoint.

Step 2: Send Requests to the gpt-4o-image API

Use CometAPI’s OpenAI-compatible API format and set the model field to gpt-4o-image.

curl --request POST \
  --url https://api.cometapi.com/v1/responses \
  --header "Authorization: Bearer $COMETAPI_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "gpt-4o-image",
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Generate a clean modern product poster for a smartwatch on a soft studio background." }
        ]
      }
    ]
  }'

You can also include image inputs in the request when building image-to-image or guided generation workflows, depending on your application’s needs.
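As a rough illustration of a guided-generation request, the sketch below builds the same payload as the curl example and optionally attaches a reference image. The `input_image` content part with a base64 data URL follows the OpenAI Responses API convention; whether CometAPI accepts it for gpt-4o-image is an assumption, not something this page confirms. The helper names `build_payload` and `send` are hypothetical.

```python
import base64
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.cometapi.com/v1/responses"


def build_payload(prompt, image_bytes=None, mime_type="image/png"):
    """Build a request body with a text prompt and an optional reference image."""
    content = [{"type": "input_text", "text": prompt}]
    if image_bytes is not None:
        encoded = base64.b64encode(image_bytes).decode("ascii")
        # ASSUMPTION: image inputs use the Responses-style "input_image" part
        # carrying a data URL. Check the CometAPI docs before relying on this.
        content.append({
            "type": "input_image",
            "image_url": f"data:{mime_type};base64,{encoded}",
        })
    return {
        "model": "gpt-4o-image",
        "input": [{"role": "user", "content": content}],
    }


def send(payload, api_key):
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call might look like `send(build_payload("Restyle this photo as a watercolor.", open("photo.png", "rb").read()), api_key)`, with the key read from the `COMETAPI_API_KEY` environment variable used in the curl example.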

Step 3: Retrieve and Verify Results

Read the API response, extract the generated image result from the returned output structure, and verify that the image matches your prompt, formatting expectations, and application requirements before displaying it to end users or storing it in your system.
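This page does not document the exact response schema, so the sketch below avoids assuming field names. It walks the decoded JSON, collects any string that decodes to bytes with a recognized image signature (PNG, JPEG, or WebP), and separately collects HTTP(S) URLs that may reference a hosted asset. The function names are hypothetical, and the signature check only confirms the file type, not that the image matches the prompt.

```python
import base64
import binascii


def sniff_image_type(data):
    """Return 'png', 'jpeg', or 'webp' based on leading bytes, else None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def decode_image_string(value):
    """Return (kind, bytes) if value is a base64 or data-URL image, else None."""
    if value.startswith("data:image/") and "," in value:
        value = value.split(",", 1)[1]
    try:
        data = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None
    kind = sniff_image_type(data)
    return (kind, data) if kind else None


def find_images(node):
    """Walk a decoded JSON response; return (embedded_images, urls)."""
    images, urls = [], []
    stack = [node]
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            stack.extend(item.values())
        elif isinstance(item, list):
            stack.extend(item)
        elif isinstance(item, str):
            decoded = decode_image_string(item)
            if decoded:
                images.append(decoded)
            elif item.startswith(("http://", "https://")):
                urls.append(item)
    return images, urls
```

If both lists come back empty, the response likely signals an error or uses a structure this sketch does not anticipate, and it is worth inspecting the raw payload before showing anything to end users.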

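Features of GPT 4o Image

Learn about the core capabilities of GPT 4o Image, which help improve performance, usability, and the overall experience.

Pricing for GPT 4o Image

See competitive pricing for GPT 4o Image, designed to fit different budgets and usage needs, with flexible plans that scale with demand.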
Comet price (USD) | Official price (USD) | Discount
$0.04 per request | $0.05 per request | -20%
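The per-request prices above make budgeting a matter of simple multiplication. The sketch below is a minimal estimator that uses only the two listed prices; the function name `estimate` and the request volume are illustrative assumptions.

```python
from decimal import Decimal

# Per-request prices listed on this page (USD).
COMET_PRICE = Decimal("0.04")
OFFICIAL_PRICE = Decimal("0.05")

# Relative discount implied by the two prices.
DISCOUNT = (COMET_PRICE - OFFICIAL_PRICE) / OFFICIAL_PRICE


def estimate(requests):
    """Return (comet_cost, official_cost, savings) in USD for a request count."""
    comet = COMET_PRICE * requests
    official = OFFICIAL_PRICE * requests
    return comet, official, official - comet
```

For a hypothetical volume of 1,000 requests, `estimate(1000)` yields $40 at the Comet price against $50 at the official price, a saving of $10, which matches the listed -20% discount.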

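Sample code and API for GPT 4o Image

Get complete sample code and API resources to simplify GPT 4o Image integration, with step-by-step guidance to help you get the most out of the model.
POST /v1/chat/completions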

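More models

Nano Banana 2

Input: $0.4/M
Output: $2.4/M
Core capabilities overview:
  • Resolution: up to 4K (4096×4096), on par with Pro.
  • Reference image consistency: up to 14 reference images (10 objects + 4 characters), maintaining style and character consistency.
  • Extreme aspect ratios: adds 1:4, 4:1, 1:8, and 8:1 ratios, suited to long images, posters, and banners.
  • Text rendering: advanced text generation suited to infographics and marketing poster layouts.
  • Search enhancement: integrates Google Search and image search.
  • Grounding: built-in thinking process that reasons over complex prompts before generating.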
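
Doubao Seedream 5

Per request: $0.028
Seedream 5.0 Lite is a unified multimodal image generation model with deep thinking and online search capabilities, offering across-the-board upgrades in understanding, reasoning, and generation.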
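
FLUX 2 MAX

Per request: $0.008
FLUX.2 [max] is a top-tier visual intelligence model from Black Forest Labs (BFL), built for production workflows: marketing, product photography, e-commerce, creative pipelines, and any application that needs consistent character or product identity, precise text rendering, and photorealistic detail at multi-megapixel resolution. Its architecture is engineered for strong prompt following, supports multi-reference fusion (up to 10 input images), and enables grounded generation (the ability to incorporate up-to-date web context when generating images).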
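
Black Forest Labs/FLUX 2 MAX

Per request: $0.056
FLUX.2 [max] is the flagship, highest-quality variant of the FLUX.2 family from Black Forest Labs (BFL). It is positioned as a professional-grade text-to-image generation and image editing model focused on maximum fidelity, prompt adherence, and editing consistency across characters, objects, lighting, and color. BFL and its partners' registries describe FLUX.2 [max] as the top-tier variant of the FLUX.2 family, with features such as multi-reference editing and grounded generation.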
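
GPT Image 1.5

Input: $6.4/M
Output: $25.6/M
GPT-Image-1.5 is an OpenAI image model in the GPT Image family. It is a natively multimodal GPT model designed to generate images from text prompts and to perform high-fidelity edits on input images while closely following user instructions.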
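
Doubao Seedream 4.5

Per request: $0.032
Seedream 4.5 is a multimodal image model from ByteDance/Seed (text-to-image plus image editing) focused on production-grade image fidelity, stronger prompt adherence, and substantially improved editing consistency (subject preservation, text and typography rendering, and facial realism).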