2025 CometAPI. All rights reserved.

GPT 4o Image

Per request: $0.04
gpt-4o-image generates images as output and can optionally take images as input.
Tags: New, Commercial use

Technical Specifications of gpt-4o-image

Specification | Details
Model ID | gpt-4o-image
Model Type | Multimodal image generation model
Input Modalities | Text, image
Output Modalities | Image
Primary Use Cases | Text-to-image generation, image-to-image generation, visual editing, creative asset production
Context Support | Text prompts with optional image inputs
Streaming | Not typically required for image output workflows
Tool / Function Calling | Not applicable for core image generation
Response Format | Generated image output, typically returned through API response payload or referenced asset data
Best For | Applications that need generated images from prompts, optionally guided by input images

What is gpt-4o-image?

gpt-4o-image is a multimodal image generation model available through CometAPI. It produces images as output and accepts optional image inputs alongside text prompts. It suits products that need to create visual content from natural language descriptions, transform existing images, or build image-driven creative workflows.

Because it can work from prompt-only input or combine prompt instructions with reference imagery, gpt-4o-image fits a wide range of use cases such as concept art generation, marketing creatives, product mockups, design exploration, and iterative visual editing. Through CometAPI, developers can access gpt-4o-image using a consistent API integration pattern across providers and models.

Main features of gpt-4o-image

  • Text-to-image generation: Create original images from natural language prompts for creative, design, and production workflows.
  • Image-conditioned generation: Use one or more input images to guide composition, style, subject matter, or transformations.
  • Visual iteration: Refine outputs across repeated requests by adjusting prompt details and image references.
  • Creative flexibility: Support a broad range of visual use cases, including illustrations, marketing assets, mockups, and conceptual design.
  • Multimodal prompting: Combine descriptive text with image inputs to achieve more controlled and context-aware results.
  • Developer-friendly access: Integrate gpt-4o-image through CometAPI’s unified model access layer and standardized API workflow.

How to access and integrate gpt-4o-image

Step 1: Sign Up for an API Key

Sign up on CometAPI and create an API key from the dashboard. After generating your key, store it securely and use it to authenticate requests to the CometAPI endpoint.
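
One way to keep the key out of source code is to read it from an environment variable at startup. The sketch below assumes the key is stored in COMETAPI_API_KEY (the variable name used in the curl example on this page); the helper name build_auth_headers is hypothetical and not part of any CometAPI SDK.

```python
import os


def build_auth_headers(env_var: str = "COMETAPI_API_KEY") -> dict:
    """Return request headers that carry the API key read from the environment.

    Hypothetical helper: the variable name mirrors the curl example,
    everything else is an illustrative assumption.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing early when the variable is missing tends to be easier to debug than an authentication error returned by the server.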

Step 2: Send Requests to the gpt-4o-image API

Use CometAPI’s OpenAI-compatible API format and set the model field to gpt-4o-image.

curl --request POST \
  --url https://api.cometapi.com/v1/responses \
  --header "Authorization: Bearer $COMETAPI_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "gpt-4o-image",
    "input": [
      {
        "role": "user",
        "content": [
          { "type": "input_text", "text": "Generate a clean modern product poster for a smartwatch on a soft studio background." }
        ]
      }
    ]
  }'
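
The same request can be sketched in Python using only the standard library. The URL, model name, and body mirror the curl example above; the function names build_request and send are hypothetical, and the timeout value is an arbitrary assumption rather than a documented limit.

```python
import json
import urllib.request

# Same endpoint as the curl example above.
API_URL = "https://api.cometapi.com/v1/responses"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    payload = {
        "model": "gpt-4o-image",
        "input": [
            {"role": "user", "content": [{"type": "input_text", "text": prompt}]}
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and parse the JSON body (timeout is an assumed value)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes the payload easy to inspect or unit-test without a network call.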

You can also include image inputs in the request when building image-to-image or guided generation workflows, depending on your application’s needs.
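
A guided-generation payload might look like the sketch below. It assumes the endpoint accepts the OpenAI Responses-style input_image content part with a base64 data URL; this page does not confirm that shape for gpt-4o-image, so treat the field names as an assumption to check against the CometAPI documentation. The helper names are hypothetical.

```python
import base64


def image_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as a data-URL content part (assumed input shape)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "input_image", "image_url": f"data:{mime_type};base64,{encoded}"}


def build_guided_payload(prompt: str, images: list) -> dict:
    """Combine a text prompt with zero or more reference images in one user turn."""
    content = [{"type": "input_text", "text": prompt}]
    content.extend(image_part(data) for data in images)
    return {
        "model": "gpt-4o-image",
        "input": [{"role": "user", "content": content}],
    }
```

Passing an empty list of images yields the same prompt-only body as the curl example, so one builder can serve both text-to-image and image-to-image calls.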

Step 3: Retrieve and Verify Results

Read the API response, extract the generated image result from the returned output structure, and verify that the image matches your prompt, formatting expectations, and application requirements before displaying it to end users or storing it in your system.
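
Part of the verification step can be automated. The sketch below assumes the image arrives as a base64 string somewhere in the response; where exactly it sits in the payload is not documented on this page, so extraction is left out. It checks well-known file signatures (PNG, JPEG, WebP) and reads a PNG's dimensions from its IHDR chunk. All function names are hypothetical.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def detect_format(data: bytes):
    """Return 'png', 'jpeg', or 'webp' based on leading magic bytes, else None."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def png_size(data: bytes):
    """Read (width, height) from the IHDR chunk that follows the PNG signature."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte type, then width and height.
    if detect_format(data) != "png" or len(data) < 24 or data[12:16] != b"IHDR":
        raise ValueError("not a PNG with a leading IHDR chunk")
    return struct.unpack(">II", data[16:24])


def decode_and_check(b64_image: str) -> bytes:
    """Decode a base64 image string and reject payloads that are not images."""
    data = base64.b64decode(b64_image, validate=True)
    if detect_format(data) is None:
        raise ValueError("decoded payload is not a recognised image format")
    return data
```

Signature checks only confirm that the bytes look like an image file; whether the picture actually matches the prompt still needs human or model-based review.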

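Features of GPT 4o Image

Explore the core features of GPT 4o Image, designed to improve performance and usability. Learn how these features can benefit your projects and improve the user experience.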

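Pricing for GPT 4o Image

Explore competitive pricing for GPT 4o Image, designed to fit a range of budgets and usage needs. Flexible plans mean you pay only for what you use and can scale as demand grows. See how GPT 4o Image can improve your projects while keeping costs under control.
Comet price (USD) | Official price (USD) | Discount
$0.04 per request | $0.05 per request | -20%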
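
The pricing figures above can be checked with a few lines of arithmetic. This sketch uses the two per-request prices listed on this page; the function names and the example request count are illustrative assumptions, and the estimate ignores any taxes, minimums, or failed-request billing rules that the page does not describe.

```python
from decimal import Decimal

# Per-request prices as listed on this page (USD).
COMET_PRICE = Decimal("0.04")
OFFICIAL_PRICE = Decimal("0.05")


def discount_percent(listed: Decimal, official: Decimal) -> Decimal:
    """Percentage saved relative to the official price."""
    return (official - listed) / official * 100


def estimate_cost(requests: int, price_per_request: Decimal = COMET_PRICE) -> Decimal:
    """Total cost in USD for a number of image requests at a flat per-request price."""
    return price_per_request * requests
```

With these prices, discount_percent(COMET_PRICE, OFFICIAL_PRICE) works out to 20, matching the listed discount, and 1,000 requests come to 40 USD at the Comet price versus 50 USD at the official price.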

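Sample Code and API for GPT 4o Image

Access complete sample code and API resources to simplify your GPT 4o Image integration. The documentation provides step-by-step guidance to help you get the most out of GPT 4o Image in your projects.
POST /v1/chat/completions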
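
This page lists /v1/chat/completions here, while the integration steps earlier use /v1/responses. A request body for the chat completions route could be sketched as below, assuming it follows the standard OpenAI chat completions message format (text and image_url content parts). Whether gpt-4o-image accepts this exact body on CometAPI is not confirmed by this page, and build_chat_payload is a hypothetical helper.

```python
def build_chat_payload(prompt: str, image_urls=()) -> dict:
    """Build a chat-completions style body with a prompt and optional image URLs."""
    content = [{"type": "text", "text": prompt}]
    for url in image_urls:
        # Each reference image is passed as an image_url content part.
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {
        "model": "gpt-4o-image",
        "messages": [{"role": "user", "content": content}],
    }
```

The body would be sent as JSON with the same Authorization header used for the other endpoint.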

More Models

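Nano Banana 2

Input: $0.4/M tokens
Output: $2.4/M tokens
Core capabilities at a glance:
  • Resolution: up to 4K (4096×4096), on par with Pro.
  • Reference image consistency: supports up to 14 reference images (10 objects + 4 characters) to maintain style and character consistency.
  • Extreme aspect ratios: adds 1:4, 4:1, 1:8, and 8:1, suited to long images, posters, and banners.
  • Text rendering: advanced text generation for infographics and marketing poster layouts.
  • Search enhancement: integrates Google Search + Image Search.
  • Grounding: built-in thinking process; reasons about complex prompts before generating.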
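
Doubao Seedream 5

Per request: $0.028
Seedream 5.0 Lite is a unified multimodal image generation model with deep-thinking and online search capabilities, with across-the-board upgrades in understanding, reasoning, and generation.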
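
FLUX 2 MAX

Per request: $0.008
FLUX.2 [max] is the top-tier visual intelligence model from Black Forest Labs (BFL), built for production-grade workflows: marketing, product photography, e-commerce, creative production pipelines, and any application that needs consistent character or product identity, accurate text rendering, and photorealistic detail at multi-megapixel resolution. Its architecture is designed for strong prompt adherence, multi-reference fusion (up to ten input images), and grounded generation (the ability to incorporate up-to-date web context when producing images).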
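
Black Forest Labs/FLUX 2 MAX

Per request: $0.056
FLUX.2 [max] is the flagship, highest-quality variant in the FLUX.2 family from Black Forest Labs (BFL). It is positioned as a professional-grade text-to-image generation and image editing model, with a focus on maximum fidelity, prompt adherence, and editing consistency across characters, objects, lighting, and color. Model catalogs from BFL and its partners describe FLUX.2 [max] as the top-tier FLUX.2 variant, with features such as multi-reference editing and grounded generation.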
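
GPT Image 1.5

Input: $6.4/M tokens
Output: $25.6/M tokens
GPT-Image-1.5 is an image model in OpenAI's GPT Image family. It is a natively multimodal GPT model designed to generate images from text prompts and to make high-fidelity edits to input images while closely following user instructions.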
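
Doubao Seedream 4.5

Per request: $0.032
Seedream 4.5 is a multimodal image model from ByteDance/Seed (text-to-image plus image editing) that focuses on production-grade image fidelity, stronger prompt adherence, and substantially improved editing consistency (subject preservation, text and typography rendering, and facial realism).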