© 2026 CometAPI · All rights reserved

GPT 5.1 Codex Max

Input: $1 / M tokens
Output: $8 / M tokens
Context: 400K tokens
Max Output: 128K tokens
GPT-5.1-Codex-Max is OpenAI’s purpose-built agentic coding model in the GPT-5.1 family, optimized to execute long-running software engineering workflows (refactors, multi-hour agent loops, terminal automation, test runs and code review) with higher reliability and token efficiency than its predecessors.
New
Commercial Use

What is GPT-5.1-Codex-Max?

GPT-5.1-Codex-Max is a Codex-family model built for agentic coding workflows, that is, autonomous multi-step engineering tasks such as repo-scale refactors, long debugging sessions, multi-hour agent loops, code review, and programmatic tool use. It is intended for developer workflows where the model must:

  • Maintain state across many edits and interactions;
  • Operate tools and terminals (run tests, compile, install, issue git commands) as part of an automated chain;
  • Produce patches, run tests, and provide traceable logs and citations for outputs.

Main features

  • Compaction & Multi-window Context: Natively trained to compact history and operate coherently across multiple context windows, enabling project-scale continuity.
  • Agentic tool use (terminal + tooling): Improved capability to run terminal sequences, install/build/test, and react to program outputs.
  • Higher token efficiency: Designed to allocate tokens more efficiently for small tasks while using longer reasoning runs for complex tasks.
  • Refactoring & large edits: Better at cross-file refactors, migrations and repository-level patches (OpenAI internal evaluations).
  • Reasoning effort modes: New reasoning effort tiers for longer compute-heavy reasoning (e.g., Extra High / xhigh for non-latency-sensitive jobs).
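
As a rough sketch of how an effort tier might be chosen per request, the helper below builds the keyword arguments for a Responses API call. It assumes that the `reasoning.effort` field of OpenAI's Responses API is passed through by CometAPI and that `xhigh` is accepted for this model; `build_request` and `ALLOWED_EFFORTS` are illustrative names, not part of any SDK.

```python
# Hypothetical helper: builds request arguments with a reasoning effort tier.
# Assumption: the endpoint accepts OpenAI's `reasoning.effort` field.
ALLOWED_EFFORTS = {"low", "medium", "high", "xhigh"}  # assumed tier names


def build_request(prompt: str, effort: str = "medium") -> dict:
    """Return keyword arguments suitable for client.responses.create(**kwargs)."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"unknown effort tier: {effort!r}")
    return {
        "model": "gpt-5.1-codex-max",
        "input": prompt,
        "reasoning": {"effort": effort},
    }


# A small edit gets a low tier; a long, non-latency-sensitive job gets xhigh.
quick = build_request("Rename the variable x to count in utils.py", effort="low")
deep = build_request("Plan a cross-file refactor of the payment module", effort="xhigh")
print(quick["reasoning"], deep["reasoning"])
```

The resulting dictionary can be unpacked into `client.responses.create(**quick)` with the client shown in the sample code on this page.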

Technical capabilities (what it does well)

  • Long-horizon refactoring & iterative loops: can sustain multi-hour project-scale refactors and debugging sessions (OpenAI reports runs of more than 24 hours in internal demos) by iterating, running tests, summarizing failures, and updating code.
  • Real-world bug fixing: strong performance on real-repo patching benchmarks (SWE-Bench Verified: OpenAI reports 77.9% for Codex-Max in xhigh/extra-effort settings).
  • Terminal/Tool proficiency: reads logs, invokes compilers/tests, edits files, creates PRs — i.e., functions as a terminal-native agent with explicit, inspectable tool calls.
  • Inputs accepted: standard text prompts plus code snippets, repository snapshots (via tool/IDE integrations), screenshots/windows in Codex surfaces where vision is enabled, and tool call requests (e.g., run npm test, open file, create PR).
  • Outputs produced: code patches (diffs or PRs), test reports, step-by-step run logs, natural-language explanations and annotated code review comments. When used as an agent, it can emit structured tool calls and follow-up actions.
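
The structured tool calls mentioned above can be handled with a small dispatcher. This is a minimal sketch that assumes the function-calling shape of OpenAI's Responses API (output items of type `function_call` carrying `name`, a JSON-string `arguments`, and `call_id`) reaches the caller unchanged through CometAPI. The `run_tests` tool, its handler, and `dispatch_tool_calls` are hypothetical.

```python
import json

# Hypothetical tool definition in the Responses API function-tool shape.
RUN_TESTS_TOOL = {
    "type": "function",
    "name": "run_tests",
    "description": "Run the project's test suite and return a summary.",
    "parameters": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}


def run_tests(path: str) -> str:
    """Stand-in handler; a real agent would invoke the test runner here."""
    return f"ran tests under {path}"


HANDLERS = {"run_tests": run_tests}


def dispatch_tool_calls(output_items: list) -> list:
    """Execute each function_call item and build function_call_output items."""
    results = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue  # skip messages, reasoning items, and other output types
        handler = HANDLERS.get(item["name"])
        if handler is None:
            output = f"error: unknown tool {item['name']}"
        else:
            args = json.loads(item["arguments"])  # arguments arrive as a JSON string
            output = handler(**args)
        results.append(
            {"type": "function_call_output", "call_id": item["call_id"], "output": output}
        )
    return results
```

In an agent loop, the returned items would be sent back as the next request's input so the model can react to the tool results.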

Benchmark performance (selected results & context)

  • SWE-bench Verified (n=500) — GPT-5.1-Codex (high): 73.7%; GPT-5.1-Codex-Max (xhigh): 77.9%. This metric evaluates real-world engineering tasks drawn from GitHub / open-source issues.
  • SWE-Lancer IC SWE: GPT-5.1-Codex: 66.3% → GPT-5.1-Codex-Max: 79.9% (as reported by OpenAI).
  • Terminal-Bench 2.0: GPT-5.1-Codex: 52.8% → GPT-5.1-Codex-Max: 58.1% (improvements on interactive terminal/tool-use evaluations).
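
To make the size of these gaps concrete, the short script below computes the percentage-point gain on each benchmark from the scores listed above. The dictionary layout and the `point_gains` helper are illustrative only.

```python
# Scores copied from the list above, in percent: (GPT-5.1-Codex, GPT-5.1-Codex-Max).
SCORES = {
    "SWE-bench Verified": (73.7, 77.9),
    "SWE-Lancer IC SWE": (66.3, 79.9),
    "Terminal-Bench 2.0": (52.8, 58.1),
}


def point_gains(scores: dict) -> dict:
    """Return the percentage-point gain per benchmark, rounded to one decimal."""
    return {name: round(new - old, 1) for name, (old, new) in scores.items()}


for name, gain in point_gains(SCORES).items():
    print(f"{name}: +{gain:.1f} points")
```

The gains come to 4.2, 13.6, and 5.3 percentage points respectively. The SWE-bench Verified figures compare different effort settings (high versus xhigh), so that gap reflects both the model and the effort tier.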

Limitations and failure modes

  1. Dual-use / cybersecurity risk: Enhanced ability to operate terminals and run tooling raises dual-use concerns (the model can assist in both defensive and offensive security work); OpenAI emphasizes staged access controls and monitoring.
  2. Not perfectly deterministic or correct: Even with stronger engineering performance, the model can propose incorrect patches or miss subtle code semantics (false positives/negatives in bug detection), so human review and CI testing remain essential.
  3. Cost and latency tradeoffs: High-effort modes (xhigh) consume more compute/time; long multi-hour agent loops consume credits or budget. Plan for cost and rate limits.
  4. Context guarantees vs effective continuity: Compaction enables project continuity, but exact guarantees about which tokens are preserved and how compaction affects rare corner cases are not a substitute for versioned repo snapshots and reproducible pipelines. Use compaction as an assistant, not a sole source-of-truth.
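
One way to plan for the cost tradeoff in item 3 is to track spend during a long agent run and stop once a budget is reached. The sketch below is an assumption-laden illustration: `BudgetGuard` is a hypothetical class, the default prices are the CometAPI rates listed on this page, and the token counts are assumed to come from the `usage` object of each response.

```python
from dataclasses import dataclass


@dataclass
class BudgetGuard:
    """Tracks cumulative spend in USD and signals when a run should stop."""

    limit_usd: float
    input_usd_per_m: float = 1.0   # listed CometAPI input price per million tokens
    output_usd_per_m: float = 8.0  # listed CometAPI output price per million tokens
    spent_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one request's token usage and return the running total."""
        self.spent_usd += (
            input_tokens * self.input_usd_per_m
            + output_tokens * self.output_usd_per_m
        ) / 1_000_000
        return self.spent_usd

    @property
    def exhausted(self) -> bool:
        """True once cumulative spend has reached the limit."""
        return self.spent_usd >= self.limit_usd
```

An agent loop would call `record(...)` after every response and break out of the loop when `exhausted` becomes true.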

Comparison with Claude Opus 4.5 and Gemini 3 Pro (high level)

  • Anthropic — Claude Opus 4.5: Community and press benchmarks generally place Opus 4.5 slightly ahead of Codex-Max on raw bug-fixing correctness (SWE-Bench), with strengths in scientific orchestration and very concise, token-efficient outputs. Opus is often priced higher per token but can be more token-efficient in practice. Codex-Max’s edge is long-horizon compaction, terminal tooling integration, and cost efficiency for long agent runs.
  • Google Gemini family (3 Pro etc.): Gemini variants remain strong on multimodal and general reasoning benchmarks; in the coding domain the results vary by harness. Codex-Max is purpose-built for agentic coding and integrates with DevTool workflows in ways generalist models are not by default.

How to access and use GPT-5.1 Codex Max API

Step 1: Sign Up for API Key

Log in to cometapi.com; if you do not have an account yet, register first. In the CometAPI console, go to the API token section of the personal center, click “Add Token”, and submit. Copy the generated token key (of the form sk-xxxxx); this is the API key used to authenticate your requests.
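
Once the token exists, a common pattern is to keep it out of source code and read it from the environment. The helper below is a minimal sketch; `load_api_key` is a hypothetical name, and `COMETAPI_KEY` is the environment variable used by the sample code on this page.

```python
import os


def load_api_key(env_var: str = "COMETAPI_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your CometAPI token."
        )
    return key
```

Calling `load_api_key()` at startup surfaces a missing key immediately rather than as an authentication error on the first request.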

Step 2: Send Requests to GPT-5.1-Codex-Max API

Select the “gpt-5.1-codex-max” endpoint to send the API request and set the request body. The request method and request body are described in the API doc on our website, which also provides an Apifox test for your convenience. Replace <YOUR_API_KEY> with your actual CometAPI key from your account. Developers call the model through the Responses API (POST /v1/responses) or the Chat endpoints.

Insert your question or request into the content field (the input field in Responses API requests); this is what the model will respond to.

Step 3: Retrieve and Verify Results

Once the request has been processed, the API responds with the task status and output data. Parse the response to get the generated answer.
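
As an illustration of this step, the function below pulls the generated text out of a raw JSON response. It assumes the body follows the OpenAI Responses API shape (a top-level `status`, and an `output` list whose `message` items hold `output_text` blocks); `extract_text` is a hypothetical helper, and the exact fields returned by CometAPI should be checked against its API doc.

```python
def extract_text(response: dict) -> str:
    """Return the concatenated output text of a completed response."""
    status = response.get("status")
    if status != "completed":
        raise RuntimeError(f"response not completed: {status!r}")
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # ignore reasoning items, tool calls, and similar entries
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block["text"])
    return "".join(parts)
```

When using the official SDK instead of raw JSON, the `output_text` property of the response object provides the same concatenated text.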

Pricing for GPT 5.1 Codex Max

Explore competitive pricing for GPT 5.1 Codex Max, designed to fit various budgets and usage needs. Our flexible plans ensure you only pay for what you use, making it easy to scale as your requirements grow. Discover how GPT 5.1 Codex Max can enhance your projects while keeping costs manageable.
          Comet Price (USD / M Tokens)    Official Price (USD / M Tokens)    Discount
Input     $1                              $1.25                              -20%
Output    $8                              $10                                -20%
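
To see what these rates mean for a given workload, the sketch below estimates the cost of a job under both price lists. The prices are the ones listed above; `cost_usd` and the example token counts are illustrative.

```python
# Prices from the table above, in USD per million tokens.
PRICES_USD_PER_M = {
    "comet": {"input": 1.00, "output": 8.00},
    "official": {"input": 1.25, "output": 10.00},
}


def cost_usd(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one workload under the given price list."""
    price = PRICES_USD_PER_M[provider]
    return (
        input_tokens * price["input"] + output_tokens * price["output"]
    ) / 1_000_000


# Hypothetical workload: 2M input tokens and 0.5M output tokens.
comet = cost_usd("comet", 2_000_000, 500_000)
official = cost_usd("official", 2_000_000, 500_000)
print(f"Comet: ${comet:.2f}  Official: ${official:.2f}  Ratio: {comet / official:.2f}")
```

For this example workload the estimate is $6.00 at the Comet rate against $7.50 at the official rate, a ratio of 0.80 that matches the listed 20% discount.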

Sample code and API for GPT 5.1 Codex Max

POST /v1/responses

Python Code Example

from openai import OpenAI
import os

# Get your CometAPI key from https://api.cometapi.com/console/token, and paste it here
COMETAPI_KEY = os.environ.get("COMETAPI_KEY") or "<YOUR_COMETAPI_KEY>"
BASE_URL = "https://api.cometapi.com/v1"

client = OpenAI(base_url=BASE_URL, api_key=COMETAPI_KEY)
response = client.responses.create(
    model="gpt-5.1-codex-max",
    input="Tell me a three sentence bedtime story about a unicorn.",
)

# Print only the generated text; use print(response) to inspect the full object
print(response.output_text)

JavaScript Code Example

import OpenAI from "openai";

// Get your CometAPI key from https://api.cometapi.com/console/token, and set it as the COMETAPI_KEY environment variable
const api_key = process.env.COMETAPI_KEY;
const base_url = "https://api.cometapi.com/v1";

const openai = new OpenAI({
  apiKey: api_key,
  baseURL: base_url,
});

const response = await openai.responses.create({
  model: "gpt-5.1-codex-max",
  input: "Tell me a three sentence bedtime story about a unicorn.",
});

// Print only the generated text; use console.log(response) to inspect the full object
console.log(response.output_text);

Curl Code Example

curl https://api.cometapi.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $COMETAPI_KEY" \
  -d '{
    "model": "gpt-5.1-codex-max",
    "input": "Tell me a three sentence bedtime story about a unicorn."
  }'