© 2026 CometAPI · All rights reserved

Gemini 3 Pro

Input:$1.6/M
Output:$9.6/M
Context:1M
Max Output:64k
Gemini 3 Pro is a general-purpose model in the Gemini family, available in preview for evaluation and prototyping. It supports instruction following, multi-turn reasoning, and code and data tasks, with structured outputs and tool/function calling for workflow automation. Typical uses include chat assistants, summarization and rewriting, retrieval-augmented QA, data extraction, and lightweight coding help across apps and services. Technical highlights include API-based deployment, streaming responses, safety controls, and integration readiness, with multimodal capabilities depending on preview configuration.
New
Commercial Use

Gemini 3 Pro (Preview) is Google/DeepMind’s newest flagship multimodal reasoning model in the Gemini 3 family. It is positioned as their “most intelligent model yet,” designed for deep reasoning, agentic workflows, advanced coding, and long-context multimodal understanding (text, images, audio, video, code and tool integrations).

Key features

  • Modalities: Text, image, video, audio, PDFs (and structured tool outputs).
  • Agentic/tooling: Built-in function calling, search-as-tool, code execution, URL context, and support for orchestrating multi-step agents. Thought-signature mechanism preserves multi-step reasoning across calls.
  • Coding & “vibe coding”: Optimized for front-end generation, interactive UI generation, and agentic coding (it tops relevant leaderboards reported by Google). It’s marketed as their strongest “vibe-coding” model yet.
  • New developer controls: thinking_level (low|high) to trade off cost/latency vs reasoning depth, and media_resolution controls multimodal fidelity per image or video frame. These help balance performance, latency, and cost.
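
These two controls can be expressed directly in the request’s generation config. Below is a minimal sketch of such a request body; the field names (`thinkingConfig.thinkingLevel`, `mediaResolution`) are assumed from the public Gemini v1beta REST conventions and should be checked against the current API reference for the preview model:

```python
import json

def build_generate_request(prompt, thinking_level="high", media_resolution=None):
    """Assemble a generateContent-style body with Gemini 3 controls.

    thinking_level trades reasoning depth for latency/cost; media_resolution
    (when given) caps per-image/per-frame fidelity for multimodal inputs.
    """
    generation_config = {"thinkingConfig": {"thinkingLevel": thinking_level}}
    if media_resolution is not None:
        generation_config["mediaResolution"] = media_resolution
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": generation_config,
    }

# A latency-sensitive call: shallow reasoning, low media fidelity.
body = build_generate_request(
    "Describe this frame",
    thinking_level="low",
    media_resolution="MEDIA_RESOLUTION_LOW",
)
print(json.dumps(body, indent=2))
```

The same body works unchanged as the `-d` payload of the curl example further down this page.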

Benchmark performance

  • Gemini 3 Pro achieved first place on the LMArena leaderboard with a score of 1501, surpassing Grok 4.1 Thinking’s 1484 and also leading Claude Sonnet 4.5 and Opus 4.1.
  • It also achieved first place in the WebDevArena programming arena with a score of 1487.
  • In the Humanity’s Last Exam academic-reasoning benchmark it scored 37.5% (without tools); on GPQA Diamond science questions, 91.9%; and in the MathArena Apex math competition, 23.4%, setting a new record.
  • On multimodal benchmarks it scored 81% on MMMU-Pro and 87.6% on Video-MMMU video comprehension.

Technical details & architecture

  • “Thinking level” parameter: Gemini 3 exposes a thinking_level control that lets developers trade off depth of internal reasoning vs latency/cost. The model treats thinking_level as a relative allowance for internal multi-step reasoning rather than a strict token guarantee. Default is typically high for Pro. This is an explicit new control for developers to tune multi-step planning and chain-of-thought depth.
  • Structured outputs & tools: The model supports structured JSON outputs and can be combined with built-in tools (Google Search grounding, URL context, code execution, etc.). Some structured-output+tools features are preview-only for gemini-3-pro-preview.
  • Multimodal and agentic integrations: Gemini 3 Pro is explicitly built for agentic workflows (tooling + multiple agents over code/terminals/browser).
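
A request that combines a JSON response schema with a built-in tool might look like the sketch below. The field names (`responseMimeType`, `responseSchema`, the `googleSearch` tool entry) are assumed from the public Gemini REST API; exact availability for `gemini-3-pro-preview` should be verified against the docs:

```python
import json

# Sketch: ask for search-grounded results, constrained to a JSON array
# of {title, date} objects rather than free-form prose.
body = {
    "contents": [{"parts": [{"text": "List three recent Gemini 3 announcements."}]}],
    "tools": [{"googleSearch": {}}],  # built-in Google Search grounding
    "generationConfig": {
        "responseMimeType": "application/json",
        "responseSchema": {
            "type": "ARRAY",
            "items": {
                "type": "OBJECT",
                "properties": {
                    "title": {"type": "STRING"},
                    "date": {"type": "STRING"},
                },
                "required": ["title"],
            },
        },
    },
}
print(json.dumps(body))
```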

Limitations & known caveats

  1. Not perfect factuality — hallucinations remain possible. Despite strong factuality improvements claimed by Google, grounded verification and human review are still necessary in high-stakes settings (legal, medical, financial).
  2. Long-context performance varies by task. Support for a 1M input window is a hard capability, but empirical effectiveness can drop on some benchmarks at extreme lengths (observed pointwise declines at 1M on some long-context tests).
  3. Cost & latency trade-offs. Large contexts and higher thinking_level settings increase compute, latency and cost; pricing tiers apply based on token volumes. Use thinking_level and chunking strategies to manage costs.
  4. Safety & content filters. Google continues to apply safety policies and moderation layers; certain content and actions remain restricted or will trigger refusal modes.
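
One practical chunking strategy for caveat 3 is to split long inputs on paragraph boundaries against a rough token budget before sending them. The ~4-characters-per-token heuristic below is an assumption, not a tokenizer:

```python
def chunk_text(text, max_tokens=100_000, chars_per_token=4):
    """Greedy paragraph-boundary chunker under a rough token budget.

    Uses a crude chars-per-token heuristic; a single paragraph longer
    than the budget still becomes one oversized chunk.
    """
    budget = max_tokens * chars_per_token
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > budget:
            chunks.append(current)
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current)
    return chunks

doc = "\n\n".join("Paragraph %d " % i + "word " * 40 for i in range(6))
chunks = chunk_text(doc, max_tokens=125, chars_per_token=4)  # ~500-char budget
print(len(chunks))  # -> 3
```

Each chunk can then be summarized independently at a low thinking_level, with a final pass merging the partial summaries.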

How Gemini 3 Pro Preview compares to other top models

High-level comparison (qualitative, based on preview results):

Against Gemini 2.5 Pro: Step-change improvements in reasoning, agentic tool use, and multimodal integration; much larger context handling and better long-form understanding. DeepMind shows consistent gains across academic reasoning, coding, and multimodal tasks.

Against GPT-5.1 and Claude Sonnet 4.5 (as reported): On Google/DeepMind’s benchmark slate Gemini 3 Pro is presented as leading on several agentic, multimodal, and long-context metrics (see Terminal-Bench, MMMU-Pro, AIME). Comparative results vary by task.


Typical and high-value use cases

  • Large document / book summarization & Q&A: long context support makes it attractive for legal, research, and compliance teams.
  • Code understanding & generation at repo scale: integration with coding toolchains and improved reasoning helps large codebase refactors and automated code review workflows.
  • Multimodal product assistants: image + text + audio workflows (customer support that ingests screenshots, call snippets, and documents).
  • Media generation & editing (photo → video): earlier Gemini family features now include Veo / Flow-style photo→video capabilities; preview suggests deeper multimedia generation for prototypes and media workflows.

How to access Gemini 3 Pro API

Step 1: Sign Up for API Key

Log in to cometapi.com; if you are not a user yet, please register first. In your CometAPI console, open the API token page in the personal center, click “Add Token”, and copy the generated token key (it starts with sk-). This key is the access credential for the API.

Step 2: Send Requests to Gemini 3 Pro API

Select the “gemini-3-pro” endpoint and set the request body; the request method and body format are documented in our website’s API doc, which also provides an Apifox test page for your convenience. Replace <YOUR_API_KEY> with your actual CometAPI key from your account. The base URL follows the Gemini Generating Content and Chat format.

Insert your question or request into the content field; this is what the model will respond to.

Step 3: Retrieve and Verify Results

The API responds with the task status and output data; parse this response to extract the generated answer.
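
A minimal sketch of this parsing step, assuming the standard generateContent response shape (`candidates[0].content.parts[*].text`):

```python
def extract_text(response):
    """Pull generated text from a generateContent-style response dict.

    Returns None when no candidates came back (e.g. the request was
    blocked by safety filters), instead of raising an IndexError.
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return None
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(p.get("text", "") for p in parts) or None

# A minimal response shaped like the API returns:
sample = {"candidates": [{"content": {"parts": [{"text": "Hello"}]},
                          "finishReason": "STOP"}]}
print(extract_text(sample))  # -> Hello
```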

FAQ

What is the context window and output limit for Gemini 3 Pro?

Gemini 3 Pro supports a 1 million token input context window with up to 64,000 tokens of output, making it ideal for analyzing entire codebases or lengthy documents.

How does the thinking_level parameter work in Gemini 3 Pro?

Gemini 3 Pro uses dynamic thinking by default. Set thinking_level to 'low' for faster responses when complex reasoning isn't needed, or 'high' (default) to maximize reasoning depth for complex tasks.

Does Gemini 3 Pro support Google Search grounding?

Yes, Gemini 3 Pro supports Google Search grounding, File Search, Code Execution, and URL Context tools. Note that Google Maps grounding and Computer Use are not yet supported in Gemini 3.

What makes Gemini 3 Pro different from Gemini 2.5 Pro?

Gemini 3 Pro offers stepwise improvements in agentic workflows and autonomous coding. It uses thought signatures for reasoning context across API calls, and has a knowledge cutoff of January 2025.

Can Gemini 3 Pro combine structured outputs with built-in tools?

Yes, Gemini 3 models allow combining structured outputs (JSON schema) with built-in tools like Google Search, URL Context, and Code Execution in the same request.

Why should I keep temperature at 1.0 for Gemini 3 Pro?

Google strongly recommends keeping temperature at the default 1.0. Lower values may cause unexpected looping or degraded performance on mathematical and complex reasoning tasks.

What are thought signatures and why are they important?

Thought signatures are encrypted representations of the model's internal reasoning. For function calling, they're strictly enforced—missing signatures return 400 errors. Official SDKs handle them automatically.
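
When building the raw request yourself rather than via an SDK, the key rule is to echo the model’s functionCall part, with its signature, back unchanged before attaching the tool result. A sketch; the `thoughtSignature` field name is assumed from the Gemini REST API, and `get_weather` is a hypothetical tool:

```python
def build_function_response_turns(function_call_part, result):
    """Build the follow-up conversation turns for a function-calling round.

    The model's functionCall part, including any thoughtSignature it
    carries, is echoed back verbatim; the tool result follows it.
    """
    name = function_call_part["functionCall"]["name"]
    return [
        {"role": "model", "parts": [function_call_part]},
        {"role": "user",
         "parts": [{"functionResponse": {"name": name, "response": result}}]},
    ]

# A functionCall part as the model might return it (signature is opaque):
call = {"functionCall": {"name": "get_weather", "args": {"city": "Paris"}},
        "thoughtSignature": "opaque-base64-blob"}
turns = build_function_response_turns(call, {"tempC": 12})
```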

Features for Gemini 3 Pro

  • Model id (preview): `gemini-3-pro-preview`
  • Input types: Text, Image, Video, Audio, PDF
  • Output: Text
  • Context / token limits: Input ≈ 1,048,576 tokens; Output ≤ 65,536 tokens
  • Knowledge cutoff: January 2025 (uses Search Grounding for newer information)
  • Capabilities (selected): function calling, code execution, file search, structured outputs, search grounding
  • Not supported: audio generation, image generation, live API, image segmentation, Google Maps grounding (some features differ from Gemini 2.5)

Pricing for Gemini 3 Pro

Explore competitive pricing for Gemini 3 Pro, designed to fit various budgets and usage needs. Our flexible plans ensure you only pay for what you use, making it easy to scale as your requirements grow. Discover how Gemini 3 Pro can enhance your projects while keeping costs manageable.

gemini-3-pro (same price across variants shown)

| Model family | Variant (model name) | Input price (USD / 1M tokens) | Output price (USD / 1M tokens) |
|---|---|---|---|
| gemini-3-pro | gemini-3-pro-preview | $1.60 | $9.60 |
| gemini-3-pro | gemini-3-pro-preview-thinking | $1.60 | $9.60 |
| gemini-3-pro | gemini-3-pro-all | $1.60 | $9.60 |
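
At these rates, per-request cost is simple arithmetic; a small estimator with the table’s prices as defaults:

```python
def estimate_cost(input_tokens, output_tokens,
                  input_per_m=1.60, output_per_m=9.60):
    """Estimated USD cost for one call at the listed gemini-3-pro rates."""
    return input_tokens / 1e6 * input_per_m + output_tokens / 1e6 * output_per_m

# e.g. a 200k-token document summarized into 2k tokens of output:
print(round(estimate_cost(200_000, 2_000), 4))  # -> 0.3392
```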

Sample code and API for Gemini 3 Pro

POST /v1/chat/completions
POST /v1beta/models/{model}:{operator}

Python Code Example

from google import genai
import os

# Get your CometAPI key from https://api.cometapi.com/console/token, and paste it here
COMETAPI_KEY = os.environ.get("COMETAPI_KEY") or "<YOUR_COMETAPI_KEY>"
BASE_URL = "https://api.cometapi.com"

client = genai.Client(
    http_options={"api_version": "v1beta", "base_url": BASE_URL},
    api_key=COMETAPI_KEY,
)

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Explain how AI works in a few words",
)

print(response.text)

JavaScript Code Example

// Get your CometAPI key from https://api.cometapi.com/console/token, and paste it here
const api_key = process.env.COMETAPI_KEY;
const base_url = "https://api.cometapi.com/v1beta";
const model = "gemini-3-pro-preview";
const operator = "generateContent";

async function main() {
  const response = await fetch(`${base_url}/models/${model}:${operator}`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: api_key,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [{ text: "Explain how AI works in a few words" }],
        },
      ],
    }),
  });

  const data = await response.json();
  console.log(data.candidates[0].content.parts[0].text);
}

await main();

Curl Code Example

curl "https://api.cometapi.com/v1beta/models/gemini-3-pro-preview:generateContent" \
  -H "Authorization: $COMETAPI_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [
      {
        "parts": [
          {
            "text": "Explain how AI works in a few words"
          }
        ]
      }
    ]
  }'

Versions of Gemini 3 Pro

Gemini 3 Pro is offered as multiple snapshots for several reasons: output can change after model updates, so older snapshots preserve consistency; snapshots give developers a transition period to adapt and migrate; and different snapshots may correspond to global or regional endpoints to optimize user experience. For detailed differences between versions, please refer to the official documentation.
| Model id | Description | Availability | Request |
|---|---|---|---|
| gemini-3-pro-all | Unofficial technology; generation can be unstable | ✅ | Chat format |
| gemini-3-pro | Recommended; points to the latest model | ❌ | Gemini Generating Content |
| gemini-3-pro-preview | Official preview | ❌ | Gemini Generating Content |

More Models

Claude Opus 4.7

Input:$4/M
Output:$20/M
Claude Opus 4.7 is a hybrid reasoning model designed specifically for frontier-level coding, AI agents, and complex multi-step professional work. Unlike lighter models (e.g., Sonnet or Haiku variants), Opus 4.7 prioritizes depth, consistency, and autonomy on the hardest tasks.
Claude Sonnet 4.6

Input:$2.4/M
Output:$12/M
Claude Sonnet 4.6 is our most capable Sonnet model yet. It’s a full upgrade of the model’s skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 also features a 1M token context window in beta.
Grok 4.3

Input:$1/M
Output:$2/M
Excels at agentic reasoning, knowledge work, and tool use.
GPT 5.5 Pro

Input:$24/M
Output:$144/M
An advanced model engineered for extremely complex logic and professional demands, representing the highest standard of deep reasoning and precise analytical capabilities.
GPT 5.5

Input:$4/M
Output:$24/M
A next-generation multimodal flagship model balancing exceptional performance with efficient response, dedicated to providing comprehensive and stable general-purpose AI services.
GPT Image 2 ALL

Per Request:$0.04
GPT Image 2 is OpenAI’s state-of-the-art image generation model for fast, high-quality image generation and editing. It supports flexible image sizes and high-fidelity image inputs.

Related Blog

How to Get Gemini 3.1 Deep Think 
Mar 13, 2026

Gemini 3.1 Deep Think is an advanced reasoning mode developed by Google and Google DeepMind that enables AI systems to perform multi-step reasoning, scientific analysis, and complex problem solving. It is currently available primarily through Google AI Ultra subscriptions, the Gemini app, and developer tools such as Gemini API and AI Studio.
Google Shopping Guide: How Google AI Helps Consumers Shop Better
Feb 22, 2026

Google’s push to fold generative AI into the shopping experience has shifted what “search + buy” looks like. Over the last 18 months the company has layered Gemini-powered AI, a re-built Shopping experience, agentic checkout standards and merchant tools that let AI act — within limits — on shoppers’ behalf. For consumers that means conversational discovery, personalized briefs, virtual try-ons, price-tracking and, increasingly, the option for AI to complete purchases; for retailers it means new integrations, richer product data demands, and attention to privacy and payment protocols.
Gemini 3.1 Pro is Now Live on CometAPI: What it is and how to access
Feb 19, 2026

The Gemini 3.1 Pro is now available on CometAPI, and you can start using it through CometAPI's services—at a more affordable launch price than the official price. CometAPI already exposes the Gemini 3 family and provides an OpenAI-compatible path to call those models from a single unified gateway; that makes it quick to experiment with Gemini models using existing OpenAI SDKs
Qwen-3.5 on Lunar New Year — does it beat the closed-source top tier in 2026
Feb 16, 2026

Alibaba’s new Qwen3.5 is a major step forward — it closes the gap with, and in some agentic / multimodal workloads claims parity or advantage over, certain frontier closed-source models on a number of public benchmarks and internal tests. However, “outperform” depends on the workload: on agentic tool-use, multimodal document/video understanding, and cost-per-inference Qwen3.5 is reported to be extremely competitive (and in some vendor charts ahead). The practical takeaway: Qwen3.5 appears to be a genuine frontier contender in early 2026 — for many enterprise agentic and multimodal use cases it is now viable as a primary option.
Cursor vs Claude Code vs Codex: Which Is Better for Vibe Coding in 2026?
Feb 2, 2026

In the rapidly evolving world of Vibe Code development, developers are debating which tools enable the most productive, intuitive, and reliable workflows. Today’s comparison puts three leading agents — Cursor, Claude Code, and OpenAI Codex — head-to-head, focusing on the emergent paradigm of “vibe coding,” pricing, features, operations, usage, and real-world performance.