什麼是 Gemini 3 Flash

「Gemini 3 Flash」是 Gemini-3 家族中的 Flash/fast 成員：Google 的 Gemini-3 模型的輕量、低延遲、具成本效率的變體，面向高吞吐量、即時與對規模敏感的應用。它是 Gemini API 模型家族的一個變體，允許開發者透過 CometAPI 的 API 呼叫低延遲、成本最佳化的 Gemini 3 風格模型（與其他 Gemini 模型具有相同的 API 介面）。它提供相同的多模態輸入與結構化輸出工具，但更優先考量推論速度與吞吐量。

主要功能：

低延遲 / 高吞吐量：為快速回應與成本效率而調校（Flash 設計取向）。
多模態輸入支援：在多個 Flash 變體中支援文字、影像、影片片段與音訊（API 模型條目會列出各變體支援的輸入類型）。
函式呼叫與結構化輸出：可強制 JSON/結構化輸出，便於與工具與代理整合。
代理/工具支援：可與 Google Search grounding、函式/工具呼叫，以及 Gemini 生態系中的代理框架整合。

Gemini 3 Flash 與其他模型的比較

相較於 Gemini-3 Pro（同一系列）：Flash = 速度/成本最佳化；Pro = 更強的推理、多模態保真度與 Deep Think。即時介面選擇 Flash；對準確度敏感的任務選擇 Pro。
相較於先前的 Gemini（2.5 Flash）：Gemini-3 家族在推理與多模態表現上有所提升；Flash 的設計取向持續聚焦於價格/效能。若你目前使用 2.5 Flash，Gemini-3 Fast/Flash 旨在以相近的延遲/成本提供更佳品質。

實際使用情境（Flash 的優勢場景）

即時聊天機器人與語音代理：為對話式介面與串流音訊應用提供低延遲。
客戶支援與大規模摘要：以高成本效率對長篇逐字稿進行規模化摘要。
回應時間至關重要的邊緣或嵌入式推論：對嚴格 SLA 使用 flash/lite 風格的變體。
大量文件解析 / 匯入管線：使用 Flash 進行索引與前處理；高價值的擷取/分析再升級至 Pro。
即時程式碼助理 / IDE 外掛：以較低費用提供快速程式碼補全（複雜重構可用 Pro 驗證）。

如何存取 Gemini 3 Flash API

步驟 1：註冊取得 API 金鑰

登入 cometapi.com。若您尚未成為使用者，請先註冊。登入您的 CometAPI console。取得介面的存取憑證 API 金鑰。在個人中心的 API token 處點選「Add Token」，取得 token 金鑰：sk-xxxxx 並提交。

步驟 2：向 Gemini 3 Flash API 發送請求

選擇「gemini-3-flash」端點發送 API 請求並設定請求本文。請求方法與請求本文可參考我們網站的 API 文件。我們也提供 Apifox 測試以便您使用。將 <YOUR_API_KEY> 替換為您帳戶中的實際 CometAPI 金鑰。基底 URL 為 Gemini Generating Content 與 Chat。

將您的問題或請求填入 content 欄位——模型會回應此內容。處理 API 回應以取得產生的答案。

步驟 3：擷取並驗證結果

處理 API 回應以取得產生的答案。處理完成後，API 會回傳任務狀態與輸出資料。

另請參閱 Gemini 3 Pro Preview API

Gemini 3 Flash is Google's most balanced model, offering frontier-level reasoning capabilities at $0.50/$3 per million tokens—approximately 4x cheaper than Gemini 3 Pro while maintaining comparable intelligence for most tasks.

Gemini 3 Flash supports four thinking levels: minimal (near-zero latency), low, medium, and high—giving developers granular control over the reasoning depth vs. speed tradeoff that Gemini 3 Pro doesn't offer.

Yes, Gemini 3 Flash (gemini-3-flash-preview) has a free tier in the Gemini API, unlike Gemini 3 Pro which currently requires paid usage for API access.

Thought Signatures are encrypted representations of the model's internal reasoning that must be circulated back in multi-turn conversations—required even at minimal thinking level for Gemini 3 Flash to maintain reasoning context and enable function calling.

Yes, Gemini 3 Flash uniquely supports combining structured outputs (JSON schema) with built-in tools like Google Search, URL Context, and Code Execution in the same request—enabling grounded, type-safe responses.

The media_resolution parameter controls token usage per image/video frame: low (280 tokens), medium (560), high (1120), or ultra_high for images. For video, low and medium are both capped at 70 tokens per frame to optimize context usage.

Gemini 3 Flash supports Google Search, File Search, Code Execution, URL Context, and standard function calling. However, Google Maps grounding and Computer Use are not yet supported in Gemini 3 models.

Correction: gemini-3-flash variants (same price across variants)

Model family	Variant (model name)	Input price (USD / 1M tokens)	Output price (USD / 1M tokens)
gemini-3-flash	gemini-3-flash	$0.40	$2.40
gemini-3-flash	gemini-3-flash-preview	$0.40	$2.40
gemini-3-flash	gemini-3-flash-all	$0.40	$2.40
gemini-3-flash	gemini-3-flash-thinking	$0.40	$2.40
gemini-3-flash	gemini-3-flash-preview-thinking	$0.40	$2.40

模型 ID	描述	可用性	請求
gemini-3-flash-all	所使用的技術為非官方，生成不穩定，但支援 Direct Internet 等，Chat 格式	✅	Chat 格式
gemini-3-flash	自動指向最新模型	✅	Gemini Generating Content
gemini-3-flash-preview	官方預覽	✅	Gemini Generating Content