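How to Debug Failed AI API Generations

AI API failures are not like ordinary API failures. A 200 response does not mean your generation succeeded. A null content field is not necessarily an error. And the same prompt that worked yesterday can fail today because the provider updated its content policy.

This guide explains how to read AI API errors, what each failure mode means, and how to build error handling that tells you what broke, not just that something broke.

Note: Model names in this article such as gpt-5.4 and gpt-5.4-mini are CometAPI platform identifiers. They work only through https://api.cometapi.com/v1 and cannot be used directly with the OpenAI or Anthropic APIs. See the full model list.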

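Why debugging AI APIs is harder than debugging ordinary APIs

With a typical REST API, 200 means success and 4xx means you did something wrong. AI APIs add another case: the soft failure, where the response is 200 but there is no usable content.

Three things can go wrong:

1. Hard failure: an HTTP error (4xx, 5xx). The request did not complete.
2. Soft failure: HTTP 200, but finish_reason is content_filter or length, or content is null.
3. Silent failure: HTTP 200 and the content looks normal, but the output is wrong at the application level.

Most error handling covers only type 1. Types 2 and 3 are where most production errors come from.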
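As a rough illustration of the three categories, the sketch below sorts one chat completion attempt into "hard", "soft", "silent", or "ok". The function name `classify_outcome` and the `passed_validation` flag are hypothetical, and the body is assumed to be the already-parsed JSON of a chat completion response.

```python
from typing import Optional


def classify_outcome(status_code: int, body: Optional[dict],
                     passed_validation: bool = True) -> str:
    """Return 'hard', 'soft', 'silent', or 'ok' for one chat completion attempt."""
    # Hard failure: the HTTP layer already says the request did not complete.
    if status_code >= 400:
        return "hard"

    choices = (body or {}).get("choices") or []
    if not choices:
        return "soft"  # 200, but nothing usable came back

    choice = choices[0]
    finish_reason = choice.get("finish_reason")
    message = choice.get("message") or {}

    # Soft failure: HTTP 200, but the generation was blocked or truncated.
    if finish_reason in ("content_filter", "length"):
        return "soft"
    # Null content is expected for tool calls, so only flag it otherwise.
    if message.get("content") is None and finish_reason != "tool_calls":
        return "soft"

    # Silent failure: the API is happy, your own validation is not.
    if not passed_validation:
        return "silent"
    return "ok"
```

Counting these categories separately in your logs or metrics is one way to see how much of your error volume the HTTP status code alone would miss.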

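Understand the error response format

Text completion endpoints return a consistent error structure:

    {
      "error": {
        "message": "human-readable description (often includes request id)",
        "type": "comet_api_error",
        "param": "the_problematic_parameter_or_null",
        "code": "error_code_or_null"
      }
    }

Image and video endpoints use different error formats, so always parse the raw response body rather than assuming a fixed structure across endpoints.

The message field usually tells you plainly what went wrong, and the param field points to the parameter that caused it. Always log both.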
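One way to follow that advice is to extract the fields defensively and always keep the raw text. The sketch below is a minimal example; the helper name `extract_error_fields` is hypothetical, and the handling of a plain-string `error` value is an assumption about what non-text endpoints might return, not documented behavior.

```python
import json


def extract_error_fields(raw_body: str) -> dict:
    """Pull message/param/code out of an error body without assuming its shape."""
    fields = {"message": None, "param": None, "code": None, "raw": raw_body}
    try:
        parsed = json.loads(raw_body)
    except (json.JSONDecodeError, TypeError):
        return fields  # not JSON (e.g. an HTML gateway page): keep only the raw text

    error = parsed.get("error") if isinstance(parsed, dict) else None
    if isinstance(error, dict):
        # The structure shown for text completion endpoints
        for key in ("message", "param", "code"):
            fields[key] = error.get(key)
    elif isinstance(error, str):
        # Assumption: some endpoints may return the error as a bare string
        fields["message"] = error
    return fields
```

Logging the returned dictionary as a whole keeps `message` and `param` together when they exist, and still leaves you the raw body when they do not.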

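Know what each HTTP status code means

Status	Meaning	Common causes	Fix
400	Bad request	Missing model, or a parameter the model does not support	Check error.param in the response
401	Unauthorized	Wrong or missing API key	Confirm the Authorization: Bearer format
429	Rate limited	Too many requests	Exponential backoff (see the rate-limit section below)
500	Server error	Provider-side problem, or a malformed request body	Retry with backoff; check the request format
504	Gateway timeout	Provider took too long	Retry; consider a faster model

Source: CometAPI chat completions docs

For retry logic, the difference between 400 and 500 matters. A 400 means your request is wrong, so retrying the same request will not help. A 500 or 504 means something went wrong on the server side, so a retry is reasonable.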
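The retry decision described by the table can be reduced to a small predicate. This is a sketch, and the name `should_retry` is hypothetical:

```python
def should_retry(status_code: int) -> bool:
    """Retry rate limits and server-side errors; never other 4xx responses."""
    if status_code == 429:
        return True  # rate limited: transient, retry after a delay
    return 500 <= status_code < 600  # server-side errors such as 500 and 504
```

Keeping the rule in one function means the Python and Node.js code paths cannot drift apart on which statuses they retry.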

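Check `finish_reason`: the most commonly overlooked field

A 200 response with finish_reason: "content_filter" means your generation was blocked. The content field will be null or empty. If you do not check for this, your application will quietly return empty results.

finish_reason	Meaning	How to handle
stop	Completed normally	Nothing to handle; this is success
length	Hit the token limit	Raise max_tokens or shorten the prompt
content_filter	Blocked by safety policy	Rewrite the prompt; avoid specific names/topics
tool_calls	Model called a tool instead of returning text	Handle the tool call; `content` will be null

Source: CometAPI chat completions docs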

    import os
    import logging

    from dotenv import load_dotenv
    from openai import OpenAI, APIStatusError, APIConnectionError, APITimeoutError

    load_dotenv()

    api_key = os.environ.get("COMETAPI_KEY")
    if not api_key:
        raise ValueError("COMETAPI_KEY is not set")

    client = OpenAI(
        base_url="https://api.cometapi.com/v1",
        api_key=api_key,
    )


    def safe_complete(messages: list, model: str = "gpt-5.4-mini", **kwargs) -> dict:
        """
        Complete a chat request with full error and finish_reason handling.
        Returns {"content": str, "finish_reason": str, "tool_calls": list | None}
        Raises on API errors.
        """
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                **kwargs
            )
        except APIStatusError as e:
            # Log message and param from the error body when it is parseable
            error_body = {}
            try:
                error_body = e.response.json().get("error", {})
            except Exception:
                pass
            logging.error(
                f"API error status={e.status_code} "
                f"message={error_body.get('message')} "
                f"param={error_body.get('param')}"
            )
            raise
        except (APIConnectionError, APITimeoutError) as e:
            logging.error(f"Network/timeout error: {e}")
            raise

        choice = response.choices[0]
        finish_reason = choice.finish_reason

        if finish_reason == "content_filter":
            raise ValueError(
                f"Generation blocked by content filter. "
                f"Model: {model}. Rephrase the prompt."
            )

        if finish_reason == "length":
            used = response.usage.completion_tokens if response.usage else "unknown"
            logging.warning(f"Output truncated at token limit. Used {used} tokens.")

        # Return structured result so callers can handle tool_calls explicitly
        return {
            "content": choice.message.content or "",
            "finish_reason": finish_reason,
            "tool_calls": choice.message.tool_calls,
        }


    # Usage
    result = safe_complete(
        messages=[{"role": "user", "content": "Summarize this article: [text]"}],
        model="gpt-5.4-mini"
    )
    if result["finish_reason"] == "tool_calls":
        # Handle tool call: content will be empty
        print("Model wants to call a tool:", result["tool_calls"])
    else:
        print(result["content"])

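Detect silent failures at the application layer

Silent failures are the hardest to catch. The API returns 200 and finish_reason is stop, but the output is semantically wrong. You can only detect this at the application layer.

Common patterns: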

    import json
    import logging


    def validate_completion(result: dict, task: str) -> str:
        """
        Application-layer validation for silent failures.
        Raises ValueError if the output doesn't meet basic expectations.
        """
        content = result["content"].strip()

        # Empty output that isn't a tool call
        if not content and result["finish_reason"] != "tool_calls":
            raise ValueError(
                f"Empty output for task '{task}' "
                f"with finish_reason='{result['finish_reason']}'"
            )

        # Task-specific checks
        if task == "classify":
            valid_labels = {"positive", "negative", "neutral"}
            if content.lower() not in valid_labels:
                logging.warning(
                    f"Unexpected classification output: '{content}'. "
                    f"Expected one of {valid_labels}. "
                    f"Model may have returned explanation instead of label."
                )

        if task == "json_extract":
            try:
                json.loads(content)
            except json.JSONDecodeError:
                raise ValueError(
                    f"Expected JSON output but got: '{content[:100]}...'. "
                    f"Try adding 'respond with valid JSON only' to the prompt, "
                    f"or use response_format={{\"type\": \"json_object\"}}."
                )

        if task == "summarize" and len(content.split()) < 10:
            logging.warning(
                f"Suspiciously short summary ({len(content.split())} words). "
                f"Check if the input was too short or the model misunderstood the task."
            )

        return content


    # Full flow with silent failure detection
    result = safe_complete(
        messages=[{"role": "user", "content": "Classify as positive/negative/neutral: 'Great product!'"}],
        model="claude-haiku-4-5"
    )
    label = validate_completion(result, task="classify")

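Silent failures usually come from one of three sources: an ambiguous prompt, a model that ignored your format instructions, or input that is too short or too long for the task. Logging the full output when validation fails is the fastest way to diagnose them.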
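To make the "log the full output" advice concrete, here is a minimal sketch of a wrapper that runs a validator and, on failure, records the complete model output rather than a truncated preview. The names `run_validator` and `require_label` and the logger name are hypothetical.

```python
import logging
from typing import Callable

logger = logging.getLogger("completion_validation")


def run_validator(content: str, validator: Callable[[str], None], task: str) -> bool:
    """Run one validator; on failure, log the complete output and return False."""
    try:
        validator(content)
    except ValueError as exc:
        # %r keeps whitespace and newlines visible in the log line
        logger.error("Validation failed for task=%s: %s | full output=%r",
                     task, exc, content)
        return False
    return True


def require_label(content: str) -> None:
    """Hypothetical validator: accept only a bare sentiment label."""
    if content.strip().lower() not in {"positive", "negative", "neutral"}:
        raise ValueError("output is not one of the expected labels")
```

For example, `run_validator("The sentiment is positive.", require_label, "classify")` returns False and leaves the whole sentence in the log, which is usually enough to tell whether the model ignored the format instruction or the prompt was ambiguous.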

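Add exponential backoff for rate limits

Rate limit errors (429) are transient. The right approach is to wait with progressively longer delays and then retry, which is standard practice for any rate-limited API: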

    import time
    import random
    import logging

    from openai import APIStatusError, APIConnectionError, APITimeoutError, RateLimitError


    def complete_with_retry(
        messages: list,
        model: str = "gpt-5.4-mini",
        max_retries: int = 3,
        **kwargs
    ) -> dict:
        """Retry on rate limits and server errors with exponential backoff."""
        last_error = None

        for attempt in range(max_retries):
            try:
                return safe_complete(messages, model=model, **kwargs)
            except RateLimitError as e:
                # Must come before APIStatusError: RateLimitError (429) is a
                # subclass of it and would otherwise be re-raised as a 4xx
                last_error = e
            except APIStatusError as e:
                if e.status_code < 500:
                    raise  # other 4xx: don't retry, request is wrong
                last_error = e
            except (APIConnectionError, APITimeoutError) as e:
                last_error = e

            if attempt < max_retries - 1:
                wait = (2 ** attempt) + random.random()  # jitter prevents thundering herd
                logging.warning(f"Attempt {attempt + 1} failed. Waiting {wait:.1f}s before retry.")
                time.sleep(wait)

        raise RuntimeError(f"All {max_retries} attempts failed") from last_error

Do not retry 400 or 401: those are client errors and will not fix themselves. The one exception is a 401 while you are rotating API keys.
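To see what delays this strategy actually produces, the sketch below computes the wait schedule on its own, without making any requests. The function name `backoff_delays` and the `cap` parameter are assumptions added for illustration; the retry code in this article does not cap its delays.

```python
import random


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0, rng=None) -> list:
    """Seconds to wait between attempts: base * 2**attempt plus jitter, capped."""
    rng = rng or random.Random()
    delays = []
    # One fewer delay than attempts: there is nothing to wait for after the last try
    for attempt in range(max_retries - 1):
        delay = base * (2 ** attempt) + rng.random()  # jitter in [0, 1)
        delays.append(min(delay, cap))
    return delays
```

With `max_retries=4` the three delays fall in the ranges 1 to 2 seconds, 2 to 3 seconds, and 4 to 5 seconds. A cap becomes useful once `max_retries` grows, because uncapped doubling reaches minutes within a handful of attempts.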

Debug image generation failures

Beyond standard HTTP errors, image generation has failure modes of its own:

    import os
    import base64
    import logging

    import requests


    def generate_image_safe(prompt: str, model: str = "dall-e-3") -> dict:
        """
        Generate an image with full error handling.
        Returns {"url": str | None, "bytes": bytes | None, "blocked": bool}
        """
        api_key = os.environ.get("COMETAPI_KEY")
        if not api_key:
            raise ValueError("COMETAPI_KEY is not set")

        # Models that return base64 data and take output_format instead of response_format
        BASE64_MODELS = {"gpt-image-2", "qwen-image"}

        headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        payload = {"model": model, "prompt": prompt, "size": "1024x1024"}
        if model in BASE64_MODELS:
            payload["output_format"] = "png"
        else:
            payload["response_format"] = "url"

        try:
            response = requests.post(
                "https://api.cometapi.com/v1/images/generations",
                json=payload,
                headers=headers,
                timeout=60
            )
            response.raise_for_status()
        except requests.exceptions.HTTPError as e:
            logging.error(f"Image generation HTTP error: {e.response.status_code} {e.response.text}")
            raise
        except requests.exceptions.Timeout:
            logging.error("Image generation timed out after 60s")
            raise

        data = response.json().get("data", [])
        if not data:
            logging.warning("Image generation returned empty data; prompt may have been filtered.")
            return {"url": None, "bytes": None, "blocked": True}

        item = data[0]
        if "revised_prompt" in item:
            logging.info(f"Provider revised prompt to: {item['revised_prompt']}")

        if "url" in item:
            return {"url": item["url"], "bytes": None, "blocked": False}

        return {
            "url": None,
            "bytes": base64.b64decode(item["b64_json"]),
            "blocked": False
        }

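Image-specific gotchas:

Symptom	Cause	Fix
`data` array is empty	Prompt was filtered	Check `revised_prompt`; rewrite the prompt
response_format error with GPT Image 2	Unsupported parameter	Use `output_format` instead
Error when setting n > 1 with Qwen Image	Model limitation	Send the requests one at a time in a loop
403 when accessing the URL later	URL has expired	Download right after generation

Source: CometAPI image generation docs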
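For the n > 1 limitation, the loop can be as simple as the sketch below. The helper name `generate_many` is hypothetical. It takes the single-image function as a parameter so that it stays independent of any particular client, and it assumes the result dictionary carries a `blocked` flag like the one returned by `generate_image_safe`.

```python
from typing import Callable, List


def generate_many(generate_one: Callable[[str], dict], prompt: str, count: int) -> List[dict]:
    """Request `count` images one at a time for models that reject n > 1."""
    results = []
    for _ in range(count):
        result = generate_one(prompt)
        if result.get("blocked"):
            # Assumption: a filtered prompt is likely to be filtered again,
            # so stop instead of spending more requests on it
            break
        results.append(result)
    return results
```

A call might look like `generate_many(lambda p: generate_image_safe(p, model="qwen-image"), "a red bicycle", 3)`. Because returned URLs can expire, download or persist each result as it arrives rather than after the whole batch finishes.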

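Debug video generation failures

Video generation is asynchronous, so its failure modes are different. Initialize the status variables before the loop so the timeout error message is always well-formed: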

    import os
    import time
    import logging

    import requests


    def submit_and_poll_video(
        prompt: str,
        model: str = "veo3-fast",
        max_wait: int = 600
    ) -> str:
        """Submit video task and poll to completion. Returns video URL."""
        api_key = os.environ.get("COMETAPI_KEY")
        if not api_key:
            raise ValueError("COMETAPI_KEY is not set")

        headers = {"Authorization": f"Bearer {api_key}"}

        try:
            # Fields are sent as multipart form data
            response = requests.post(
                "https://api.cometapi.com/v1/videos",
                headers=headers,
                files={
                    "prompt": (None, prompt),
                    "model": (None, model),
                    "size": (None, "16x9")
                },
                timeout=30
            )
            response.raise_for_status()
        except requests.exceptions.HTTPError as e:
            logging.error(f"Video submit failed: {e.response.status_code} {e.response.text}")
            raise

        task_id = response.json()["id"]
        logging.info(f"Video task submitted: {task_id}")

        poll_url = f"https://api.cometapi.com/v1/videos/{task_id}"
        elapsed = 0
        interval = 10
        status = "unknown"   # initialize before loop
        progress = 0         # initialize before loop

        while elapsed < max_wait:
            try:
                poll_response = requests.get(poll_url, headers=headers, timeout=30)
                poll_response.raise_for_status()
            except requests.exceptions.HTTPError as e:
                logging.error(f"Poll request failed: {e.response.status_code}")
                raise

            result = poll_response.json()
            status = result.get("status", "unknown")
            progress = result.get("progress", 0)
            logging.info(f"Task {task_id}: status={status} progress={progress}%")

            # Accept "succeed" as well: Kling uses that spelling
            if status in ("succeeded", "succeed"):
                return result["output"][0]
            elif status in ("failed", "cancelled"):
                error_detail = result.get("error", "no error detail returned")
                raise RuntimeError(f"Video task {task_id} failed: {error_detail}")

            time.sleep(interval)
            elapsed += interval

        raise TimeoutError(
            f"Video task {task_id} did not complete within {max_wait}s. "
            f"Last status: {status}, progress: {progress}%"
        )

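Video-specific issues:

Symptom	Cause	Fix
Task stuck in queued for more than 10 minutes	Server load	Retry with a different model
Failed with no error detail	Prompt was filtered, or a model error	Rewrite the prompt
Video URL later returns 403	URL has expired	Download immediately
Runway returns task_not_exist on the first poll	Task is still initializing (behavior described in the CometAPI docs)	Wait 5 seconds and retry
Kling returns "succeed" instead of "succeeded"	Kling's API uses a non-standard status string	Handle both in the polling logic

Source: CometAPI video generation docs, Kling Video docs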
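One way to keep provider quirks out of the polling loop is to normalize status strings in a single place. The sketch below is an illustration under stated assumptions: the function name `normalize_status` is hypothetical, and treating every unrecognized value as still pending is a design choice that relies on your max_wait timeout as the backstop.

```python
SUCCESS_STATUSES = {"succeeded", "succeed"}   # "succeed" is the Kling spelling from the table
FAILURE_STATUSES = {"failed", "cancelled"}


def normalize_status(raw_status) -> str:
    """Map provider status strings onto 'succeeded', 'failed', or 'pending'."""
    status = str(raw_status or "").strip().lower()
    if status in SUCCESS_STATUSES:
        return "succeeded"
    if status in FAILURE_STATUSES:
        return "failed"
    # Anything else (queued, in progress, unknown, missing) keeps the loop polling
    return "pending"
```

The polling loop then branches on three values only, and supporting another provider's spelling means adding one string to a set.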

Node.js version

    import OpenAI from 'openai';

    const apiKey = process.env.COMETAPI_KEY;
    if (!apiKey) throw new Error('COMETAPI_KEY is not set');

    const client = new OpenAI({
      baseURL: 'https://api.cometapi.com/v1',
      apiKey,
    });

    // 4xx errors other than 429 are client errors and should not be retried
    const isClientError = (err) =>
      Boolean(err.status) && err.status < 500 && err.status !== 429;

    async function safeComplete(messages, model = 'gpt-5.4-mini', options = {}) {
      let response;
      try {
        response = await client.chat.completions.create({ model, messages, ...options });
      } catch (err) {
        if (isClientError(err)) {
          console.error(`Client error ${err.status}: ${err.message}`);
        } else {
          console.error(`Rate limit/server/network error: ${err.message}`);
        }
        throw err;
      }

      const choice = response.choices[0];
      const finishReason = choice.finish_reason;

      if (finishReason === 'content_filter') {
        throw new Error(`Generation blocked by content filter. Model: ${model}`);
      }
      if (finishReason === 'length') {
        console.warn(`Output truncated. Used ${response.usage?.completion_tokens ?? 'unknown'} tokens.`);
      }

      return {
        content: choice.message.content ?? '',
        finishReason,
        toolCalls: choice.message.tool_calls ?? null,
      };
    }

    async function completeWithRetry(messages, model = 'gpt-5.4-mini', maxRetries = 3, options = {}) {
      let lastError;
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
          return await safeComplete(messages, model, options);
        } catch (err) {
          // Don't retry 4xx client errors (429 rate limits are retried)
          if (isClientError(err)) throw err;
          lastError = err;
          if (attempt < maxRetries - 1) {
            const wait = (2 ** attempt + Math.random()) * 1000;
            console.warn(`Attempt ${attempt + 1} failed. Retrying in ${(wait / 1000).toFixed(1)}s`);
            await new Promise(r => setTimeout(r, wait));
          }
        }
      }
      throw new Error(`All ${maxRetries} attempts failed: ${lastError?.message}`);
    }

    // Usage
    const result = await safeComplete(
      [{ role: 'user', content: 'Classify as positive/negative/neutral: "Great product!"' }],
      'claude-haiku-4-5'
    );
    if (result.finishReason === 'tool_calls') {
      console.log('Tool call requested:', result.toolCalls);
    } else {
      console.log(result.content);
    }

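Debugging checklist

When a generation fails and you don't know where to start:

Text generation:

Is the API key set and sent as Authorization: Bearer <key>?
Is finish_reason something other than stop?
Is content null? Check whether finish_reason is tool_calls.
Is the output truncated? Check finish_reason: "length" and usage.completion_tokens.
Is the error a 4xx (fix the request), or a 429 or 5xx (retry)?
Does the output pass your application-layer validation? (Silent failure.)

Image generation:

Is the data array empty? (Content filtering.)
Are you using response_format with GPT Image 2? (Not supported; use output_format instead.)
Did you set n > 1 with Qwen Image? (Not supported.)
Did you download the image before the URL expired?

Video generation:

Is the task stuck in queued? (Try a different model.)
Did you check the error field in the failed task's response?
Did you download the video before the URL expired?
Do you handle both "succeed" (Kling) and "succeeded" (Veo, Runway)?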

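FAQ

Q: My request returned 200 but there is no content. What happened?

Check finish_reason. If it is content_filter, the generation was blocked: the request succeeded but the output was suppressed. If it is tool_calls, the model called a tool instead of returning text, and a null content is expected. If finish_reason is stop but the content is still empty, that is a silent failure: log the full response and review your prompt.

Q: How do I know my prompt was filtered?

Text: check finish_reason === "content_filter". Image: check whether the data array is empty. Video: the task moves to failed shortly after submission with no error detail. In each case, try rewriting the prompt in more neutral terms.

Q: When should I retry a failed request?

Retry 429 and 5xx with exponential backoff. Do not retry other 4xx errors, because a wrong request will not fix itself. The exception is a 401 while you are rotating API keys.

Q: What is exponential backoff, and why does it matter?

Instead of retrying immediately, wait progressively longer between attempts: 1s, 2s, 4s. Adding random jitter (+ random.random()) keeps multiple clients from retrying in lockstep. This is standard practice for any rate-limited API and is not specific to CometAPI.

Q: My video task has been stuck in `queued` for 10 minutes. Has it failed?

Not necessarily. Queues back up under high load. Wait until your max_wait threshold, then raise a TimeoutError and retry with a different model. Log the task ID so you can check its status manually if needed.

Q: How do I catch silent failures?

Silent failures require application-layer validation, because the API will not tell you that the output is semantically wrong. Check that the output matches the expected format (valid JSON, an expected label, a minimum length), and log the full output when validation fails. The most common causes are an ambiguous prompt, a model that ignored format instructions, or an input length that does not suit the task.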
