Can DeepSeek-V4-Pro handle 1M-token documents in the API?

Yes. DeepSeek-V4-Pro offers a 1M-token context length and up to 384K output tokens, so it is built for very long documents and multi-file workflows.
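
A quick way to sanity-check a request against these limits is sketched below. It is only an illustration: the two constants come from this page, while the helper name `fits_in_context` is hypothetical, and it assumes that the prompt and the requested output share the same 1M-token window.

```python
# Limits quoted on this page for deepseek-v4-pro.
CONTEXT_LIMIT = 1_000_000  # context length, in tokens
MAX_OUTPUT = 384_000       # maximum output tokens


def fits_in_context(prompt_tokens: int, max_tokens: int) -> bool:
    """Return True if the prompt plus the requested output fits the limits.

    Assumption: prompt and output tokens count against the same window.
    """
    if prompt_tokens < 0 or max_tokens <= 0:
        raise ValueError("prompt_tokens must be >= 0 and max_tokens must be > 0")
    return max_tokens <= MAX_OUTPUT and prompt_tokens + max_tokens <= CONTEXT_LIMIT


print(fits_in_context(900_000, 50_000))   # True
print(fits_in_context(900_000, 200_000))  # False: 1.1M tokens exceed the window
print(fits_in_context(100_000, 400_000))  # False: output is above 384K
```

The token counts themselves have to come from a tokenizer or from the usage data of an earlier response; this sketch does not estimate them.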

Does DeepSeek-V4-Pro support thinking mode and tool calls?

Yes. DeepSeek-V4-Pro supports both thinking and non-thinking modes, plus JSON output and tool calls.
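
As a rough illustration of what such requests can look like, the sketch below builds two request bodies as plain dictionaries. The `thinking` field mirrors the sample code on this page; the `tools` and `response_format` fields follow the usual OpenAI-style Chat Completions conventions and are assumed, not confirmed, to be accepted unchanged. The tool name `search_docs` is hypothetical.

```python
import json


def build_tool_request(question: str) -> dict:
    """Chat Completions payload that offers the model one callable tool."""
    return {
        "model": "deepseek-v4-pro",
        "messages": [{"role": "user", "content": question}],
        "thinking": {"type": "enabled"},  # same field as the samples on this page
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "search_docs",  # hypothetical tool name
                    "description": "Search internal documentation.",
                    "parameters": {
                        "type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"],
                    },
                },
            }
        ],
    }


def build_json_request(question: str) -> dict:
    """Payload asking for a JSON object (assumed OpenAI-style response_format)."""
    return {
        "model": "deepseek-v4-pro",
        "messages": [
            {"role": "system", "content": "Reply with a JSON object."},
            {"role": "user", "content": question},
        ],
        "response_format": {"type": "json_object"},
    }


print(json.dumps(build_tool_request("Find the retry policy."), indent=2))
```

Either dictionary can be sent as the JSON body of a chat completions request, or unpacked into an SDK call after checking the field names against the current API documentation.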

When should I use DeepSeek-V4-Pro instead of DeepSeek-V4-Flash?

Use DeepSeek-V4-Pro when accuracy and agentic coding matter more than speed. DeepSeek says V4-Flash is the faster, more economical option, while V4-Pro is stronger on coding and broader agent evaluations.

Is DeepSeek-V4-Pro good for coding agents like Claude Code or OpenCode?

Yes. DeepSeek-V4-Pro can be configured for Claude Code and OpenCode, with `reasoningEffort` set to `max` and thinking enabled.

How do I integrate DeepSeek-V4-Pro with OpenAI-compatible SDKs?

Use the CometAPI base URL `https://api.cometapi.com/v1` (the value used in the sample code on this page) with the model name `deepseek-v4-pro`.

Is DeepSeek-V4-Pro suitable for search-heavy research workflows?

Yes. V4-Pro performs strongly on search and retrieval-style tasks, and it outperforms DeepSeek-V3.2 by a substantial margin in both objective and subjective Q&A categories.

Affordable DeepSeek V4 Pro API | text-to-text

Technical Specifications

Item	DeepSeek-V4-Pro
Provider	DeepSeek
API model name	deepseek-v4-pro
Base URL	https://api.deepseek.com and https://api.deepseek.com/anthropic
Input type	Text
Output type	Text, tool calls, reasoning output
Context length	1,000,000 tokens
Max output	384,000 tokens
Reasoning modes	Non-thinking, thinking (default)
Agent/coding defaults	reasoning_effort can be set high; complex agent requests may use max
Supported features	JSON output, tool calls, Chat Prefix Completion (beta), FIM Completion (beta, non-thinking mode only)
Local/open-weights release	1.6T total parameters, 49B active parameters, FP4 + FP8 mixed precision
License (model card)	MIT
Reference model card	DeepSeek-V4-Pro preview on Hugging Face

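What is DeepSeek-V4-Pro?

DeepSeek-V4-Pro is the more powerful model in DeepSeek's V4 preview family. The official model card describes it as a 1.6T-parameter MoE model with 49B active parameters and a 1M-token context window, aimed at long-horizon knowledge work, code generation, and agentic tasks. According to the API documentation, it is served through the standard DeepSeek Chat Completions interface and supports both OpenAI and Anthropic SDK styles.

Key Features

Million-token context: DeepSeek documents a 1M-token context length, which makes the model suitable for very large document sets, repositories, and multi-step agent sessions.
Two reasoning modes: The API supports non-thinking and thinking modes. Thinking is the default, and the documentation states that complex agent requests such as Claude Code or OpenCode may automatically have reasoning_effort set to max.
Tool calling: DeepSeek's thinking mode supports tool calls, which matters for agents that need search, file operations, or external functions.
Long-context efficiency: The model card explains that V4 uses a hybrid attention design combining Compressed Sparse Attention and Heavily Compressed Attention, which reduces long-context compute and KV cache costs relative to V3.2.
Coding and reasoning focus: DeepSeek states that the V4-Pro-Max reasoning mode improves coding benchmarks and closes much of the gap with leading closed models on reasoning and agentic tasks.
SDK flexibility: The model can be accessed through standard OpenAI-compatible Chat Completions or through DeepSeek's Anthropic-compatible endpoint for tool-oriented workflows.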
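
The two reasoning modes can be illustrated with a small payload helper. This is a sketch under stated assumptions: the `thinking` and `reasoning_effort` fields and the values "enabled", "high", and "max" appear on this page, whereas the value "disabled" for non-thinking mode and the helper name `chat_payload` are assumptions to verify against the API documentation.

```python
def chat_payload(prompt: str, thinking: bool = True, effort: str = "high") -> dict:
    """Build a request body for thinking or non-thinking mode."""
    payload = {
        "model": "deepseek-v4-pro",
        "messages": [{"role": "user", "content": prompt}],
        # "disabled" is an assumed value for non-thinking mode.
        "thinking": {"type": "enabled" if thinking else "disabled"},
    }
    if thinking:
        # Only the effort levels mentioned on this page are accepted here.
        if effort not in ("high", "max"):
            raise ValueError("effort must be 'high' or 'max'")
        payload["reasoning_effort"] = effort
    return payload


fast = chat_payload("Summarize this changelog.", thinking=False)
deep = chat_payload("Plan a multi-file refactor.", effort="max")
print(fast["thinking"], deep["reasoning_effort"])
```

Leaving `reasoning_effort` out in non-thinking mode is a design choice of this sketch, not a documented requirement.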

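Benchmark Performance

The official DeepSeek model card reports the following evaluation results for the base model family and for the V4-Pro-Max comparison set. In the base model table, V4-Pro scores higher than V3.2-Base on several knowledge and long-context benchmarks, including MMLU-Pro (73.5 vs. 65.5), FACTS Parametric (62.6 vs. 27.1), and LongBench-V2 (51.5 vs. 40.2).

Benchmark	V3.2-Base	V4-Flash-Base	V4-Pro-Base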
MMLU-Pro (EM)	65.5	68.3	73.5
FACTS Parametric (EM)	27.1	33.9	62.6
HumanEval (Pass@1)	62.8	69.5	76.8
LongBench-V2 (EM)	40.2	44.7	51.5
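
To make the size of these gaps explicit, the sketch below computes the point difference between V4-Pro-Base and V3.2-Base for each row of the table above. The scores are copied from the table; the function name `gains_over_v32` is only illustrative.

```python
# Scores copied from the base-model table: (V3.2-Base, V4-Flash-Base, V4-Pro-Base).
SCORES = {
    "MMLU-Pro (EM)": (65.5, 68.3, 73.5),
    "FACTS Parametric (EM)": (27.1, 33.9, 62.6),
    "HumanEval (Pass@1)": (62.8, 69.5, 76.8),
    "LongBench-V2 (EM)": (40.2, 44.7, 51.5),
}


def gains_over_v32(scores: dict) -> dict:
    """Point gain of V4-Pro-Base over V3.2-Base, rounded to one decimal."""
    return {name: round(pro - v32, 1) for name, (v32, _flash, pro) in scores.items()}


for name, gain in gains_over_v32(SCORES).items():
    print(f"{name}: +{gain:.1f} points")
```

On these four rows the gains are 8.0, 35.5, 14.0, and 11.3 points, with FACTS Parametric showing by far the largest jump.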

The same model card also shows that V4-Pro-Max remains competitive with frontier models on selected tasks. For example, the published comparison table lists MMLU-Pro 87.5, SimpleQA-Verified 57.9, GPQA Diamond 90.1, and Terminal Bench 2.0 67.9.

DeepSeek-V4-Pro vs DeepSeek-V4-Flash vs DeepSeek-V3.2

Model	Best for	Context	Notes
DeepSeek-V4-Pro	Hard reasoning, coding, agents, large documents	1M	Largest V4 model, 49B active parameters, strongest overall capability in the series.
DeepSeek-V4-Flash	Faster, lighter general-purpose use	1M	Smaller 284B/13B model that still supports thinking and tool calls.
DeepSeek-V3.2	Previous-generation long-context baseline	128K in earlier API documentation; V4 uses a different 1M-context design	Useful as a reference point for efficiency gains; the V4-Pro model card reports large reductions in long-context FLOPs and KV cache relative to V3.2.

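Best Use Cases

Repository-scale coding assistants and refactoring tools
Long-document analysis and generation
Tool-using agents that require multi-turn reasoning
Technical support workflows that benefit from long memory and structured output
Chinese and multilingual knowledge tasks, where the model card shows strong benchmark performance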

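How to access and use the DeepSeek V4 Pro API

Step 1: Get an API key

Log in to cometapi.com. If you are not yet a user, register first. Sign in to your CometAPI console and obtain the API key that serves as your access credential: under API token in the personal center, click "Add Token" to generate a token key of the form sk-xxxxx, then submit.

Step 2: Send a request to the DeepSeek V4 Pro API

Select the "deepseek-v4-pro" endpoint to send the API request and set the request body. The request method and request body are described in the API documentation on our website, which also provides Apifox tests for convenience. Replace <YOUR_API_KEY> with the actual CometAPI key from your account. Where to call it: Anthropic Messages format and Chat format.

Enter your question or request in the content field; this is what the model responds to.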
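
The difference between the Chat format and the Anthropic Messages format can be sketched as a small conversion. The field names follow the public OpenAI Chat Completions and Anthropic Messages conventions (a top-level `system` string and a required `max_tokens` in the latter). The helper name `chat_to_messages_format` and the default of 1024 output tokens are assumptions for illustration, and the exact endpoint paths should be taken from the API documentation.

```python
def chat_to_messages_format(chat: dict, default_max_tokens: int = 1024) -> dict:
    """Convert a Chat-format body into an Anthropic Messages-format body."""
    # Messages format carries the system prompt as a top-level string.
    system_parts = [m["content"] for m in chat["messages"] if m["role"] == "system"]
    turns = [m for m in chat["messages"] if m["role"] != "system"]
    body = {
        "model": chat["model"],
        "max_tokens": chat.get("max_tokens", default_max_tokens),
        "messages": turns,
    }
    if system_parts:
        body["system"] = "\n".join(system_parts)
    return body


chat_body = {
    "model": "deepseek-v4-pro",
    "max_tokens": 256,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Which number is greater, 9.11 or 9.8?"},
    ],
}
print(chat_to_messages_format(chat_body))
```

In both formats the user's question sits in the `content` field of a message with the role `user`.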

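Step 3: Retrieve and verify results

Process the API response to get the generated answer. Once the request has been processed, the API returns the task status and the output data. Enable streaming, prompt caching, or long-context handling through the standard parameters.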
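
When streaming is enabled, processing the response means reassembling the answer from many small chunks. The sketch below does this for server-sent event lines of the kind the Curl sample on this page parses. The `reasoning_content` and `content` delta fields are taken from that sample; the function name `collect_stream` and the sample lines are hypothetical.

```python
import json
from typing import Iterable, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, str]:
    """Join streamed deltas into (reasoning, answer) strings."""
    reasoning_parts, answer_parts = [], []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        choices = json.loads(data).get("choices") or []
        if not choices:
            continue  # some chunks may carry no choices
        delta = choices[0].get("delta") or {}
        reasoning_parts.append(delta.get("reasoning_content") or "")
        answer_parts.append(delta.get("content") or "")
    return "".join(reasoning_parts), "".join(answer_parts)


# Hypothetical stream, shaped like the chunks handled in the samples on this page.
sample = [
    'data: {"choices": [{"delta": {"reasoning_content": "Compare 9.11 and 9.80. "}}]}',
    'data: {"choices": [{"delta": {"reasoning_content": "9.80 is larger."}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "9.8 is greater "}}]}',
    'data: {"choices": [{"delta": {"content": "than 9.11."}}]}',
    "data: [DONE]",
]
reasoning, answer = collect_stream(sample)
print(answer)  # 9.8 is greater than 9.11.
```

Keeping the reasoning and the answer in separate strings makes it easy to log the former while showing only the latter to end users.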

Pricing for DeepSeek V4 Pro

Explore competitive pricing for DeepSeek V4 Pro, designed to fit a range of budgets and usage needs. Our flexible plans ensure you pay only for what you use, making it easy to scale as your requirements grow. Discover how DeepSeek V4 Pro can enhance your projects while keeping costs manageable.

Comet Price (USD / M Tokens)	Official Price (USD / M Tokens)	Discount
Input: $0.416/M Output: $0.832/M	Input: $0.52/M Output: $1.04/M	-20%
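
The per-million-token prices translate into a request cost as sketched below. The prices are copied from the table above; the function name `estimate_cost` and the example token counts are illustrative, and the sketch ignores any cache discounts or other billing rules not listed on this page.

```python
# Prices in USD per 1M tokens, copied from the pricing table.
COMET_PRICE = {"input": 0.416, "output": 0.832}
OFFICIAL_PRICE = {"input": 0.52, "output": 1.04}


def estimate_cost(input_tokens: int, output_tokens: int, price: dict = COMET_PRICE) -> float:
    """Return the estimated cost in USD for one request."""
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Example: a 200K-token prompt with an 8K-token answer.
comet = estimate_cost(200_000, 8_000)
official = estimate_cost(200_000, 8_000, OFFICIAL_PRICE)
print(f"Comet: ${comet:.4f}  Official: ${official:.4f}")
```

For this example the Comet price comes to about $0.09 per request against about $0.11 at the official rate, which matches the 20% discount in the table.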

Sample Code and API for DeepSeek V4 Pro

Access comprehensive sample code and API resources for DeepSeek V4 Pro to streamline your integration. Our detailed documentation provides step-by-step guidance to help you use the full potential of DeepSeek V4 Pro in your projects.

Python Code Example

from openai import OpenAI
import os

# Get your CometAPI key from https://www.cometapi.com/console/token, and paste it here
COMETAPI_KEY = os.environ.get("COMETAPI_KEY") or "<YOUR_COMETAPI_KEY>"
BASE_URL = "https://api.cometapi.com/v1"

client = OpenAI(base_url=BASE_URL, api_key=COMETAPI_KEY)

stream = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Which number is greater, 9.11 or 9.8? Answer with one sentence."},
    ],
    stream=True,
    max_tokens=256,
    reasoning_effort="high",
    extra_body={"thinking": {"type": "enabled"}},
)

thinking = False
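for chunk in stream:
    # Some chunks (for example a trailing usage chunk) may carry no choices.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # reasoning_content is not a standard SDK field, so read it from the extras.
    reasoning = (delta.model_extra or {}).get("reasoning_content") or ""
    content = delta.content or ""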

    if reasoning:
        if not thinking:
            print("<reasoning>")
            thinking = True
        print(reasoning, end="", flush=True)

    if content:
        if thinking:
            print("\n</reasoning>\n\n<answer>")
            thinking = False
        print(content, end="", flush=True)

print()

JavaScript Code Example

import OpenAI from "openai";

// Get your CometAPI key from https://www.cometapi.com/console/token, and paste it here
const api_key = process.env.COMETAPI_KEY || "<YOUR_COMETAPI_KEY>";
const base_url = "https://api.cometapi.com/v1";

const client = new OpenAI({
  apiKey: api_key,
  baseURL: base_url,
});

const stream = await client.chat.completions.create({
  model: "deepseek-v4-pro",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Which number is greater, 9.11 or 9.8? Answer with one sentence." },
  ],
  thinking: { type: "enabled" },
  reasoning_effort: "high",
  max_tokens: 256,
  stream: true,
});

let thinking = false;
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta ?? {};
  const reasoning = delta.reasoning_content ?? "";
  const content = delta.content ?? "";

  if (reasoning) {
    if (!thinking) {
      process.stdout.write("<reasoning>\n");
      thinking = true;
    }
    process.stdout.write(reasoning);
  }

  if (content) {
    if (thinking) {
      process.stdout.write("\n</reasoning>\n\n<answer>\n");
      thinking = false;
    }
    process.stdout.write(content);
  }
}

process.stdout.write("\n");

Curl Code Example

#!/usr/bin/env bash
# Get your CometAPI key from https://www.cometapi.com/console/token
# Export it as: export COMETAPI_KEY="your-key-here"

if ! command -v jq >/dev/null 2>&1; then
  echo "jq is required to parse streamed reasoning_content in this shell example." >&2
  exit 1
fi

thinking=false

curl --silent --no-buffer --location --request POST "https://api.cometapi.com/v1/chat/completions" \
  --header "Authorization: Bearer $COMETAPI_KEY" \
  --header "Content-Type: application/json" \
  --data-raw '{
    "model": "deepseek-v4-pro",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Which number is greater, 9.11 or 9.8? Answer with one sentence."}
    ],
    "thinking": {"type": "enabled"},
    "reasoning_effort": "high",
    "max_tokens": 256,
    "stream": true
  }' | while IFS= read -r line; do
    case "$line" in
      data:\ *) data=${line#data: } ;;
      *) continue ;;
    esac

    [ "$data" = "[DONE]" ] && break

    reasoning=$(printf '%s' "$data" | jq -r '.choices[0].delta.reasoning_content // empty')
    content=$(printf '%s' "$data" | jq -r '.choices[0].delta.content // empty')

    if [ -n "$reasoning" ]; then
      if [ "$thinking" = false ]; then
        printf '<reasoning>\n'
        thinking=true
      fi
      printf '%s' "$reasoning"
    fi

    if [ -n "$content" ]; then
      if [ "$thinking" = true ]; then
        printf '\n</reasoning>\n\n<answer>\n'
        thinking=false
      fi
      printf '%s' "$content"
    fi
  done

printf '\n'

Versions of DeepSeek V4 Pro

DeepSeek V4 Pro may have multiple snapshots for several reasons: output can vary after updates, so older snapshots are kept for consistency; developers get a transition period to adapt and migrate; and different snapshots may correspond to global or regional endpoints to optimize the user experience. For detailed differences between versions, refer to the official documentation.

Version
deepseek-v4-pro