
hunyuan-vision

Input: $2.00592/M
Output: $2.00592/M
Commercial use

Technical Specifications of hunyuan-vision

Specification | Details
Model ID | hunyuan-vision
Provider | Tencent Hunyuan
Model type | Vision-language / multimodal chat model for image understanding and visual question answering
Primary capability | Accepts image-plus-text input and returns natural-language responses about image content
API style | OpenAI-compatible Chat Completions API
Base URL | https://api.hunyuan.cloud.tencent.com/v1
Endpoint | POST /chat/completions
Input format | messages array with mixed text and image_url content parts; image URL or base64 data URL supported in examples
Authentication | Bearer API key (HUNYUAN_API_KEY)
SDK compatibility | Can be called with OpenAI SDKs by changing base_url and api_key
Billing note | For image input, Tencent documents that hunyuan-vision image tokens vary by image size, roughly 256–1280 tokens per image, with actual usage based on model-side calculation
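
The billing note gives a range rather than a fixed figure, so a rough budget can only be bracketed. The sketch below multiplies the documented 256–1280 tokens-per-image range by the input price listed at the top of this page. It assumes that image tokens are billed at that listed input rate, which this page does not state explicitly; the function name estimate_input_cost is hypothetical.

```python
from typing import Tuple

# Listed input price in USD per million tokens (from the header of this page).
INPUT_PRICE_PER_M = 2.00592
# Per-image token range documented by Tencent for hunyuan-vision.
MIN_IMAGE_TOKENS = 256
MAX_IMAGE_TOKENS = 1280


def estimate_input_cost(num_images: int, text_tokens: int = 0) -> Tuple[float, float]:
    """Return a (low, high) USD bracket for the input side of a workload.

    Assumption: image tokens are charged at the listed input rate.
    Actual usage is calculated on the model side and may differ.
    """
    low_tokens = num_images * MIN_IMAGE_TOKENS + text_tokens
    high_tokens = num_images * MAX_IMAGE_TOKENS + text_tokens
    return (
        low_tokens * INPUT_PRICE_PER_M / 1_000_000,
        high_tokens * INPUT_PRICE_PER_M / 1_000_000,
    )


low, high = estimate_input_cost(num_images=1000)
print(f"1000 images: between ${low:.2f} and ${high:.2f} of input tokens")
```

Treat the result as a planning bracket only; the usage field returned with each response is the authoritative count.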

What is hunyuan-vision?

hunyuan-vision is Tencent Hunyuan’s multimodal image-understanding model exposed through an OpenAI-compatible API. In Tencent’s official examples, it is used for “image-to-text” style tasks where a user sends a prompt together with an image and the model answers questions such as what is shown in the image.

Practically, this makes hunyuan-vision suitable for applications that need visual reasoning in a chat workflow, such as image captioning, scene description, UI or screenshot interpretation, product-image analysis, and general visual question answering. Its integration pattern is especially convenient for teams already using OpenAI-style clients, because Tencent states that developers can switch by replacing only the endpoint and API key configuration.

Main features of hunyuan-vision

  • Multimodal image understanding: hunyuan-vision accepts both text and image content in the same request, enabling image-aware conversations and question answering about uploaded visuals.
  • OpenAI-compatible interface: Tencent provides hunyuan-vision through the same general request structure as Chat Completions, which reduces migration effort for existing AI applications.
  • Flexible image input methods: Official examples show support for standard remote image URLs as well as base64-encoded data URLs, which helps when working with either public assets or locally processed files.
  • SDK-friendly integration: Tencent documents use with the OpenAI SDKs for Python, Node.js, and Go, as well as plain HTTP requests via cURL, making it easy to embed into existing backend services.
  • Chat-based workflow: Because it is exposed as a chat completion model, hunyuan-vision fits naturally into conversational apps, assistants, and toolchains that already structure requests around messages.
  • Usage-based image token accounting: Tencent notes that image cost depends on image size, with per-image token consumption documented as a range rather than a flat amount.
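
For the base64 input method mentioned above, a local image has to be wrapped in a data URL before it is placed in the image_url field. The following is a minimal sketch using only the Python standard library; the helper name to_data_url is hypothetical, and the set of image formats the service accepts is not specified on this page.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode raw image bytes as a data URL (data:<mime>;base64,<payload>).

    The MIME type is guessed from the file extension, which is an
    assumption of this sketch; it does not inspect the bytes themselves.
    """
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Usage with a file on disk (path is illustrative):
# with open("photo.png", "rb") as f:
#     url = to_data_url(f.read(), "photo.png")
```

The returned string can be used wherever the examples on this page show a remote image URL.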

How to access and integrate hunyuan-vision

Step 1: Sign Up for API Key

To access hunyuan-vision, first create and secure your API key through the provider’s console. Tencent documents API-key-based access for its OpenAI-compatible Hunyuan API, and the key is passed as a Bearer token in requests. Keep the key in an environment variable such as HUNYUAN_API_KEY, and avoid exposing it in client-side code or public repositories.
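
One way to follow this advice is to read the key from the environment at startup and fail immediately if it is missing, instead of sending an unauthenticated request later. This is a small sketch; the helper name load_api_key is hypothetical, and HUNYUAN_API_KEY is simply the variable name used in the examples on this page.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ,
                 name: str = "HUNYUAN_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it in the shell or your secret manager"
        )
    return key


# Typical usage at application startup:
# api_key = load_api_key()
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.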

Step 2: Send Requests to hunyuan-vision API

Use the OpenAI-compatible endpoint and specify hunyuan-vision as the model name.

curl --location 'https://api.hunyuan.cloud.tencent.com/v1/chat/completions' \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HUNYUAN_API_KEY" \
  --data '{
    "model": "hunyuan-vision",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/image.jpg"
            }
          }
        ]
      }
    ]
  }'

You can also use OpenAI-compatible SDKs by pointing the client to the Hunyuan base URL:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("HUNYUAN_API_KEY"),
    base_url="https://api.hunyuan.cloud.tencent.com/v1",
)

response = client.chat.completions.create(
    model="hunyuan-vision",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.jpg"
                    },
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)

This request structure follows Tencent’s official OpenAI-compatible examples for hunyuan-vision.
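
Because the cURL and SDK examples send the same body, it can help to build that body in one place. The sketch below assembles the messages structure shown above for a single prompt and a single image reference. The function name build_vision_payload is hypothetical, and the prefix check is an assumption of this sketch rather than a documented server-side rule.

```python
def build_vision_payload(prompt: str, image_ref: str,
                         model: str = "hunyuan-vision") -> dict:
    """Build a Chat Completions request body with one text part and one image part."""
    # Accept only the two input styles shown in the examples:
    # a remote URL or a base64 data URL.
    if not image_ref.startswith(("http://", "https://", "data:image/")):
        raise ValueError("image_ref must be an http(s) URL or a data:image/... URL")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_ref}},
                ],
            }
        ],
    }


payload = build_vision_payload("Describe this image.", "https://example.com/image.jpg")
```

With the OpenAI Python SDK, the dictionary can be unpacked as client.chat.completions.create(**payload); for raw HTTP, serialize it with json.dumps(payload).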

Step 3: Retrieve and Verify Results

Read the generated answer from the first completion choice, typically response.choices[0].message.content when using an OpenAI-compatible SDK. For production use, verify that the image URL is reachable or that your base64 payload is valid, then check the returned description against your application requirements for accuracy, safety, and formatting consistency. Tencent’s examples show standard chat-completions response handling, so existing validation and logging pipelines can usually be reused with minimal changes.
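
A basic structural check before using the answer might look like the sketch below. It operates on the response as a plain dictionary, for example the parsed JSON body of the cURL call; the function name extract_answer and the specific checks are illustrative assumptions, not part of Tencent's documentation.

```python
def extract_answer(response: dict) -> str:
    """Return the text of the first choice, or raise if the shape is unexpected."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    message = choices[0].get("message") or {}
    content = message.get("content")
    if not isinstance(content, str) or not content.strip():
        raise ValueError("first choice has no text content")
    return content.strip()


# Example with a hand-written response in the standard chat-completions shape:
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": " A cat on a sofa. "}}
    ]
}
print(extract_answer(sample))
```

When using the OpenAI Python SDK, recent versions return a model object rather than a dictionary; converting it with response.model_dump() should give a compatible structure, though that depends on the SDK version in use. Checks for accuracy and safety of the description itself remain application-specific and are not covered here.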