2025 CometAPI. All rights reserved.

hunyuan-large-vision

Input: $0.44576/M
Output: $1.33728/M
Commercial use

Technical Specifications of hunyuan-large-vision

Item: Details
Model ID: hunyuan-large-vision
Provider / model family: Tencent Hunyuan (Tencent HY), a general-purpose multimodal model family developed by Tencent.
Model type: Large vision-language / multimodal model for image-to-text understanding, visual dialogue, analysis, and reasoning.
Primary modalities: Image input with text output; designed for multimodal conversations that combine visual and textual context.
Notable family positioning: Tencent publicly presents its Vision line as a flagship visual-language offering within the Hunyuan family, alongside other text, image, video, and 3D models.
Typical capabilities: Image understanding, image-based question answering, multimodal chat, content analysis, and visual reasoning.
Access pattern: API-based access through Tencent HY / Tencent Cloud-style model services; Tencent also highlights API integration as the standard way to use these capabilities.
Vendor-published low-level specs: Tencent's public product pages describe the Vision family and its multimodal capabilities, but publish few detailed technical specifications for the specific identifier hunyuan-large-vision; this CometAPI model ID should therefore be treated as the platform routing name for Tencent Hunyuan vision capability.

What is hunyuan-large-vision?

hunyuan-large-vision is CometAPI’s platform identifier for accessing Tencent Hunyuan’s large-scale vision-language capability. Tencent describes Hunyuan as a self-developed, general-purpose multimodal model family covering text, image, video, 3D, and related enterprise AI scenarios. Within that family, Tencent’s Vision offerings are positioned for multimodal interaction where users provide images and receive textual understanding, analysis, and reasoning results.

In practical terms, this model is suited for workflows where an application needs to “look” at an image and respond intelligently in natural language. That can include describing scenes, answering questions about uploaded images, extracting meaning from visual content, supporting image-grounded chat, and helping automate tasks that depend on visual understanding. Tencent’s public materials also emphasize end-to-end multimodal usability across document, OCR-like, and visual analysis scenarios, which reinforces the model family’s role in enterprise image understanding pipelines.

Main features of hunyuan-large-vision

  • Multimodal image understanding: Accepts image context and produces text responses for recognition, interpretation, and explanation tasks.
  • Visual question answering: Suitable for asking natural-language questions about an image and receiving grounded answers based on the visual input.
  • Image-grounded dialogue: Supports conversational use cases where users iteratively ask follow-up questions about the same visual input.
  • Reasoning over visual content: Tencent positions its vision models for analysis and reasoning, not just captioning, making them useful for higher-value automation scenarios.
  • Enterprise multimodal fit: Part of a broader Tencent HY ecosystem built for content production, knowledge workflows, and business automation.
  • API-friendly deployment: Tencent highlights API integration and ease of use as a core product trait, which aligns well with CometAPI-based integration.
  • Related document and OCR ecosystem: Public Hunyuan vision materials also describe OCR, document parsing, information extraction, subtitle extraction, translation, and document QA in adjacent vision products, which points to a broader visual-understanding stack around the model.
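
The image-grounded dialogue feature above can be sketched as a message history that grows with each turn. This is a minimal sketch, assuming the OpenAI-compatible message format used in the integration steps on this page; the helper names `start_conversation` and `add_turn` are hypothetical, and whether the image must be resent on later turns is not stated in the source.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def start_conversation(image_url: str, question: str) -> List[Message]:
    """Open a dialogue with one user turn that carries both text and an image."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


def add_turn(messages: List[Message], assistant_reply: str, follow_up: str) -> List[Message]:
    """Return a new history with the model's reply and the next user question.

    Assumption: the earlier image stays in the history, so the follow-up
    is sent as plain text and refers back to the same visual input.
    """
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": follow_up},
    ]


history = start_conversation("https://example.com/image.jpg", "What is in this image?")
history = add_turn(history, "A red bicycle leaning on a wall.", "What colour is the wall?")
```

The resulting `history` list would be sent as the `messages` field of the next request.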

How to access and integrate hunyuan-large-vision

Step 1: Sign Up for an API Key

Sign up on CometAPI and generate your API key from the dashboard. Once you have the key, store it securely and use it to authenticate all requests to the hunyuan-large-vision model.

Step 2: Send Requests to the hunyuan-large-vision API

Use the standard OpenAI-compatible API format supported by CometAPI, and set model to hunyuan-large-vision.

curl https://api.cometapi.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $COMETAPI_API_KEY" \
  -d '{
    "model": "hunyuan-large-vision",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image and identify the key objects." },
          { "type": "image_url", "image_url": { "url": "https://example.com/image.jpg" } }
        ]
      }
    ]
  }'
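
The same request can be sketched in Python using only the standard library. The endpoint, headers, and payload mirror the curl example above; the function names `build_request` and `send` are illustrative assumptions, not part of any CometAPI SDK.

```python
import json
import urllib.request

# Endpoint and model ID are taken from the curl example above.
API_URL = "https://api.cometapi.com/v1/chat/completions"
MODEL_ID = "hunyuan-large-vision"


def build_request(prompt: str, image_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request with one image."""
    payload = {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call would read the key from the environment, for example `send(build_request("Describe this image and identify the key objects.", "https://example.com/image.jpg", os.environ["COMETAPI_API_KEY"]))` after `import os`. Keeping request construction separate from sending makes the payload easy to inspect without a network call.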

Step 3: Retrieve and Verify Results

Parse the JSON response, read the model output from the first choice, and verify that the returned result matches your task requirements. For production use, you should also validate output structure, add retries, and log responses for observability when calling hunyuan-large-vision.
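
The parsing, validation, and retry advice above can be sketched as follows. This assumes the response follows the usual OpenAI-style shape (`choices[0].message.content`), which the source implies but does not show; `extract_text` and `call_with_retries` are hypothetical helper names, and the backoff settings are arbitrary.

```python
import time
from typing import Callable


def extract_text(response: dict) -> str:
    """Return the assistant text from the first choice, or raise ValueError."""
    choices = response.get("choices")
    if not isinstance(choices, list) or not choices:
        raise ValueError("response has no choices")
    first = choices[0]
    message = first.get("message") if isinstance(first, dict) else None
    content = message.get("content") if isinstance(message, dict) else None
    if not isinstance(content, str) or not content.strip():
        raise ValueError("first choice has no text content")
    return content


def call_with_retries(
    call: Callable[[], dict],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Invoke `call`, validate the result, and retry with exponential backoff.

    Network failures (OSError, which includes urllib's URLError) and
    malformed responses (ValueError) both trigger a retry.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return extract_text(call())
        except (OSError, ValueError) as error:
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"all {attempts} attempts failed") from last_error
```

Here `call` is any zero-argument function that performs the request and returns the decoded JSON. Logging each failed attempt inside the `except` branch is a natural place to add the observability mentioned above.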

Features of hunyuan-large-vision

Explore the key features of hunyuan-large-vision, designed to improve performance and usability. Discover how these capabilities can benefit your projects and improve the user experience.

Pricing for hunyuan-large-vision

Explore competitive pricing for hunyuan-large-vision, designed to fit a range of budgets and usage needs. Our flexible plans ensure you pay only for what you use, making it easy to scale as your requirements grow. Discover how hunyuan-large-vision can enhance your projects while keeping costs manageable.
Comet Price (USD / M Tokens): Input: $0.44576/M, Output: $1.33728/M
Official Price (USD / M Tokens): Input: $0.5572/M, Output: $1.6716/M
Discount: -20%
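
As a worked example of the per-million-token rates above, the following sketch estimates the cost of a request and checks the listed discount. The rates are copied from the pricing table; the token counts in the example are made up, and how image inputs are converted to tokens is not stated in the source, so treat the result as an estimate only.

```python
from decimal import Decimal

# Rates in USD per million tokens, copied from the pricing table above.
COMET_INPUT_PER_M = Decimal("0.44576")
COMET_OUTPUT_PER_M = Decimal("1.33728")
OFFICIAL_INPUT_PER_M = Decimal("0.5572")
OFFICIAL_OUTPUT_PER_M = Decimal("1.6716")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_per_m: Decimal = COMET_INPUT_PER_M,
    output_per_m: Decimal = COMET_OUTPUT_PER_M,
) -> Decimal:
    """Return the estimated cost in USD for the given token counts."""
    total = Decimal(input_tokens) * input_per_m + Decimal(output_tokens) * output_per_m
    return total / TOKENS_PER_M


def relative_discount(listed: Decimal, official: Decimal) -> Decimal:
    """Return the listed price's change relative to the official price."""
    return (listed - official) / official


# Hypothetical workload: 1,000,000 input tokens and 500,000 output tokens.
comet_cost = estimate_cost(1_000_000, 500_000)
official_cost = estimate_cost(
    1_000_000, 500_000, OFFICIAL_INPUT_PER_M, OFFICIAL_OUTPUT_PER_M
)
input_discount = relative_discount(COMET_INPUT_PER_M, OFFICIAL_INPUT_PER_M)
```

`Decimal` is used instead of floats so that the small per-token amounts are not affected by binary rounding.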

Sample code and API for hunyuan-large-vision

Access comprehensive sample code and API resources for hunyuan-large-vision to streamline your integration. Our detailed documentation provides step-by-step guidance to help you make full use of hunyuan-large-vision in your projects.

More Models

Nano Banana 2

Input: $0.4/M
Output: $2.4/M
Core capabilities overview: Resolution: up to 4K (4096×4096), on par with Pro. Reference image consistency: up to 14 reference images (10 objects + 4 characters), maintaining style/character consistency. Extreme aspect ratios: new 1:4, 4:1, 1:8, and 8:1 ratios added, suited to long images, posters, and banners. Text generation: advanced text generation, suited to infographics and marketing poster layouts. Search enhancement: integrated Google Search + Image Search. Grounding: built-in thinking process; complex instructions are reasoned through before generation.
Claude Opus 4.6

Input: $4/M
Output: $20/M
Claude Opus 4.6 is an "Opus"-class large language model from Anthropic, released in February 2026. It is positioned as a backbone for knowledge work and research workflows, improving long-context reasoning, multi-step planning, tool use (including agent-based software workflows), and computer-use tasks such as automated slide and spreadsheet generation.
Claude Sonnet 4.6

Input: $2.4/M
Output: $12/M
Claude Sonnet 4.6 is our most capable Sonnet model to date. It is a comprehensive upgrade to the model's skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 also features a 1M-token context window in beta.
GPT-5.4 nano

Input: $0.16/M
Output: $1/M
GPT-5.4 nano is designed for tasks where speed and cost matter most, such as classification, data extraction, ranking, and sub-agents.
GPT-5.4 mini

Input: $0.6/M
Output: $3.6/M
GPT-5.4 mini brings the strengths of GPT-5.4 to a faster, more efficient model designed for high-volume workloads.
Claude Mythos Preview

Coming soon
Input: $60/M
Output: $240/M
Claude Mythos Preview is our most capable frontier model to date, and shows a marked jump in scores on many evaluation benchmarks compared with our previous frontier model, Claude Opus 4.6.