模型支持企业博客
500+ AI 模型 API,一次搞定,就在 CometAPI
模型 API
开发者
快速入门文档API 仪表板
资源
AI 模型博客企业更新日志关于
2025 CometAPI。保留所有权利。隐私政策服务条款
Home/Models/Kling/Kling Image Recognize
K

Kling Image Recognize

每次请求:$0.013216
Keling image element recognition API, usable for multi-image reference video generation, Multimodal video editing features ● Can recognize subjects, faces, clothing, etc., and can obtain 4 sets of results (if available) per request.
新
商用
概览
功能亮点
定价
API

Technical Specifications of kling-image-recognize

SpecificationDetails
Model IDkling-image-recognize
CategoryImage recognition / multimodal analysis
Primary CapabilityRecognizes image elements for downstream creative workflows, including multi-image reference video generation and multimodal video editing
Input TypeImage input
Output TypeStructured recognition results
Recognition ScopeSubjects, faces, clothing, and other visual elements
Result VolumeCan return up to 4 sets of results per request, if available
Use CasesVisual asset analysis, reference preparation for video generation, content understanding for editing pipelines, subject and apparel recognition

What is kling-image-recognize?

kling-image-recognize is a Keling image element recognition API designed to analyze visual content and identify important elements within an image. It is especially useful in workflows that require multi-image reference video generation or multimodal video editing, where understanding the contents of source images is an important preprocessing step.

The model can recognize a range of visual attributes such as subjects, faces, clothing, and related image components. Depending on the input, it can provide up to 4 sets of recognition results in a single request, helping developers capture multiple possible detections or interpretations when available.

Main features of kling-image-recognize

  • Image element recognition: Detects and identifies important visual elements contained in an input image.
  • Subject analysis: Recognizes primary subjects that can be used in downstream media generation or editing workflows.
  • Face recognition support: Extracts face-related recognition results when faces are present in the image.
  • Clothing identification: Detects apparel and clothing-related elements to support more detailed visual understanding.
  • Multi-image reference workflow support: Useful for preparing and analyzing image references used in video generation pipelines.
  • Multimodal video editing compatibility: Helps power editing scenarios where image content needs to be understood before transformation or composition.
  • Multiple result sets per request: Can obtain up to 4 sets of results per request, if available, enabling richer recognition output.
  • Integration-friendly API usage: Suitable for developers building automated media analysis and creative application pipelines.

How to access and integrate kling-image-recognize

Step 1: Sign Up for API Key

To get started, sign up on the CometAPI platform and generate your API key from the dashboard. After obtaining your key, store it securely and use it to authenticate every request to the kling-image-recognize API.

Step 2: Send Requests to kling-image-recognize API

Once you have your API key, send requests to the CometAPI endpoint using kling-image-recognize as the model ID. Include your authentication headers and provide the required image input payload based on your application workflow.

curl --request POST \
  --url https://api.cometapi.com/v1/responses \
  --header "Authorization: Bearer YOUR_COMETAPI_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "kling-image-recognize",
    "input": [
      {
        "role": "user",
        "content": [
          {
            "type": "input_text",
            "text": "Recognize the main visual elements in this image."
          },
          {
            "type": "input_image",
            "image_url": "YOUR_IMAGE_URL"
          }
        ]
      }
    ]
  }'

Step 3: Retrieve and Verify Results

After submission, the API returns recognition results generated by kling-image-recognize. Parse the response in your application, verify the detected subjects or attributes, and store the returned data for use in video generation, editing, or other downstream automation tasks.

Kling Image Recognize 的功能

了解 Kling Image Recognize 的核心能力,帮助提升性能与可用性,并改善整体体验。

Kling Image Recognize 的定价

查看 Kling Image Recognize 的竞争性定价,满足不同预算与使用需求,灵活方案确保随需求扩展。
Comet 价格 (USD / M Tokens)官方定价 (USD / M Tokens)折扣
每次请求:$0.013216
每次请求:$0.01652
-20%

Kling Image Recognize 的示例代码与 API

获取完整示例代码与 API 资源,简化 Kling Image Recognize 的集成流程,我们提供逐步指导,助你发挥模型潜能。

更多模型

O

Sora 2 Pro

每秒:$0.24
Sora 2 Pro 是我们最先进、最强大的媒体生成模型,可生成带有同步音频的视频。它可以根据自然语言或图像创建细致、动态的视频片段。
O

Sora 2

每秒:$0.08
超级强大的视频生成模型,带有音效,支持聊天格式。
M

mj_fast_video

每次请求:$0.6
Midjourney video generation
X

Grok Imagine Video

每秒:$0.04
通过文本提示生成视频、为静态图像添加动画,或用自然语言编辑现有视频。该 API 支持配置生成视频的时长、长宽比和分辨率,并由 SDK 自动处理异步轮询。
G

Veo 3.1 Pro

每秒:$0.25
Veo 3.1-Pro 指的是 Google 的 Veo 3.1 系列的高能力访问/配置——这一代短时长、支持音频的视频模型带来更丰富的原生音频、改进的叙事/剪辑控制以及场景扩展工具。
G

Veo 3.1

每秒:$0.05
Veo 3.1 是 Google 针对其 Veo 文本与图像→视频系列的渐进但意义重大的更新,新增更丰富的原生音频、更长且可控性更高的视频输出,以及更精细的编辑与场景级控制。