Aliyun Models - CometAPI

GPT-5.6 Series is now live on CometAPI →

Happy Horse 1.1

Happy Horse 1.1

Per Second:$0.112

HappyHorse 1.1 is a multimodal video-generation model designed for professional content creation, advertising, short films, social media production, and storytelling. It extends the capabilities of HappyHorse 1.0—which gained significant attention after ranking highly in independent video-generation evaluations—with stronger scene coherence and improved visual fidelity.

Happy Horse 1.0

Happy Horse 1.0

Per Second:$0.112

Happy Horse 1.0 — A high-quality audio-video generation model that supports text-to-video and image-to-video creation. It can generate synchronized visuals, audio, and lip movements, making it suitable for short films, advertising creatives, and product showcases.

Qwen3.7 Plus

Qwen3.7 Plus

Qwen3.7 Plus is a high-performance large language model developed by Alibaba Cloud. It supports long-context understanding up to 128K tokens, function calling, and multilingual tasks. Designed for complex reasoning, coding, and instruction-following scenarios.

Qwen3.7-Max

Qwen3.7-Max

Qwen3.7-Max's core strength lies in the breadth and depth of its agentic capabilities. In coding, it handles everything from front-end prototyping to complex multi-file engineering projects. For office and productivity work, it enables workflow automation through MCP integration and multi-agent collaboration. In long-horizon autonomous execution, it maintained coherent reasoning throughout a 35-hour, fully autonomous kernel optimization experiment involving over 1,000 tool calls — convincingly demonstrating its sustained, stable execution. Furthermore, it delivers consistently strong cross-framework generalization, performing reliably whether deployed in Claude Code, OpenClaw, Qwen Code, or other frameworks.

Wan2.7

Wan2.7

Per Second:$0.08

Wan2.7 is a video generation model designed for high-quality visual synthesis and improved motion consistency. It is suitable for cinematic content creation and professional video production workflows.

Wan2.6

Wan2.6

Per Second:$0.08

Wan2.6 is a video generation model designed for stable and efficient video synthesis. It provides reliable visual quality and smooth motion generation for general video creation tasks.

Qwen3.6-Plus

Qwen3.6-Plus

Qwen 3.6-Plus is now available, featuring enhanced code development capabilities and improved efficiency in multimodal recognition and inference, making the Vibe Coding experience even better.

Qwen 3.5 Flash

Qwen 3.5 Flash

The Qwen-3.5 Flash Series is a production-oriented family of large language models (LLMs) developed by the Alibaba Group under its Qwen initiative. It represents the deployment (hosted/API) layer of the broader Qwen-3.5 model family, optimized for high speed, long-context processing, and agent-based applications. In simple terms: Qwen-3.5 Flash = fast, scalable, long-context, tool-using versions of Qwen-3.5 models designed for real-world production use.

qwen3.5-plus

qwen3.5-plus

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency.

qwen3.5-397b-a17b

qwen3.5-397b-a17b

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

qwen3 max

qwen3 max

- qwen3-max: Alibaba Tongyi Qianwen team's latest Qwen3-Max model, positioned as the series' performance peak. - 🧠 Powerful Multimodal and Inference: Supports ultra-long context (up to 128k tokens) and Multimodal input, excels at complex Inference, code generation, translation, and creative content. - ⚡️ Breakthrough Improvement: Significantly optimized across multiple technical indicators, faster response speed, knowledge cutoff up to 2025, suitable for enterprise-level high-precision AI applications.

Qwen Image

Qwen Image

Per Request:$0.028

Qwen-Image is a revolutionary image generation foundational model released by Alibaba's Tongyi Qianwen team in 2025. With a parameter scale of 20 billion, it is based on the MMDiT (Multimodal Diffusion Transformer) architecture. The model has achieved significant breakthroughs in complex text rendering and precise image editing, demonstrating exceptional performance particularly in Chinese text rendering. Translated with DeepL.com (free version)

qwen-image-2

qwen-image-2

qwen-image-2 coming soon

qwen3-vl-30b-a3b

qwen3-vl-30b-a3b

Qwen3-VL-30B-A3B is a state-of-the-art multimodal AI model in the Qwen3 AI family, developed by Alibaba’s Qwen team. It’s designed to unify language understanding and visual comprehension — including text, images, and video — in a single foundation model.

qwen3-vl-32b

qwen3-vl-32b

Qwen3-VL-32B is the 32-billion-parameter dense variant in Alibaba’s Qwen3 vision-language model family. It is a multimodal (vision + language + video) transformer designed for unified perception, long-context reasoning, robust OCR and visual grounding, and agentic/toolified workflows.

qwen3-vl-235b-a22b

qwen3-vl-235b-a22b

qwen3-vl-235b-a22b is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks. It excels in perception of real-world/synthetic categories, 2D/3D spatial grounding, and long-form visual comprehension, achieving competitive multimodal benchmark results.

qwen3-30b-a3b

qwen3-30b-a3b

Has 3 billion parameters, balancing performance and resource requirements, suitable for enterprise-level applications. - This model may employ MoE or other optimized architectures, suitable for scenarios requiring efficient processing of complex tasks, such as intelligent customer service and content generation.

qwen3-coder-plus

qwen3-coder-plus

qwen3-coder-480b-a35b-instruct

qwen3-coder-480b-a35b-instruct

qwen3-coder

qwen3-coder

CometAPI’s qwen3-coder is an affordable, OpenAI-compatible coding model API for Qwen3 Coder, optimized for code generation, debugging, and repository-level engineering workflows with ~20% lower pricing.

qwen3-235b-a22b

qwen3-235b-a22b

Output:$1.344/M

Qwen3-235B-A22B is the flagship model of the Qwen3 series, with 23.5 billion parameters, using a Mixture of Experts (MoE) architecture. - Particularly suitable for complex tasks requiring high-performance Inference, such as coding, mathematics, and Multimodal applications.

Qwen3.6-Max-Preview

Qwen3.6-Max-Preview

Output:$9.984/M

Qwen3.6-Max-Preview Compared with Qwen3.6-Plus, this preview version brings stronger world knowledge and instruction compliance capabilities, as well as significantly improved agent programming performance on multiple benchmarks