Models
GPT Image 2 is OpenAI's state-of-the-art image generation model for fast, high-quality image generation and editing. It supports flexible image sizes and high-fidelity image inputs.
Per Second:$0.07
Seedance 2.0 is ByteDance's next-generation multimodal video foundation model focused on cinematic, multi-shot narrative video generation. Unlike single-shot text-to-video demos, Seedance 2.0 emphasizes reference-based control (images, short clips, audio), coherent character/style consistency across shots, and native audio/video synchronization, aiming to make AI video useful for professional creative and previsualization workflows.
Claude Opus 4.7 is a hybrid reasoning model designed specifically for frontier-level coding, AI agents, and complex multi-step professional work. Unlike lighter models (e.g., Sonnet or Haiku variants), Opus 4.7 prioritizes depth, consistency, and autonomy on the hardest tasks.
Claude Sonnet 4.6 is our most capable Sonnet model yet: a full upgrade of the model's skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 also features a 1M-token context window in beta.
An advanced model engineered for extremely complex logic and professional demands, representing the highest standard of deep reasoning and precise analytical capability.
A next-generation multimodal flagship model balancing exceptional performance with efficient response, dedicated to providing comprehensive and stable general-purpose AI services.
Per Request:$0.04
GPT-5.5 excels in code writing, online research, data analysis, and cross-tool operations. The model not only improves autonomy on complex multi-step tasks but also significantly improves reasoning capability and execution efficiency while maintaining the same latency as its predecessor, marking an important step toward automating office work with AI.
Input:$0.416/M
Output:$0.832/M
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding, and long-horizon agent workflows, with strong performance across knowledge, math, and software engineering benchmarks.
Input:$0.12/M
Output:$0.24/M
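Prices in this listing are quoted per million tokens. As a minimal sketch of how those rates translate into per-request cost (the helper function and model key are illustrative, not an official SDK; the rates are the DeepSeek V4 Pro figures quoted above):

```python
# Illustrative cost calculator for per-million-token pricing.
# Rates are the USD figures quoted in this listing for DeepSeek V4 Pro.
RATES = {
    "deepseek-v4-pro": {"input": 0.12, "output": 0.24},  # $/M tokens
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at per-million-token rates."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# Example: a 20,000-token prompt with a 2,000-token completion.
cost = request_cost("deepseek-v4-pro", 20_000, 2_000)
# 20,000 x $0.12/M + 2,000 x $0.24/M = $0.0024 + $0.00048 = $0.00288
```

The same arithmetic applies to every `Input:`/`Output:` pair in this listing; only the rate table changes per model.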
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window. It is designed for fast inference and high-throughput workloads while maintaining strong reasoning and coding performance.
Context:400,000
GPT-5.4 nano is designed for tasks where speed and cost matter most, such as classification, data extraction, ranking, and sub-agents.
Context:400,000
Input:$0.6/M
Output:$3.6/M
GPT-5.4 mini brings the strengths of GPT-5.4 to a faster, more efficient model designed for high-volume workloads.
Input:$0.4/M
Output:$2.4/M
Core Capabilities Overview:
Resolution: Up to 4K (4096×4096), on par with Pro.
Reference Image Consistency: Up to 14 reference images (10 objects + 4 characters), maintaining style/character consistency.
Extreme Aspect Ratios: New 1:4, 4:1, 1:8, 8:1 ratios added, suitable for long images, posters, and banners.
Text Rendering: Advanced text generation, suitable for infographics and marketing poster layouts.
Search Enhancement: Integrated Google Search + Image Search.
Grounding: Built-in thinking process; complex prompts are reasoned over before generation.
MiMo-V2.5-Pro is Xiaomi's flagship model, excelling in general-purpose agent capabilities and complex software engineering. MiMo-V2.5 is Xiaomi's native full-modal model. It achieves professional-grade agent performance at about half the inference cost, while outperforming MiMo-V2-Omni in multimodal perception on image and video understanding tasks.
Context:2,000,000
Input:$1.6/M
Output:$4.8/M
The Grok 4.20 release introduces a multi-agent architecture (multiple specialized agents coordinated in real time), expanded context modes, and focused improvements to instruction-following, hallucination reduction, and structured/tooled outputs.
Input:$0.32/M
Output:$1.92/M
Qwen 3.6-Plus is now available, featuring enhanced code development capabilities and improved efficiency in multimodal recognition and inference, making the Vibe Coding experience even better.
Input:$0.76/M
Output:$3.19998/M
Kimi K2.6 is Kimi's latest and most intelligent model. It offers stronger and more stable long-horizon code writing, significantly improved instruction compliance and self-correction, and supports text, image, and video input; thinking and non-thinking modes; and both dialogue and agent tasks.
Input:$0.8/M
Output:$3.2/M
GLM-5.1 (released April 2026) is purpose-built for long-horizon autonomous tasks. Unlike traditional models optimized for short interactions, GLM-5.1 excels at maintaining goal alignment, reducing strategy drift, and delivering production-grade results over extended periods, up to 8 hours of continuous autonomous work on a single complex task. It represents a major leap in agentic engineering, shifting evaluation from single-turn intelligence to real-world sustained execution.
Claude Mythos Preview is our most capable frontier model to date, and shows a striking leap in scores on many evaluation benchmarks compared to our previous frontier model, Claude Opus 4.6.
Input:$0.8/M
Output:$2.4/M
MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6. MiMo-V2-Pro is designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably.
Input:$0.32/M
Output:$1.6/M
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability (visual grounding, multi-step planning, tool use, and code execution), making it well suited for complex real-world tasks that span modalities. 256K context window.
Input:$0.24/M
Output:$0.96/M
MiniMax-M2.7 offers the same top-tier intelligence as the standard version, including recursive self-evolution and expert-level office productivity, but is designed for applications requiring sub-second latency and high-speed token generation. Leveraging an enhanced inference backbone architecture, its output speed is 66% faster than the standard model's (reaching 100 tps). It is the preferred choice for interactive programming assistants, real-time agent loops, and high-throughput enterprise pipelines with stringent completion-time requirements.
Context:200k
Input:$0.96/M
Output:$3.264/M
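The 100 tps figure above translates directly into wall-clock completion time. A small sketch (the helper is illustrative; only the 100 tps rate comes from the entry, and real latency also includes network and time-to-first-token overhead):

```python
def generation_seconds(output_tokens: int, tokens_per_second: float = 100.0) -> float:
    """Approximate time to stream a completion, ignoring network
    and time-to-first-token overhead."""
    return output_tokens / tokens_per_second

# A 3,000-token completion at 100 tps takes about 30 seconds.
print(generation_seconds(3_000))  # → 30.0
```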
GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios.
Context:1,050,000
A version of GPT-5.4 that produces smarter and more precise responses.
Context:1,050,000
GPT-5.4 is the frontier model for complex professional work. Its reasoning.effort parameter supports: none (default), low, medium, high, and xhigh.
Input:$1.4/M
Output:$11.2/M
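The five reasoning.effort levels listed above can be validated client-side before a request is sent. As a hedged sketch (the payload shape below is an assumption modeled on common chat-completion request bodies, not a documented GPT-5.4 schema; only the five effort values come from this listing):

```python
# Hypothetical request builder. The dict shape is an assumption;
# the five effort levels are the ones named in the listing above.
REASONING_EFFORTS = ("none", "low", "medium", "high", "xhigh")

def build_request(prompt: str, effort: str = "none") -> dict:
    """Build a request body, rejecting unknown reasoning.effort values."""
    if effort not in REASONING_EFFORTS:
        raise ValueError(f"reasoning.effort must be one of {REASONING_EFFORTS}")
    return {
        "model": "gpt-5.4",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"effort": effort},
    }

req = build_request("Summarize this contract.", effort="high")
```

Validating locally turns a malformed-request round trip into an immediate `ValueError`, which matters in high-volume pipelines billed per token.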
GPT-5.3 Instant is the model used in ChatGPT.
Input:$0.2/M
Output:$1.2/M
Gemini 3.1 Flash-Lite is a highly cost-efficient, low-latency Tier-3 model in Google's Gemini 3 series, designed for high-volume production AI workflows where throughput and speed matter more than maximal reasoning depth. It combines a large multimodal context window with efficient inference at a lower cost than most flagship counterparts.
Claude Opus 4.6 is Anthropic's "Opus"-class large language model, released February 2026. It is positioned as a workhorse for knowledge-work and research workflows, improving long-context reasoning, multi-step planning, tool use (including agentic software workflows), and computer-use tasks such as automated slide and spreadsheet generation.
Input:$1.5616/M
Output:$9.3696/M
Nano Banana Pro is an AI model for general-purpose assistance in text-centric workflows. It is suitable for instruction-style prompting to generate, transform, and analyze content with controllable structure. Typical uses include chat assistants, document summarization, knowledge QA, and workflow automation. Public technical details are limited; integration aligns with common AI assistant patterns such as structured outputs, retrieval-augmented prompts, and tool or function calling.