GLM-4.7

Input:$0.96/M
Output:$3.84/M
Context:200K
Max Output:128K
GLM-4.7 is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

What GLM-4.7 is

GLM-4.7 is Z.ai / Zhipu AI’s latest flagship open-foundation large language model (model name glm-4.7). It is positioned as a developer-oriented “thinking” model with particular improvements in coding/agentic task execution, multi-step reasoning, tool invocation, and long-context workflows. The release emphasizes large context handling (up to 200K context), high maximum output (up to 128K tokens), and specialized “thinking” modes for agentic pipelines.

Main features

  • Agentic / tool-use improvements: Built-in thinking modes (“Interleaved Thinking”, “Preserved Thinking”, turn-level control) to let the model “think before acting”, retain reasoning across turns, and be more stable when calling tools or executing multi-step tasks. This is aimed at robust agent workflows (terminals, tool chains, web browsing).
  • Coding & terminal competence: Significant improvements on coding benchmarks and terminal automation tasks — vendor benchmarks show clear gains vs GLM-4.6 in SWE-bench and Terminal Bench metrics. This translates to better multi-turn code generation, command sequencing and recovery in agent environments.
  • “Vibe coding” / frontend output quality: Improved default UI/layout quality for generated HTML, slides and presentations (cleaner layouts, sizing, better visual defaults).
  • Long-context workflows: 200K token context window and tools for context caching; practical for multi-file codebases, long documents, and multi-round agent sessions.
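The thinking modes above are controlled per request. Below is a minimal sketch of the payload, assuming Z.AI's documented `thinking` field (`{"type": "enabled"|"disabled"}`) is passed through unchanged by CometAPI's OpenAI-compatible endpoint; verify the exact field name against the API docs.

```python
# Sketch: turn-level control of GLM-4.7's thinking mode.
# The `thinking` field follows Z.AI's documented request shape; whether
# CometAPI forwards it unchanged is an assumption, not a guarantee.
def build_thinking_request(prompt: str, enabled: bool = True) -> dict:
    """Build a chat/completions payload that toggles thinking per turn."""
    return {
        "model": "glm-4.7",
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled" if enabled else "disabled"},
    }

# Turn 1: let the model think before acting; turn 2: fast, direct reply.
print(build_thinking_request("Plan the refactor.")["thinking"])      # {'type': 'enabled'}
print(build_thinking_request("Thanks!", enabled=False)["thinking"])  # {'type': 'disabled'}
```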

Benchmark performance

Z.AI's published benchmark tables and community results report substantial gains over GLM-4.6 and competitive numbers against other contemporary models on coding, agentic, and tool-use tasks. Selected figures (source: official Hugging Face / Z.AI published tables):

  • LiveCodeBench-v6 (coding benchmark): 84.9 (cited as open-source SOTA).
  • SWE-bench Verified (coding): 73.8% (up from 68.0% for GLM-4.6).
  • SWE-bench Multilingual: 66.7% (+12.9 points vs GLM-4.6).
  • Terminal Bench 2.0 (agentic terminal actions): 41.0% (a notable +16.5 points over GLM-4.6).
  • HLE (complex reasoning with tools): 42.8% with tool use (a large reported improvement over prior versions).
  • τ²-Bench (interactive tool invocation): 87.4 (reported open-source SOTA).

Typical use cases & example scenarios

  • Agentic coding assistants: Autonomous or semi-autonomous code generation, multi-turn code fixes, terminal automation and CI/CD scripting.
  • Tool-driven agents: Web browsing, API orchestration, multi-step workflows (supported by preserved thinking & function calling).
  • Front-end and UI generation: Automatic website scaffolding, slide decks, posters with improved aesthetics and layout.
  • Research & long-context tasks: Document summarization, literature synthesis, and retrieval-augmented generation across long documents (200k token window is helpful here).
  • Interactive educational agents / coding tutors: Multi-turn tutoring with preserved reasoning that remembers prior reasoning blocks across a session.
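For tool-driven agents, the request carries an OpenAI-style `tools` array alongside the messages. A minimal sketch follows; the `get_weather` tool schema is a hypothetical example for illustration, not something GLM-4.7 ships with.

```python
# Sketch of an OpenAI-style function-calling (tools) request for glm-4.7.
# The tools/tool_choice fields are the standard OpenAI chat/completions
# shape that an OpenAI-compatible endpoint is expected to accept.
def build_tool_request(question: str) -> dict:
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]
    return {
        "model": "glm-4.7",
        "messages": [{"role": "user", "content": question}],
        "tools": tools,
        "tool_choice": "auto",  # let the model decide whether to call it
    }
```

When the model decides to call the tool, the response's first choice carries `tool_calls` instead of plain content; your agent loop executes the call and feeds the result back as a `tool` role message.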

How to access and use the GLM-4.7 API

Step 1: Sign Up for API Key

Log in to cometapi.com; if you are not a user yet, register first. In your CometAPI console, open the API token page in the personal center, click "Add Token", and copy the generated key (it looks like sk-xxxxx).

Step 2: Send Requests to the GLM-4.7 API

Select the "glm-4.7" endpoint, set the request body, and send the request. The request method and body format are documented in our API docs, and the site also provides an Apifox test for convenience. Replace <YOUR_API_KEY> with your actual CometAPI key from your account. GLM-4.7 is called through the chat-style APIs.

Insert your question or request into the content field; this is what the model will respond to.
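The request described in Step 2 boils down to three pieces any HTTP client needs: the URL, the headers, and the JSON body. A minimal sketch (the key stays a placeholder you fill in yourself):

```python
import json

# Sketch: assemble the pieces of the chat/completions request from Step 2.
def build_request(api_key: str, question: str):
    """Return (url, headers, body) for a glm-4.7 chat/completions call."""
    url = "https://api.cometapi.com/v1/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # your sk-xxxxx token
    }
    body = json.dumps({
        "model": "glm-4.7",
        "messages": [{"role": "user", "content": question}],
    })
    return url, headers, body
```

Pass these three values to any HTTP client (requests, httpx, curl) to make the call.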

Step 3: Retrieve and Verify Results

Process the API response to extract the generated answer, and check the task status before using the result.
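The response handling in Step 3 can be sketched against the OpenAI-style response shape: a `choices` list whose first entry carries `message.content` plus a `finish_reason` you can use to verify the generation completed.

```python
# Sketch: extract and sanity-check the answer from an OpenAI-style
# chat/completions response body (a plain dict, e.g. from resp.json()).
def extract_answer(response: dict) -> str:
    choice = response["choices"][0]
    # "stop" means the model finished naturally; "length" means the
    # max_tokens limit truncated the output mid-generation.
    reason = choice.get("finish_reason")
    if reason not in ("stop", None):
        raise ValueError(f"incomplete generation: {reason}")
    return choice["message"]["content"]

sample = {
    "choices": [{
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "Here is your answer."},
    }]
}
print(extract_answer(sample))  # Here is your answer.
```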

Features for GLM-4.7

Explore the key features of GLM-4.7, designed to enhance performance and usability. Discover how these capabilities can benefit your projects and improve user experience.

Pricing for GLM-4.7

Explore competitive pricing for GLM-4.7, designed to fit various budgets and usage needs. Our flexible plans ensure you only pay for what you use, making it easy to scale as your requirements grow. Discover how GLM-4.7 can enhance your projects while keeping costs manageable.
Comet Price (USD / M Tokens): Input $0.96/M, Output $3.84/M
Official Price (USD / M Tokens): Input $1.2/M, Output $4.8/M
Discount: -20%

Sample code and API for GLM-4.7

Access comprehensive sample code and API resources for GLM-4.7 to streamline your integration process. Our detailed documentation provides step-by-step guidance, helping you leverage the full potential of GLM-4.7 in your projects.
Python Code Example

from openai import OpenAI
import os

# Get your CometAPI key from https://api.cometapi.com/console/token
COMETAPI_KEY = os.environ.get("COMETAPI_KEY") or "<YOUR_COMETAPI_KEY>"
BASE_URL = "https://api.cometapi.com/v1"

client = OpenAI(base_url=BASE_URL, api_key=COMETAPI_KEY)

# glm-4.7: Zhipu GLM-4.7 model via chat/completions
completion = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {"role": "user", "content": "Hello! Tell me a short joke."}
    ]
)

print(completion.choices[0].message.content)

JavaScript Code Example

import OpenAI from "openai";

// Get your CometAPI key from https://api.cometapi.com/console/token
const COMETAPI_KEY = process.env.COMETAPI_KEY || "<YOUR_COMETAPI_KEY>";
const BASE_URL = "https://api.cometapi.com/v1";

const client = new OpenAI({
  apiKey: COMETAPI_KEY,
  baseURL: BASE_URL,
});

// glm-4.7: Zhipu GLM-4.7 model via chat/completions
async function main() {
  const completion = await client.chat.completions.create({
    model: "glm-4.7",
    messages: [{ role: "user", content: "Hello! Tell me a short joke." }],
  });

  console.log(completion.choices[0].message.content);
}

main().catch(console.error);

Curl Code Example

#!/bin/bash

# Get your CometAPI key from https://api.cometapi.com/console/token
COMETAPI_KEY="${COMETAPI_KEY:-<YOUR_COMETAPI_KEY>}"

# glm-4.7: Zhipu GLM-4.7 model via chat/completions
curl -s https://api.cometapi.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $COMETAPI_KEY" \
  -d '{
    "model": "glm-4.7",
    "messages": [
      {
        "role": "user",
        "content": "Hello! Tell me a short joke."
      }
    ]
  }'

More Models

Claude Opus 4.6

Input:$4/M
Output:$20/M
Claude Opus 4.6 is Anthropic’s “Opus”-class large language model, released February 2026. It is positioned as a workhorse for knowledge-work and research workflows — improving long-context reasoning, multi-step planning, tool use (including agentic software workflows), and computer-use tasks such as automated slide and spreadsheet generation.
Claude Sonnet 4.6

Input:$2.4/M
Output:$12/M
Claude Sonnet 4.6 is our most capable Sonnet model yet. It’s a full upgrade of the model’s skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 also features a 1M token context window in beta.
GPT-5.4 nano

Input:$0.16/M
Output:$1/M
GPT-5.4 nano is designed for tasks where speed and cost matter most like classification, data extraction, ranking, and sub-agents.
GPT-5.4 mini

Input:$0.6/M
Output:$3.6/M
GPT-5.4 mini brings the strengths of GPT-5.4 to a faster, more efficient model designed for high-volume workloads.
Claude Mythos Preview

Coming soon
Input:$60/M
Output:$240/M
Claude Mythos Preview is our most capable frontier model to date, and shows a striking leap in scores on many evaluation benchmarks compared to our previous frontier model, Claude Opus 4.6.
mimo-v2-pro

Input:$0.8/M
Output:$2.4/M
MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6. MiMo-V2-Pro is designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably.

Related Blog

GLM-5 vs GLM-4.7: what changed, what matters, and should you upgrade?
Feb 26, 2026
glm-5
glm-4-7

GLM-5, released February 11, 2026 by Zhipu AI (Z.ai), represents a large architectural leap from GLM-4.7: bigger MoE scale (≈744B vs ~355B total params), higher active parameter capacity, lower measured hallucination, and clear gains on agentic and coding benchmarks — at a cost in inference complexity and (sometimes) latency.
How to Use GLM-4.7-Flash Locally?
Jan 21, 2026
glm-4-7

GLM-4.7-Flash is a lightweight, high-performance 30B A3B MoE member of the GLM-4.7 family designed to enable local and low-cost deployment for coding, agentic workflows and general reasoning. You can run it locally three practical ways: (1) via Ollama (easy, managed local runtime), (2) via Hugging Face / Transformers / vLLM / SGLang (GPU-first server deployment), or (3) via GGUF + llama.cpp / llama-cpp-python (CPU/edge friendly).
GLM-4.7 Released: What Does This Mean for AI Intelligence?
Dec 23, 2025
glm-4-7

On December 22, 2025, Zhipu AI (Z.ai) officially released GLM-4.7, the newest iteration in its General Language Model (GLM) family — drawing global attention in the world of open-source AI models. This model not only advances capabilities in coding and reasoning tasks, but also challenges the dominance of proprietary models like GPT-5.2 and Claude Sonnet 4.5 in key benchmarks.