Default thinking behavior
GLM-4.7 enables thinking by default. This differs from GLM-4.6, whose default is hybrid thinking (enabled automatically).
To disable thinking, use:
"thinking": {
    "type": "disabled"
}
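With the OpenAI-compatible Python SDK used later on this page, this parameter can be passed through extra_body. A minimal sketch (the API key and prompt are placeholders):
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.z.ai/api/paas/v4/")

# Thinking is on by default for glm-4.7; "type": "disabled" turns it off for this request.
response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Give a one-sentence summary of what an API gateway does."}],
    extra_body={"thinking": {"type": "disabled"}},
)
print(response.choices[0].message.content)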
Interleaved thinking
Interleaved thinking is supported by default (and has been since GLM-4.5). It lets GLM keep reasoning between tool calls and after receiving tool results, enabling more sophisticated step-by-step reasoning: the model can interpret each tool output before deciding on its next action, chain multiple tool calls and reasoning steps together, and make finer-grained decisions based on intermediate results.
Note: when combining interleaved thinking with tools, you must explicitly preserve the reasoning content and send it back together with the tool results.
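Concretely, the assistant turn that produced the tool call keeps its reasoning_content when it is appended back to the history, and the tool result follows it. A minimal sketch of that message structure (field values are illustrative; the full streaming example at the end of this page shows how they are collected):
# Assistant turn that produced reasoning and a tool call (values are illustrative)
assistant_turn = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "<the reasoning the model generated in this turn, kept verbatim>",
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
    }],
}
# Tool result sent back together with the preserved reasoning above
tool_turn = {"role": "tool", "tool_call_id": "call_1", "content": '{"weather": "Sunny", "temp": "25°C"}'}

messages = [
    {"role": "user", "content": "What's the weather like in Beijing?"},
    assistant_turn,
    tool_turn,
]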
The detailed flow of interleaved thinking is illustrated below:
Preserved thinking
GLM-4.7 introduces a new capability for coding scenarios: the model can retain reasoning content from previous assistant turns in the context. This helps maintain reasoning continuity and conversation integrity, improves model performance, and raises the cache hit rate, saving more tokens in real-world tasks.
This capability is enabled by default on the Coding Plan endpoint and disabled by default on the standard API endpoint. To enable preserved thinking in your product (recommended mainly for coding / agent scenarios), set "clear_thinking": False on the API endpoint and pass the complete, unmodified reasoning content back to the API. All consecutive reasoning content must match exactly the sequence the model generated during the original requests; do not reorder or modify it, or quality will degrade and cache hits will suffer.
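A minimal sketch of such a request via the Python SDK (the conversation content is illustrative; the key points are the reasoning_content carried in the earlier assistant turn and "clear_thinking": False in the thinking parameter):
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.z.ai/api/paas/v4/")

# Earlier assistant turns keep their reasoning_content, passed back unmodified and in order.
history = [
    {"role": "user", "content": "Refactor utils.py to remove the duplicated parsing code."},
    {"role": "assistant",
     "content": "Done. I extracted a shared parse_line helper.",
     "reasoning_content": "<reasoning generated by the model in that turn, verbatim>"},
    {"role": "user", "content": "Now add type hints to the new helper."},
]

response = client.chat.completions.create(
    model="glm-4.7",
    messages=history,
    extra_body={"thinking": {"type": "enabled", "clear_thinking": False}},  # preserved thinking
)
print(response.choices[0].message.content)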
The detailed flow of preserved thinking is illustrated below:
Turn-level thinking
Turn-level thinking controls reasoning compute on a per-turn basis: within the same conversation, each request can independently enable or disable thinking. This is a new capability in GLM-4.7, with the following advantages (see the sketch after this list):
- More flexible cost/latency control: disable thinking on lightweight turns such as quick factual questions or wording tweaks to get fast responses, and enable it on heavy turns such as complex planning, multi-constraint reasoning, or code debugging to improve accuracy and stability.
- A smoother multi-turn experience: the thinking switch can be toggled at any point within a conversation while the model stays coherent and consistent in style across turns, so it feels smarter when the task is hard and faster when it is simple.
- A better fit for agent / tool-calling scenarios: reduce reasoning overhead on turns that only need fast tool execution, then re-enable deep thinking on turns that synthesize tool results into decisions, dynamically balancing efficiency and quality.
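For instance, a session might disable thinking for a quick wording question and re-enable it for a debugging question, using the same thinking parameter as the example below. A minimal sketch (prompts are illustrative; in practice each assistant reply would be appended back into history):
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.z.ai/api/paas/v4/")
history = [{"role": "system", "content": "You are an assistant"}]

# Light turn: thinking disabled for a fast response.
quick = client.chat.completions.create(
    model="glm-4.7",
    messages=history + [{"role": "user", "content": "Suggest a clearer name for the variable tmp2."}],
    extra_body={"thinking": {"type": "disabled"}},
)

# Heavy turn in the same session: thinking re-enabled for this turn only.
deep = client.chat.completions.create(
    model="glm-4.7",
    messages=history + [{"role": "user", "content": "Why might this test fail only under high concurrency?"}],
    extra_body={"thinking": {"type": "enabled"}},
)
print(quick.choices[0].message.content)
print(deep.choices[0].message.content)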
Usage example
This mechanism works with both Interleaved Thinking and Preserved Thinking; no manual distinction is needed.
Remember to send back the historical reasoning_content to keep the reasoning coherent.
""""Interleaved Thinking + Tool Calling Example"""
import json
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://api.z.ai/api/paas/v4/",
)
tools = [{"type": "function", "function": {
"name": "get_weather",
"description": "Get weather information",
"parameters": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]},
}}]
messages = [
{"role": "system", "content": "You are an assistant"},
{"role": "user", "content": "What's the weather like in Beijing?"},
]
# Round 1: the model reasons and then calls a tool
response = client.chat.completions.create(model="glm-4.7", messages=messages, tools=tools, stream=True, extra_body={
"thinking":{
"type":"enabled",
"clear_thinking": False # False for Preserved Thinking
}})
reasoning, content, tool_calls = "", "", []
for chunk in response:
delta = chunk.choices[0].delta
if hasattr(delta, "reasoning_content") and delta.reasoning_content:
reasoning += delta.reasoning_content
if hasattr(delta, "content") and delta.content:
content += delta.content
if hasattr(delta, "tool_calls") and delta.tool_calls:
for tc in delta.tool_calls:
if tc.index >= len(tool_calls):
tool_calls.append({"id": tc.id, "function": {"name": "", "arguments": ""}})
if tc.function.name:
tool_calls[tc.index]["function"]["name"] = tc.function.name
if tc.function.arguments:
tool_calls[tc.index]["function"]["arguments"] += tc.function.arguments
print(f"Reasoning: {reasoning}\nTool calls: {tool_calls}")
# Key: return reasoning_content to keep the reasoning coherent
messages.append({"role": "assistant", "content": content, "reasoning_content": reasoning,
"tool_calls": [{"id": tc["id"], "type": "function", "function": tc["function"]} for tc in tool_calls]})
messages.append({"role": "tool", "tool_call_id": tool_calls[0]["id"],
"content": json.dumps({"weather": "Sunny", "temp": "25°C"})})
# Round 2: the model continues reasoning based on the tool result and responds
response = client.chat.completions.create(model="glm-4.7", messages=messages, tools=tools, stream=True, extra_body={
"thinking":{
"type":"enabled",
"clear_thinking": False # False for Preserved Thinking
}})
reasoning, content = "", ""
for chunk in response:
delta = chunk.choices[0].delta
if hasattr(delta, "reasoning_content") and delta.reasoning_content:
reasoning += delta.reasoning_content
if hasattr(delta, "content") and delta.content:
content += delta.content
print(f"Reasoning: {reasoning}\nReply: {content}")