> ## Documentation Index
> Fetch the complete documentation index at: https://docs.bigmodel.cn/llms.txt
> Use this file to discover all available pages before exploring further.

# GLM-4-Voice

## 概览

GLM-4-Voice 是智谱推出的首个端到端语音模型。它能够直接理解和生成中英文语音，实现实时语音对话，并可根据用户指令灵活调整语音的情感、语调、语速和方言等特性，使语音交互更加自然生动。

<CardGroup cols={3}>
  <Card title="价格" icon={<svg style={{maskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/coins.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=d140ba7189994790a79f83f5a763f59a)", WebkitMaskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/coins.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=d140ba7189994790a79f83f5a763f59a)", maskRepeat: "no-repeat", maskPosition: "center center",}} className={"h-6 w-6 bg-primary dark:bg-primary-light !m-0 shrink-0"} />}>
    80 元 / 百万 Tokens
  </Card>

  <Card title="输入模态" icon={<svg style={{maskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/arrow-down-right.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=088a58fa0b1a4048d5c6fab7841133c8)", WebkitMaskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/arrow-down-right.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=088a58fa0b1a4048d5c6fab7841133c8)", maskRepeat: "no-repeat", maskPosition: "center center",}} className={"h-6 w-6 bg-primary dark:bg-primary-light !m-0 shrink-0"} />}>
    音频、文本
  </Card>

  <Card title="输出模态" icon={<svg style={{maskImage: "url(https://mintcdn.com/zhipu-ef7018ed/Skp28ct-clfAIOZo/resource/icon/arrow-down-left.svg?fit=max&auto=format&n=Skp28ct-clfAIOZo&q=85&s=1ed65b58aa7a484b387f01be25d99278)", WebkitMaskImage: "url(https://mintcdn.com/zhipu-ef7018ed/Skp28ct-clfAIOZo/resource/icon/arrow-down-left.svg?fit=max&auto=format&n=Skp28ct-clfAIOZo&q=85&s=1ed65b58aa7a484b387f01be25d99278)", maskRepeat: "no-repeat", maskPosition: "center center",}} className={"h-6 w-6 bg-primary dark:bg-primary-light !m-0 shrink-0"} />}>
    音频
  </Card>

  <Card title="上下文窗口" icon={<svg style={{maskImage: "url(https://mintcdn.com/zhipu-ef7018ed/Skp28ct-clfAIOZo/resource/icon/arrow-down-arrow-up.svg?fit=max&auto=format&n=Skp28ct-clfAIOZo&q=85&s=ccc051baa101b9a46d0d9bc5fad04877)", WebkitMaskImage: "url(https://mintcdn.com/zhipu-ef7018ed/Skp28ct-clfAIOZo/resource/icon/arrow-down-arrow-up.svg?fit=max&auto=format&n=Skp28ct-clfAIOZo&q=85&s=ccc051baa101b9a46d0d9bc5fad04877)", maskRepeat: "no-repeat", maskPosition: "center center",}} className={"h-6 w-6 bg-primary dark:bg-primary-light !m-0 shrink-0"} />}>
    8K
  </Card>

  <Card title="最大输出 Tokens" icon={<svg style={{maskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/maximize.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=743c202becf04d91d943f9014a3fe67f)", WebkitMaskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/maximize.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=743c202becf04d91d943f9014a3fe67f)", maskRepeat: "no-repeat", maskPosition: "center center",}} className={"h-6 w-6 bg-primary dark:bg-primary-light !m-0 shrink-0"} />}>
    4K
  </Card>
</CardGroup>

## 推荐场景

<AccordionGroup>
  <Accordion title="角色陪伴" defaultOpen>
    AI 通过虚拟角色（如游戏角色、虚拟偶像）与用户进行情感化对话，虚拟角色可以设定为特定性格、背景和声音，实现全天候陪伴。
  </Accordion>

  <Accordion title="智能导游" defaultOpen>
    AI 导游与用户进行实时语音交互，为用户提供详细的历史背景、文化意义和建筑特点，通过语音描述帮助用户规划游览路线，解答用户关于景点的疑问。
  </Accordion>

  <Accordion title="英语学习" defaultOpen>
    AI 英语老师通过模拟真实场景（如点餐、问路）与用户进行对话练习，解答用户关于语法规则的疑问，实时纠正用户发音、学习日常表达和语法知识，并提供改进建议。
  </Accordion>

  <Accordion title="在线教育" defaultOpen>
    AI 辅导老师与学生通过详细讲解课程内容，为学生提供课程讲解、作业辅导和学习建议，涵盖多个学科（如数学、历史、科学），解答学生在作业中遇到的问题，通过多轮对话帮助学生理解难点。
  </Accordion>
</AccordionGroup>

## 使用资源

[接口文档](/api-reference/%E6%A8%A1%E5%9E%8B-api/%E5%AF%B9%E8%AF%9D%E8%A1%A5%E5%85%A8)：API 调用方式

## 详细介绍

<Steps>
  <Step icon={<svg style={{maskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/star.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=b3c8448dccf8f96abadf9a72e51b3cca)", WebkitMaskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/star.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=b3c8448dccf8f96abadf9a72e51b3cca)", maskRepeat: "no-repeat", maskPosition: "center center",}} className={"h-6 w-6 bg-primary dark:bg-primary-light !m-0 shrink-0"} />}>
    凭借其实时语音对话功能，GLM-4-Voice 为用户提供高效流畅的沟通体验。GLM-4-Voice具备情感表达、方言生成和语速调节的能力，同时支持中英文双语。它的应用场景广泛，覆盖虚拟角色互动、智慧教育、智能旅游、儿童陪伴等多个领域。通过灵活的语音输入和输出能力，GLM-4-Voice 能够为用户提供高效且个性化的服务体验。

    在企业应用方面，GLM-4-Voice 可针对不同垂直行业定制专业的场景解决方案，帮助开发者以较低成本快速适应和融入大模型时代。
  </Step>
</Steps>

# <svg style={{maskImage: "url(https://mintcdn.com/zhipu-ef7018ed/6jZAOYw-eXEZh1pv/resource/icon/code.svg?fit=max&auto=format&n=6jZAOYw-eXEZh1pv&q=85&s=2f67130d1597ee0b68135487ec31662f)", maskRepeat: "no-repeat", maskPosition: "center center",}} className={"h-6 w-6 bg-primary dark:bg-primary-light !m-0 shrink-0"} />   调用示例

<Tabs>
  <Tab title="Python">
    **安装 SDK**

    ```bash theme={null}
    # 安装最新版本
    pip install zai-sdk
    # 或指定版本
    pip install zai-sdk==0.2.3
    ```

    **验证安装**

    ```python theme={null}
    import zai
    print(zai.__version__)
    ```

    **调用示例**

    ```python theme={null}
    import wave
    import base64
    from zai import ZhipuAiClient

    def save_audio_as_wav(audio_data, filepath):
        """保存音频数据为 WAV 文件（模型返回的语音用）"""
        with wave.open(filepath, 'wb') as wav_file:
            wav_file.setnchannels(1)
            wav_file.setsampwidth(2)
            wav_file.setframerate(44100)
            wav_file.writeframes(audio_data)
        print(f"Audio saved to {filepath}")

    def get_base64_from_wav(wav_path):
        """将 WAV 文件转为 Base64 编码字符串"""
        with open(wav_path, "rb") as f:
            audio_bytes = f.read()
        return base64.b64encode(audio_bytes).decode("utf-8")

    client = ZhipuAiClient(api_key="your_api_key")  # 请填写您自己的 APIKey

    input_wav_path = "your_voice.wav"  # 你的 WAV 文件路径
    base64_voice = get_base64_from_wav(input_wav_path)

    response = client.chat.completions.create(
        model="glm-4-voice",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "你好，这是我的语音输入测试，请慢速复述一遍"
                    },
                    {
                        "type": "input_audio",
                        "input_audio": {
                            "data": base64_voice,
                            "format": "wav"
                        }
                    }
                ]
            }
        ],
        stream=False
    )

    print(response.choices[0].message.content)

    # 解析并保存模型返回的语音
    try:
        audio_data = response.choices[0].message.audio['data']
        decoded_data = base64.b64decode(audio_data)
        save_audio_as_wav(decoded_data, "output.wav")
    except Exception as e:
        print("处理音频失败：", e)
    ```
  </Tab>

  <Tab title="Java">
    **安装 SDK**

    **Maven**

    ```xml theme={null}
    <dependency>
        <groupId>ai.z.openapi</groupId>
        <artifactId>zai-sdk</artifactId>
        <version>0.3.5</version>
    </dependency>
    ```

    **Gradle (Groovy)**

    ```groovy theme={null}
    implementation 'ai.z.openapi:zai-sdk:0.3.5'
    ```

    **调用示例**

    ```java theme={null}
      import ai.z.openapi.ZhipuAiClient;
      import ai.z.openapi.service.model.ChatCompletionCreateParams;
      import ai.z.openapi.service.model.ChatCompletionResponse;
      import ai.z.openapi.service.model.ChatMessage;
      import ai.z.openapi.service.model.ChatMessageRole;
      import ai.z.openapi.service.model.InputAudio;
      import ai.z.openapi.service.model.MessageContent;

      import java.io.File;
      import java.io.IOException;
      import java.nio.file.Files;
      import java.util.Arrays;
      import java.util.Base64;
      import java.util.Collections;

      public class GLM4VoiceExample {
          public static void main(String[] args) throws IOException {
              ZhipuAiClient client = ZhipuAiClient.builder().ofZHIPU().apiKey("API_KEY").build();
              File audioFile = new File("your_path.asr.wav");
              byte[] audioBytes = Files.readAllBytes(audioFile.toPath());
              String base64 = Base64.getEncoder().encodeToString(audioBytes);
              ChatCompletionCreateParams request = ChatCompletionCreateParams.builder().model("glm-4-voice")
                  .messages(Collections.singletonList(ChatMessage.builder()
                  .role(ChatMessageRole.USER.value())
                  .content(
                      Arrays.asList(MessageContent.builder().type("text").text("你好，这是我的语音输入测试").build(),
                      MessageContent.builder().type("input_audio").inputAudio(InputAudio.builder()
                          .data(base64).format("wav").build()).build())).build())).build();
              ChatCompletionResponse response = client.chat().createChatCompletion(request);
              if (response.isSuccess()) {
                  Object reply = response.getData().getChoices().get(0).getMessage().getContent();
                  System.out.println(reply);
              } else {
                  System.err.println("错误: " + response.getMsg());
              }
          }
    }
    ```
  </Tab>

  <Tab title="旧版 Python">
    ```python theme={null}
    import zhipuai
    import wave
    import base64

    def get_base64_from_wav(wav_path):
        """将 WAV 文件转为 Base64 编码字符串"""
        with open(wav_path, "rb") as f:
            audio_bytes = f.read()
        return base64.b64encode(audio_bytes).decode("utf-8")

    zhipuai.api_key = "your_api_key"  # 请填写您自己的 APIKey

    input_wav_path = "your_voice.wav"
    base64_voice = get_base64_from_wav(input_wav_path)

    response = zhipuai.model_api.invoke(
        model="glm-4-voice",
        prompt="你好，这是我的语音输入测试",
        audio_data=base64_voice,
        audio_format="wav"
    )

    print(response)
    ```
  </Tab>

  <Tab title="输出示例">
    ```json theme={null}
    {
        "id": "20250605132035222ead927d794645",
        "object": "chat.completion",
        "created": 1749187238,
        "model": "glm-4-voice",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "你好！我听到了你的语音输入。有什么我可以帮助你的吗？",
                    "audio": {
                        "data": "707hTvovBW8zH3FPxH/1sCvgTXB/kJPQtJCqMIEgcCBUcDRQBZ...",
                        "expires_at": 1749187238,
                        "id": "f8d4bf4b-a376-48e6-8c81-54bb6a9a31d0"
                    }
                },
                "finish_reason": "stop"
            }
        ],
        "usage": {
            "prompt_tokens": 107,
            "completion_tokens": 340,
            "total_tokens": 447
        },
        "request_id": "20250605132035222ead927d794645"
    }
    ```
  </Tab>
</Tabs>

## 用户并发权益

API 调用会受到速率限制，当前我们限制的维度是请求并发数量（在途请求任务数量）。不同等级的用户并发保障如下。

| V0 | V1 | V2 | V3 |
| :- | :- | :- | :- |
| 5  | 10 | 15 | 20 |
