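# GLM-4V-Plus-0111

## Overview

GLM-4V-Plus-0111 is Zhipu's new-generation visual understanding model. It combines visual summarization, visual modification, reasoning, multi-turn dialogue, and temporal question answering, and supports a wide range of image and video understanding tasks, including visual question answering, image captioning, visual grounding, and complex object detection.

- Price: 4 yuan (CNY) per million tokens
- Input modalities: video, image, text
- Output modality: text
- Context window: 16K
- Maximum output tokens: calculated dynamically as context minus input

## Capabilities

- Strong visual understanding, with support for images and video
- Real-time streaming responses for a better interactive experience

## Recommended Scenarios

- Parse the visual elements of advertising videos (such as how long a brand logo is on screen and whether scene composition is sound), automatically classify content style and label sentiment, and use video event analysis to assess the match with the target audience, giving ad planners data-driven optimization suggestions.
- Analyze teaching images and videos, automatically summarize key knowledge points, answer student questions through visual question answering, and help teachers quickly produce high-quality illustrated courseware.
- Run complex object detection on product images from a production line, quickly locate defects, and generate quality inspection reports through visual summarization for efficient automated quality control.
- Capture video streams and image libraries in real time, accurately classify content that violates media platform rules and automatically cut out the problematic segments, and generate structured review reports through video tagging and event analysis, helping platforms complete compliance screening efficiently.

## Resources

- Quickly test how the model performs in your business scenarios
- How to call the API

## Details

GLM-4V-Plus-0111 deeply fuses visual and language features without sacrificing performance on any NLP task. Native-resolution input, together with a data flywheel that continuously optimizes against hallucination, gives it a lower hallucination rate and more complete descriptions of visual content.

Beyond detailed description, GLM-4V-Plus-0111 can classify, title, and tag image and video data. With custom prompts, users can process image and video data more efficiently or build automated visual data pipelines.

Unlike images, video has an additional time dimension. Most other visual understanding models receive no timestamp information as input, so they lack temporal awareness and cannot answer time-related questions. The temporal question answering of GLM-4V-Plus-0111 helps locate the moment a specific event occurs, which enables semantic segmentation of video and automated video editing.

When image resolution or video frame rate is too low, even humans cannot perceive small changes in a video. GLM-4V-Plus-0111 offers finer-grained action understanding and resolves visual details precisely.

GLM-4V-Plus-0111 has strong multimodal understanding and can process 5 images at once. Using variable-resolution techniques, it adapts to a wider range of video lengths and resolutions and supports understanding of videos up to 2 hours long, which raises the ceiling for video processing and broadens the business scenarios open to visual understanding models.

## Usage Examples

**Install the SDK (Python)**

```bash
# Install the latest version
pip install zai-sdk

# Or pin a specific version
pip install zai-sdk==0.2.3
```

**Verify the installation**

```python
import zai

print(zai.__version__)
```

### Upload an Image by URL

```python
from zai import ZhipuAiClient

client = ZhipuAiClient(api_key="your-api-key")  # Replace with your own API key

response = client.chat.completions.create(
    model="glm-4v-plus-0111",  # Name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://img.iplaysoft.com/wp-content/uploads/2019/free-images/free_stock_photo.jpg"
                    },
                },
            ],
        }
    ],
    temperature=0.5,
    max_tokens=2000,
)
print(response.choices[0].message.content)
```

### Upload an Image as Base64

```python
import base64

from zai import ZhipuAiClient

img_path = "/Users/YourComputer/xxxx.jpeg"

# Read the local image and encode it as a Base64 string
with open(img_path, "rb") as img_file:
    img_base = base64.b64encode(img_file.read()).decode("utf-8")

client = ZhipuAiClient(api_key="YOUR API KEY")  # Replace with your own API key

response = client.chat.completions.create(
    model="glm-4v-plus-0111",  # Name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": img_base}},
                {"type": "text", "text": "Please describe this image"},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

### Multi-Turn Image Conversation

```python
from zai import ZhipuAiClient

client = ZhipuAiClient(api_key="your-api-key")  # Replace with your own API key

response = client.chat.completions.create(
    model="glm-4v-plus-0111",  # Name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://cdn.bigmodel.cn/xxxxxx.png"}},
                {"type": "text", "text": "What is in this image?"},
            ],
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "This is a painting of a natural landscape showing a tranquil lake whose water takes on varying shades of blue-green. "
                        "A blossoming cherry tree grows by the lake, its pink-and-white flowers especially vivid in the sunlight. "
                        "Under the tree, a person dressed in blue sits in a boat, seemingly enjoying the moment.\n\n"
                        "In the background are rolling mountains covered in green vegetation, with peaks wrapped in cloud and mist that add a sense of mystery. "
                        "A small pavilion is also visible in the distance, sitting on the rocks by the lake in harmony with its natural surroundings.\n\n"
                        "The whole picture conveys tranquility and harmony, as if one were in a secluded paradise."
                    ),
                }
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://cdn.bigmodel.cn/xxxxxxx.jpeg"}},
                {"type": "text", "text": "How does this image differ from the one above?"},
            ],
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "The first and second images differ greatly in content, composition, and color, forming a sharp contrast.\n\n"
                        "1. The first image is a landscape painting that mainly depicts natural scenery such as a cherry tree, a lake, and mountains. "
                        "It is rich in color and full of spring atmosphere, whereas the second image is a photograph of the Great Wall in winter, "
                        "winding through snow-covered mountains. Its overall tone is cool, giving a cold, quiet impression.\n\n"
                        "2. In composition, the first image is symmetrical, centered on the cherry tree, with the scenery on both sides echoing each other "
                        "for a balanced, harmonious effect. The second image uses a diagonal composition, with the Great Wall as a leading line "
                        "that guides the viewer's eye across the frame and adds depth and dynamism.\n\n"
                        "Overall, the two images contrast clearly in content, color, composition, and mood: one is a lively, colorful spring landscape painting, "
                        "and the other is a quiet, mysterious winter photograph of the Great Wall."
                    ),
                }
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://cdn.bigmodel.cn/xxxxxxx.jpeg"}},
                {"type": "text", "text": "How does this image differ from the previous one?"},
            ],
        },
    ],
)
print(response.choices[0].message.content)
```

**Install the SDK (Java)**

**Maven**

```xml
<dependency>
    <groupId>ai.z.openapi</groupId>
    <artifactId>zai-sdk</artifactId>
    <version>0.3.5</version>
</dependency>
```

**Gradle (Groovy)**

```groovy
implementation 'ai.z.openapi:zai-sdk:0.3.5'
```

**Usage example**

```java
import ai.z.openapi.ZhipuAiClient;
import ai.z.openapi.service.model.*;
import ai.z.openapi.core.Constants;
import java.util.Arrays;

public class GLM4VPlusExample {
    public static void main(String[] args) {
        String apiKey = ""; // Replace with your own API key
        ZhipuAiClient client = ZhipuAiClient.builder().ofZHIPU()
            .apiKey(apiKey)
            .build();

        // Build a single user message containing a text part and an image part
        ChatCompletionCreateParams request = ChatCompletionCreateParams.builder()
            .model("glm-4v-plus-0111")
            .messages(Arrays.asList(
                ChatMessage.builder()
                    .role(ChatMessageRole.USER.value())
                    .content(Arrays.asList(
                        MessageContent.builder()
                            .type("text")
                            .text("What is in this image?")
                            .build(),
                        MessageContent.builder()
                            .type("image_url")
                            .imageUrl(ImageUrl.builder()
                                .url("https://aigc-files.bigmodel.cn/api/cogview/20250723213827da171a419b9b4906_0.png")
                                .build())
                            .build()))
                    .build()
            ))
            .build();

        ChatCompletionResponse response = client.chat().createChatCompletion(request);

        if (response.isSuccess()) {
            Object reply = response.getData().getChoices().get(0).getMessage().getContent();
            System.out.println(reply);
        } else {
            System.err.println("Error: " + response.getMsg());
        }
    }
}
```

### Upload a Video by URL

```python
from zai import ZhipuAiClient

client = ZhipuAiClient(api_key="your-api-key")  # Replace with your own API key

response = client.chat.completions.create(
    model="glm-4v-plus-0111",  # Name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "video_url",
                    "video_url": {
                        "url": "https://sfile.chatglm.cn/testpath/video/b844f8f1-5df9-556c-a515-3d3bfaa736e8_0.mp4"
                    },
                },
                {"type": "text", "text": "Please describe this video in detail"},
            ],
        }
    ],
)
print(response.choices[0].message)
```

### Upload a Video as Base64

```python
import base64

from zai import ZhipuAiClient

video_path = "/Users/YourComputer/xxxx.mp4"

# Read the local video and encode it as a Base64 string
with open(video_path, "rb") as video_file:
    video_base = base64.b64encode(video_file.read()).decode("utf-8")

client = ZhipuAiClient(api_key="YOUR API KEY")  # Replace with your own API key

response = client.chat.completions.create(
    model="glm-4v-plus-0111",  # Name of the model to call
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "video_url", "video_url": {"url": video_base}},
                {"type": "text", "text": "Please describe this video carefully"},
            ],
        }
    ],
)
print(response.choices[0].message)
```

## User Concurrency Entitlements

API calls are rate limited. The limit currently applies to request concurrency, that is, the number of in-flight requests. The concurrency guaranteed for each user tier is as follows.

| V0 | V1 | V2 | V3 |
| :- | :- | :- | :- |
| 5  | 15 | 30 | 40 |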
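
Because the limit is on in-flight requests rather than requests per second, one way to stay within it on the client side is to cap how many calls run at the same time. The following is a minimal sketch of that idea, not part of the SDK: `run_with_concurrency_cap` and `MAX_IN_FLIGHT` are hypothetical names introduced here, and `call` stands for any function of your own that sends one request (for example, a wrapper around `client.chat.completions.create`).

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: 5 is the V0 value from the concurrency table; use your own tier's value.
MAX_IN_FLIGHT = 5


def run_with_concurrency_cap(call, items, max_in_flight=MAX_IN_FLIGHT):
    """Apply call(item) to every item with at most max_in_flight calls running at once.

    `call` is a user-supplied function that performs one request (hypothetical here).
    Results are returned in the same order as `items`.
    """
    # The worker count of the pool is the concurrency cap: no more than
    # max_in_flight calls can be executing at any moment.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Executor.map yields results in input order, regardless of finish order.
        return list(pool.map(call, items))
```

A typical use would be to pass a small function that builds the `messages` payload for one image URL and returns `response.choices[0].message.content`, together with the list of URLs to process. If any call raises, the exception surfaces when its result is collected, so retry or error handling belongs inside `call`.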