You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

110 lines
7.9 KiB
Markdown

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

---
title: Google Gemini 系列 Tools Calling 评测
description: >-
使用 LobeChat 测试 Google Gemini 系列模型Gemini 1.5 Pro / Gemini 1.5
Flash的工具调用Function Calling能力并展现评测结果
tags:
- Tools Calling
- Benchmark
- Function Calling 评测
- 工具调用
- 插件
---
# Google Gemini 系列 Tools Calling
Google Gemini 系列模型 Tools Calling 能力一览:
| 模型 | 支持 Tools Calling | 流式 Stream | 并发Parallel | 简单指令得分 | 复杂指令 |
| ---------------- | ---------------- | ----------- | ------------ | ------ | ---- |
| Gemini 1.5 Pro | ✅ | ❌ | ✅ | ⛔ | ⛔ |
| Gemini 1.5 Flash | ❌ | ❌ | ❌ | ⛔ | ⛔ |
<Callout type={'important'}>
根据我们的的实际测试,强烈建议不要给 Gemini 开启插件,因为目前(截止 2024.07.07)它的 Tools Calling
能力实在太烂了。
</Callout>
## Gemini 1.5 Pro
### 简单调用指令:天气查询
测试指令:指令 ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/a5a35431-2a15-4e79-97d5-502637f829bc" />
Gemini 输出的 json 中name 是错误的,因此 LobeChat 无法识别到它调用了什么插件。(入参中,天气插件的 name 为 `realtime-weather____fetchCurrentWeather`,而 Gemini 返回的是 `weather____fetchCurrentWeather`)。
<Image alt="Gemini 1.5 Pro 简单指令的 Tools Calling" src="https://github.com/lobehub/lobe-chat/assets/28616219/1e077799-c25e-43c7-8492-c5c0bb9aed9b" />
<details>
<summary>Tools Calling 原始输出:</summary>
```yml
[stream start] 2024-7-7 17:53:25.647
[chunk 0] 2024-7-7 17:53:25.654
{"candidates":[{"content":{"parts":[{"text":"好的"}],"role":"model"},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":1,"totalTokenCount":96}}
[chunk 1] 2024-7-7 17:53:26.288
{"candidates":[{"content":{"parts":[{"text":"\n\n"}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":1,"totalTokenCount":96}}
[chunk 2] 2024-7-7 17:53:26.336
{"candidates":[{"content":{"parts":[{"functionCall":{"name":"weather____fetchCurrentWeather","args":{"city":"杭州"}}},{"functionCall":{"name":"weather____fetchCurrentWeather","args":{"city":"北京"}}}],"role":"model"},"finishReasoSTOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":79,"totalTokenCount":174}}
[stream finished] total chunks: 3
```
</details>
### 复杂调用指令:文生图
测试指令:指令 ②
<Image alt="Gemini 1.5 Pro 复杂指令的 Tools Calling" src="https://github.com/lobehub/lobe-chat/assets/28616219/a2454a60-3271-4786-861f-d49ceac1316e" />
在测试复杂指令集时Google 直接抛错:
```json
{
"message": "[400 Bad Request] Invalid JSON payload received. Unknown name \"maxItems\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"minItems\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[1].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[3].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[4].value': Cannot find field. [{\"@type\":\"type.googleapis.com/google.rpc.BadRequest\",\"fieldViolations\":[{\"field\":\"tools[0].function_declarations[0].parameters.properties[0].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"maxItems\\\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[0].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"minItems\\\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[1].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[1].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[3].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[3].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[4].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[4].value': Cannot find field.\"}]}]"
}
```
上述抛错中提到并不支持包含 `maxItems` 的 schema因此 Gemini 1.5 Pro 相当于无法使用 DallE 插件。
相关 issue:
- [Support for minItems and maxItems for FunctionDeclarationSchemaType.ARRAY?](https://github.com/google-gemini/generative-ai-js/issues/200)
- [Gemini Models unusable when dalle plugin is enabled](https://github.com/lobehub/lobe-chat/issues/2537)
综合以上两个测试来看Google 的 Tool Calling 能力似乎是支持了,但是几乎没法在日常中使用,我个人认为已经等于虚假宣传了。
## Gemini 1.5 Flash
### 简单调用指令:天气查询
测试指令:指令 ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/6cab77e8-d761-4a91-8325-a61748cebac1" />
而 Gemini 1.5 flash 更为抽象说完调用就结束了。结合以下原始输出可以看到Gemini 1.5 Flash 并没有输出 Tool Calling 的数据,因此可以说是完全不可用。
```yml
stream start] 2024-7-7 19:4:50.936
[chunk 0] 2024-7-7 19:4:50.943
{"candidates":[{"content":{"parts":[{"text":"好的"}],"role":"model"},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":1,"totalTokenCount":97}}
[chunk 1] 2024-7-7 19:4:52.209
{"candidates":[{"content":{"parts":[{"text":",请稍等,我正在查询杭州和北京的天气信息。 "}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"ATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":16,"totalTokenCount":112}}
[chunk 2] 2024-7-7 19:4:53.288
{"candidates":[{"content":{"parts":[{"text":"\n"}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":16,"totalTokenCount":112}}
[stream finished] total chunks: 3
```
### 复杂调用指令:文生图
测试指令:指令 ②
该指令和 Gemini 1.5 Pro 的复杂指令一样,直接抛错,因此不再详细展开。