You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

153 lines
6.1 KiB
Markdown

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

---
title: Anthropic Claude 系列 Tools Calling 评测
description: >-
使用 LobeChat 测试 Anthropic Claude 系列模型Claude 3.5 sonnet / Claude 3 Opus /
Claude 3 haiku 的工具调用Function Calling能力并展现评测结果
tags:
- Tools Calling
- Benchmark
- Function Calling 评测
- 工具调用
- 插件
---
# Anthropic Claude Series Tools Calling
Overview of Anthropic Claude Series model Tools Calling capabilities:
| Model | Support Tools Calling | Stream | Parallel | Simple Instruction Score | Complex Instruction |
| ----------------- | --------------------- | ------ | -------- | ------------------------ | ------------------- |
| Claude 3.5 Sonnet | ✅ | ✅ | ✅ | 🌟🌟🌟 | 🌟🌟 |
| Claude 3 Opus | ✅ | ✅ | ❌ | 🌟 | ⛔️ |
| Claude 3 Sonnet | ✅ | ✅ | ❌ | 🌟🌟 | ⛔️ |
| Claude 3 Haiku | ✅ | ✅ | ❌ | 🌟🌟 | ⛔️ |
## Claude 3.5 Sonnet
### Simple Instruction Call: Weather Query
Test Instruction: Instruction ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/42a6980c-ea2a-44fd-b61f-a7989827f5a5" />
<Image alt="Claude 3.5 Sonnet Tools Calling for Simple Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/71146b75-2c73-48c3-9688-1d8814d2a791" />
<details>
<summary>Tools Calling Raw Output:</summary>
```yml
```
</details>
### Complex Instruction Call: Literary Map
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/a9a40899-d5f3-4ef2-aa08-922751b05ca6" />
From the above video:
1. Sonnet 3.5 supports Stream Tools Calling and Parallel Tools Calling;
2. In Stream Tools Calling, it is observed that creating long sentences will cause a delay (as seen in the Tools Calling raw output `[chunk 40]` and `[chunk 41]` with a delay of 6s). Therefore, there will be a relatively long waiting time at the beginning stage of Tools Calling.
<Image alt="Claude 3.5 Sonnet Tools Calling for Complex Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/23e2d7e5-a6f3-4f4c-9c6a-5651f35a5910" />
<details>
<summary>Tools Calling Raw Output:</summary>
```yml
```
</details>
## Claude 3 Opus
### Simple Instruction Call: Weather Query
Test Instruction: Instruction ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/0e120fa2-8410-4552-a947-5ab7a91d994d" />
From the above video:
1. Claude 3 Opus outputs a `<thinking>` tag at the beginning of Tools Calling, which is not very helpful for users and consumes more tokens;
2. Opus triggers Tools Calling twice, indicating that it does not support Parallel Tools Calling;
3. The raw output of Tools Calling shows that Opus also supports Stream Tools Calling.
<Image alt="Claude 3 Opus Tools Calling for Simple Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/fa2f89bc-b9d5-43e3-a15e-1e79174d002c" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>
### Complex Instruction Call: Literary Map
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/b2dc8cd9-2582-43fe-9121-29c20a1cdc7b" />
From the above video:
1. Combining with simple tasks, Opus will always output a `<thinking>` tag, which significantly impacts the user experience;
2. Opus outputs the prompts field as a string instead of an array, causing an error and preventing the plugin from being called correctly.
<Image alt="Claude 3 Opus Tools Calling for Complex Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/1eee785d-932f-4320-845e-eed0bee4b1ae" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>
## Claude 3 Sonnet
### Simple Instruction Call: Weather Query
Test Instruction: Instruction ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/600becd5-7f12-4a9a-86c7-e5cca0db6b1b" />
From the above video, it can be seen that Claude 3 Sonnet triggers Tools Calling twice, indicating that it does not support Parallel Tools Calling.
<Image alt="Claude 3 Sonnet Tools Calling for Simple Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/e82f5c69-7607-488f-8c10-0482fb380c6c" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>
### Complex Instruction Call: Literary Map
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/c150aa5f-36bc-40f2-a779-9c4fdcf2cd4c" />
From the above video, it can be seen that Sonnet 3 fails in the complex instruction call. The error is due to prompts being expected as an array but generated as a string.
<Image alt="Claude 3.5 Sonnet Tools Calling for Complex Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/b7d84e26-920d-4a82-8798-1b1060ebb341" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>
## Claude 3 Haiku
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/02b3e872-735a-4928-8245-a90786acea8b" />
From the above video:
1. Claude 3 Haiku triggers Tools Calling twice, indicating that it also does not support Parallel Tools Calling;
2. Haiku does not provide a good response and directly calls the tool;
<Image alt="Claude 3 Haiku Tools Calling for Simple Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/9081b586-cf43-440f-8ef8-1de5d8658694" />
### Complex Instruction Call: Literary Map
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/d1e3f804-0b89-4b90-9d78-69aee0db1c4d" />
From the above video, it can be seen that Haiku 3 also fails in the complex instruction call. The error is the same as prompts generating a string instead of an array.
<Image alt="Claude 3 Haiku Tools Calling for Complex Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/cde80220-4615-43bb-934f-35fe0de88754" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>