You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

115 lines
4.3 KiB
Markdown

This file contains ambiguous Unicode characters!

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

---
title: OpenAI GPT 系列 Tools Calling 评测
description: >-
使用 LobeChat 测试 OpenAI GPT 系列模型GPT 3.5-turbo / GPT-4 /GPT-4o 的工具调用Function
Calling能力并展现评测结果
tags:
- Tools Calling
- Benchmark
- Function Calling
- 工具调用
- 插件
---
# OpenAI GPT Series Tool Calling
Overview of the Tool Calling capabilities of OpenAI GPT series models:
| Model | Tool Calling Support | Streaming | Parallel | Simple Instruction Score | Complex Instruction Score |
| ------------- | -------------------- | --------- | -------- | ------------------------ | ------------------------- |
| GPT-3.5-turbo | ✅ | ✅ | ✅ | 🌟🌟🌟 | 🌟 |
| GPT-4-turbo | ✅ | ✅ | ✅ | 🌟🌟 | 🌟🌟 |
| GPT-4o | ✅ | ✅ | ✅ | 🌟🌟🌟 | 🌟🌟 |
<Callout type={'info'}>
For testing instructions, see [Tools Calling - Evaluation Task
Introduction](/docs/usage/tools-calling#evaluation-task-introduction)
</Callout>
## GPT 3.5-turbo
### Simple Instruction Call: Weather Inquiry
Test Instruction: Instruction ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/65901ee2-78b8-4f56-9e0d-6407c484f434" />
<Image alt="Tool Calling for Simple Instruction in GPT 3.5 Turbo" src="https://github.com/lobehub/lobe-chat/assets/28616219/1251dfc0-d1c4-4c3d-825e-dd6205793d53" />
<details>
<summary>Streaming Tool Calling Raw Output:</summary>
</details>
### Complex Instruction Call: Wenshengtu
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/2047665f-ab22-4da7-a390-0fb4ec5a2a14" />
<Image alt="Tool Calling for Complex Instruction in GPT 3.5 Turbo" src="https://github.com/lobehub/lobe-chat/assets/28616219/125ad028-a621-4433-b5fa-321f8fd76302" />
<details>
<summary>Streaming Tool Calling Raw Output:</summary>
</details>
## GPT-4 Turbo
### Simple Instruction Call: Weather Inquiry
Test Instruction: Instruction ①
Unlike GPT-3.5 Turbo, GPT-4 Turbo did not respond with "okay" when calling Tool Calling, and after multiple tests, it remained the same. Therefore, in this follow-up of a compound instruction, it is not as good as GPT-3.5 Turbo, but the remaining two capabilities are still good.
Of course, it is also possible that GPT-4 Turbo's model has more "autonomy" and believes that it does not need to output this "okay."
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/f865d91b-b84a-4258-ae09-9d1e15eeb43d" />
<Image alt="Tool Calling for Simple Instruction in GPT-4 Turbo" src="https://github.com/lobehub/lobe-chat/assets/28616219/19298693-7a9b-4b54-9e28-c46b541b4f41" />
<details>
<summary>Streaming Tool Calling Raw Output:</summary>
</details>
### Complex Instruction Call: Wenshengtu
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/69989faf-9b98-41ec-ba51-40cc3545d8d1" />
<Image alt="Tool Calling for Complex Instruction in GPT-4 Turbo" src="https://github.com/lobehub/lobe-chat/assets/28616219/8329c1b2-5e36-4457-946c-ce3781b05afd" />
<details>
<summary>Streaming Tool Calling Raw Output:</summary>
</details>
## GPT-4o
### Simple Instruction Call: Weather Inquiry
Test Instruction: Instruction ①
Similar to GPT-3.5, GPT-4o performs very well in following compound instructions in simple instruction calls.
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/c77b65ab-0854-4e1f-a25b-ff43275bd318" />
<Image alt="Tool Calling for Simple Instruction in GPT-4o" src="https://github.com/lobehub/lobe-chat/assets/28616219/e5d6214f-f628-4064-a330-cbd7c5d474ac" />
<details>
<summary>Streaming Tool Calling Raw Output:</summary>
</details>
### Complex Instruction Call: Wenshengtu
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/714bd86a-3b58-4941-8323-186c3fa4c6ea" />
<Image alt="Tool Calling for Complex Instruction in GPT-4o" src="https://github.com/lobehub/lobe-chat/assets/28616219/8329c1b2-5e36-4457-946c-ce3781b05afd" />
<details>
<summary>Streaming Tool Calling Raw Output:</summary>
```yml
```
</details>