---
title: LobeChat Supports Vision Recognition
description: >-
  Discover how LobeChat integrates visual recognition capabilities like OpenAI's
  gpt-4-vision and Google Gemini Pro Vision for intelligent conversations based
  on uploaded images.
tags:
  - LobeChat
  - Model Vision Recognition
  - Multimodal Interaction
  - Visual Elements
  - Intelligent Conversations
---

# Model Vision Recognition

<Image
  alt={'Model Vision Recognition'}
  borderless
  cover
  src={
    'https://github.com/user-attachments/assets/18574a1f-46c2-4cbc-af2c-35a86e128a07'
  }
/>

LobeChat now supports large language models with visual recognition capabilities, such as OpenAI's [`gpt-4-vision`](https://platform.openai.com/docs/guides/vision), Google Gemini Pro Vision, and Zhipu GLM-4 Vision, giving LobeChat multimodal interaction capabilities. Users can upload or drag and drop images into the chat box, and the assistant will recognize the content of the images and engage in intelligent conversation about them, enabling richer and more versatile chat scenarios.
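
Under the hood, these vision-capable models accept chat messages whose content mixes text and image parts. As a rough illustration (not LobeChat's internal code), here is a minimal sketch of such a request using OpenAI's official Node SDK; the model id, prompt, and image URL below are placeholder assumptions:

```ts
import OpenAI from 'openai';

// Assumes OPENAI_API_KEY is set in the environment.
const client = new OpenAI();

async function describeImage(imageUrl: string) {
  // A vision request is an ordinary chat completion whose user message
  // carries an array of content parts: text plus one or more image URLs.
  const response = await client.chat.completions.create({
    model: 'gpt-4-vision-preview', // placeholder vision-capable model id
    max_tokens: 300,
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'What is shown in this image?' },
          { type: 'image_url', image_url: { url: imageUrl } },
        ],
      },
    ],
  });

  return response.choices[0].message.content;
}

// Example usage with a placeholder URL:
describeImage('https://example.com/photo.jpg').then(console.log);
```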

This feature opens up new ways of interacting: communication is no longer limited to text and can encompass rich visual elements. Whether sharing images in everyday use or interpreting them in industry-specific contexts, the assistant provides an excellent conversational experience.