You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

38 lines
1.9 KiB
Markdown

---
title: Enhancing Multimodal Interaction with Visual Recognition Models
description: >-
Explore how LobeChat integrates visual recognition capabilities into large
language models, enabling multimodal interactions for enhanced user
experiences.
tags:
- Visual Recognition
- Multimodal Interaction
- Large Language Models
- LobeChat
- Custom Model Configuration
---
# Visual Model User Guide
The ecosystem of large language models that support visual recognition is becoming increasingly rich. Starting from `gpt-4-vision`, LobeChat now supports various large language models with visual recognition capabilities, enabling LobeChat to have multimodal interaction capabilities.
<Video alt={'Visual Model Usage'} src={'https://github.com/user-attachments/assets/1c6b4975-bfc3-4470-a934-558ff7a16941'} />
## Image Input
If the model you are currently using supports visual recognition, you can input image content by uploading a file or dragging the image directly into the input box. The model will automatically recognize the image content and provide feedback based on your prompts.
<Image alt={'Image Input'} src={'https://github.com/user-attachments/assets/e6836560-8b05-4382-b761-d7624da4b0f1'} />
## Visual Models
In the model list, models with a `👁️` icon next to their names indicate that the model supports visual recognition. Selecting such a model allows you to send image content.
<Image alt={'Visual Models'} src={'https://github.com/user-attachments/assets/fa07a326-04c8-4744-bb93-cef715d1d71f'} />
## Custom Model Configuration
If you need to add a custom model that is not currently in the list and explicitly supports visual recognition, you can enable the `Visual Recognition` feature in the `Custom Model Configuration` to allow the model to interact with images.
<Image alt={'Custom Model Configuration'} src={'https://github.com/user-attachments/assets/c24718cc-402b-4298-b046-8b4aee610cbc'} />