---
title: Enhancing Multimodal Interaction with Visual Recognition Models
description: >-
  Explore how LobeChat integrates visual recognition capabilities into large
  language models, enabling multimodal interactions for enhanced user
  experiences.
tags:
  - Visual Recognition
  - Multimodal Interaction
  - Large Language Models
  - LobeChat
  - Custom Model Configuration
---

# Visual Model User Guide

A growing number of large language models support visual recognition. Starting with `gpt-4-vision`, LobeChat supports a range of these models, which gives it multimodal interaction capabilities.

<Video alt={'Visual Model Usage'} src={'https://github.com/user-attachments/assets/1c6b4975-bfc3-4470-a934-558ff7a16941'} />

## Image Input

If the model you are currently using supports visual recognition, you can add an image by uploading a file or by dragging it directly into the input box. The model recognizes the image content and responds based on your prompt.

<Image alt={'Image Input'} src={'https://github.com/user-attachments/assets/e6836560-8b05-4382-b761-d7624da4b0f1'} />
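
For readers curious about what a prompt with an attached image might look like on the wire, here is a minimal TypeScript sketch. It assumes the OpenAI-compatible chat format, in which a user message carries a list of text and `image_url` parts. The type names and the `buildVisionMessage` helper are hypothetical and are not taken from LobeChat's source; how LobeChat actually encodes images may differ by provider.

```typescript
// Hypothetical types modelled on the OpenAI-compatible chat message format.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image_url'; image_url: { url: string } };
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> };

// Combine a text prompt with zero or more images into one user message.
// Each entry in `imageUrls` is assumed to be an https URL or a base64 data URL.
function buildVisionMessage(prompt: string, imageUrls: string[]): UserMessage {
  const images = imageUrls.map(
    (url): ImagePart => ({ type: 'image_url', image_url: { url } }),
  );
  const text: TextPart = { type: 'text', text: prompt };
  return { role: 'user', content: [text, ...images] };
}
```

Calling `buildVisionMessage('What is in this picture?', ['https://example.com/cat.png'])` would yield a message whose first part is the prompt and whose second part references the image.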

## Visual Models

In the model list, a `👁️` icon next to a model's name indicates that the model supports visual recognition. Selecting such a model lets you send image content.

<Image alt={'Visual Models'} src={'https://github.com/user-attachments/assets/fa07a326-04c8-4744-bb93-cef715d1d71f'} />

## Custom Model Configuration

If you need to add a custom model that is not in the list and that explicitly supports visual recognition, enable the `Visual Recognition` feature in `Custom Model Configuration` so that the model can work with images.

<Image alt={'Custom Model Configuration'} src={'https://github.com/user-attachments/assets/c24718cc-402b-4298-b046-8b4aee610cbc'} />
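
Conceptually, the `Visual Recognition` toggle acts as a per-model capability flag. The TypeScript sketch below illustrates that idea only. The `CustomModel` shape and both helper functions are hypothetical, and LobeChat's real configuration schema may differ.

```typescript
// Hypothetical model entry; not LobeChat's actual schema.
interface CustomModel {
  id: string;
  displayName?: string;
  // Assumed to mirror the `Visual Recognition` toggle in the UI.
  vision: boolean;
}

// Label for a model list entry: vision-capable models get the eye icon.
function modelLabel(model: CustomModel): string {
  const name = model.displayName ?? model.id;
  return model.vision ? `${name} 👁️` : name;
}

// Reject image attachments for models that lack the vision flag.
function checkAttachments(model: CustomModel, imageCount: number): void {
  if (imageCount > 0 && !model.vision) {
    throw new Error(`Model "${model.id}" is not marked as supporting visual recognition`);
  }
}
```

With this shape, a custom entry such as `{ id: 'my-vlm', vision: true }` would pass `checkAttachments` with any number of images, while the same entry with `vision: false` would be rejected as soon as one image is attached.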