# Knowledge Base / File Upload

LobeChat supports file upload and knowledge base management. These features depend on the core components described below; understanding them will help you deploy and maintain the knowledge base system.

## Core Components

### 1. PostgreSQL and PGVector

PostgreSQL is an open-source relational database system, and PGVector is its extension for vector operations.

- **Purpose**: Store structured data and vector indexes
- **Deployment Tip**: Use the official Docker image for quick deployment

Deployment script example:

```bash
# Start PostgreSQL 16 with pgvector in the background, exposed on host port 5432
docker run -p 5432:5432 -d --name pg -e POSTGRES_PASSWORD=mysecretpassword pgvector/pgvector:pg16
```

- **Note**: Allocate sufficient CPU and memory to the database for vector operations
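
Once the container from the example above is running, you may want to confirm that the extension is available. The following is a minimal sketch, not an official LobeChat script: the helper names `pg_url` and `enable_pgvector` are hypothetical, and the user and database name (`postgres`) are assumed to be the image defaults.

```bash
#!/usr/bin/env bash
# Values mirror the docker run example; PG_USER and PG_DB are assumed defaults.
PG_USER="postgres"
PG_PASSWORD="mysecretpassword"
PG_HOST="localhost"
PG_PORT="5432"
PG_DB="postgres"

# Build a PostgreSQL connection URI from the parts above.
# Note: passwords containing special characters would need percent-encoding.
pg_url() {
  printf 'postgres://%s:%s@%s:%s/%s\n' \
    "$PG_USER" "$PG_PASSWORD" "$PG_HOST" "$PG_PORT" "$PG_DB"
}

# Enable the pgvector extension (named "vector") and print its installed version.
# Requires the "pg" container to be running.
enable_pgvector() {
  docker exec pg psql -U "$PG_USER" -d "$PG_DB" \
    -c "CREATE EXTENSION IF NOT EXISTS vector;" \
    -c "SELECT extversion FROM pg_extension WHERE extname = 'vector';"
}
```

Run `enable_pgvector` after the container starts; `pg_url` prints a connection string you can adapt for your application's database configuration.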

### 2. S3-compatible Object Storage

S3 (or S3-compatible storage services) is used for storing uploaded files.

- **Purpose**: Store raw files
- **Options**: AWS S3, MinIO, or other S3-compatible services
- **Note**: Configure appropriate access permissions and security policies
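
For local testing, MinIO can be started with Docker in much the same way as the database. This is a sketch under stated assumptions: the container name, credentials, and helper names (`check_minio_credentials`, `start_minio`) are placeholders, and no volume is mounted, so stored files are lost when the container is removed.

```bash
#!/usr/bin/env bash
# Placeholder credentials; replace them before any real use.
MINIO_ROOT_USER="minioadmin"
MINIO_ROOT_PASSWORD="change-me-please"

# MinIO rejects a root user shorter than 3 characters
# or a root password shorter than 8 characters.
check_minio_credentials() {
  [ "${#MINIO_ROOT_USER}" -ge 3 ] && [ "${#MINIO_ROOT_PASSWORD}" -ge 8 ]
}

# Start MinIO with the S3 API on port 9000 and the web console on port 9001.
start_minio() {
  check_minio_credentials || { echo "MinIO credentials are too short" >&2; return 1; }
  docker run -d --name minio \
    -p 9000:9000 -p 9001:9001 \
    -e MINIO_ROOT_USER="$MINIO_ROOT_USER" \
    -e MINIO_ROOT_PASSWORD="$MINIO_ROOT_PASSWORD" \
    minio/minio server /data --console-address ":9001"
}
```

Call `start_minio`, then create a bucket and a restricted access key in the console rather than using the root credentials from your application.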

### 3. OpenAI Embedding

OpenAI's Embedding service converts text into vector representations.

<Callout type={'info'}>

LobeChat currently uses OpenAI's `text-embedding-3-small` model by default. Ensure your API key has access to this model.

</Callout>

- **Purpose**: Generate vector representations for semantic search
- **Notes**:
  - Requires a valid OpenAI API key
  - Implement appropriate rate limiting and error handling for API calls
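
One way to check that a key can reach the default model is to send a single small request to the OpenAI embeddings endpoint. The sketch below assumes the key is exported as `OPENAI_API_KEY`; the helper names `embedding_payload` and `check_embedding_access` and the sample input are illustrative only.

```bash
#!/usr/bin/env bash
EMBEDDING_MODEL="text-embedding-3-small"

# Build a fixed JSON request body with a short sample input.
embedding_payload() {
  printf '{"model":"%s","input":"%s"}\n' "$EMBEDDING_MODEL" "hello"
}

# Send one request and print only the HTTP status code.
# --retry asks curl to retry transient failures a few times.
check_embedding_access() {
  curl -sS --retry 3 --retry-delay 2 \
    -o /dev/null -w '%{http_code}\n' \
    https://api.openai.com/v1/embeddings \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(embedding_payload)"
}
```

A `200` from `check_embedding_access` suggests the key can use the model; a `401` usually points to an invalid key, and a `429` to rate or quota limits.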

### 4. Unstructured.io (Optional)

Unstructured.io is a document processing tool for extracting content from complex file formats.

- **Purpose**: Process complex document formats and extract structured information
- **Use Case**: Handle non-plain-text formats such as PDF and Word
- **Note**: Evaluate processing needs based on document complexity

With these components correctly configured and integrated, LobeChat has what it needs for document management and semantic retrieval: PostgreSQL with PGVector stores structured data and vectors, S3-compatible storage holds the raw files, OpenAI embeddings power semantic search, and Unstructured.io optionally handles complex document formats.