---
title: LobeChat Supports Speech Synthesis and Recognition (TTS & STT)
description: >-
  Experience seamless Text-to-Speech (TTS) and Speech-to-Text (STT) technologies
  in LobeChat. Choose from a variety of high-quality voices for personalized
  communication. Learn more about Lobe TTS toolkit @lobehub/tts.
tags:
  - LobeChat
  - TTS
  - STT
  - Voice Conversation
  - Lobe TTS
  - Text-to-Speech
  - Speech-to-Text
  - Voice Options
---

# TTS & STT Voice Conversation

<Image
  alt={'TTS & STT Voice Conversation'}
  borderless
  cover
  src={
    'https://github.com/user-attachments/assets/50189597-2cc3-4002-b4c8-756a52ad5c0a'
  }
/>

LobeChat supports Text-to-Speech (TTS) and Speech-to-Text (STT). The application converts text into clear voice output, so users can interact with conversational agents as if they were talking to a real person. Users can choose from a variety of voices and pair a suitable one with the assistant. TTS is also a good fit for users who prefer to learn by listening or who need to take in information while busy with other tasks.

LobeChat offers a curated set of high-quality voice options (OpenAI Audio, Microsoft Edge Speech) to meet the needs of users from different regions and cultural backgrounds. Users can pick a voice based on personal preference or the scenario at hand for a personalized communication experience.

## Lobe TTS

<Image alt={'LobeTTS @lobehub/tts'} borderless src={'https://github.com/lobehub/lobe-chat/assets/28616219/52d20fdc-48c0-45e4-9c51-ee150fc18be7'} />

[`@lobehub/tts`](https://tts.lobehub.com) is a high-quality TTS toolkit written in TypeScript that works in both server and browser environments.

- **Server**: With just 15 lines of code, it can generate speech of a quality comparable to the OpenAI TTS service. It currently supports EdgeSpeechTTS, MicrosoftTTS, OpenAITTS, and OpenAISTT.
- **Browser**: It provides high-quality React Hooks and visual audio components that support common functions such as loading, playing, pausing, and dragging the timeline, and it offers extensive options for styling the audio track.

<Callout type={'info'}>
  While implementing the TTS feature in LobeChat, we found no good frontend TTS library on the
  market, so a lot of effort went into the implementation, including data conversion, audio
  progress management, and speech visualization. In keeping with our "Community First" principle,
  we have polished and open-sourced this implementation in the hope of helping community
  developers who want to implement TTS.
</Callout>

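
To make the server-side usage more concrete, here is a minimal sketch of how a synthesis call could be wrapped. It does not import `@lobehub/tts` itself. Instead it declares a small `SpeechSynthesizer` interface that is assumed to match the `create(payload)` method shown in the package README, where the payload has the shape `{ input, options: { voice } }`. The helper names `buildSpeechPayload` and `synthesizeToBytes` are hypothetical and exist only for illustration, so check the current `@lobehub/tts` documentation before relying on any of these shapes.

```typescript
// Shape of a synthesis request.
// Assumption: mirrors the payload used in the @lobehub/tts README.
interface SpeechPayload {
  input: string;
  options: { voice: string };
}

// Minimal structural type for whatever performs the synthesis.
// Assumption: an EdgeSpeechTTS instance exposes a compatible `create` method
// whose result offers `arrayBuffer()`, as a fetch Response does.
interface SpeechSynthesizer {
  create(payload: SpeechPayload): Promise<{ arrayBuffer(): Promise<ArrayBuffer> }>;
}

// Hypothetical helper: build a payload and reject empty input early,
// so no request is sent for text that cannot be spoken.
function buildSpeechPayload(text: string, voice: string): SpeechPayload {
  const input = text.trim();
  if (input.length === 0) {
    throw new Error('Cannot synthesize empty text');
  }
  return { input, options: { voice } };
}

// Hypothetical helper: run the synthesis and return the raw audio bytes.
async function synthesizeToBytes(
  tts: SpeechSynthesizer,
  text: string,
  voice: string,
): Promise<Uint8Array> {
  const response = await tts.create(buildSpeechPayload(text, voice));
  return new Uint8Array(await response.arrayBuffer());
}
```

In a Node.js script, the `tts` argument would presumably be an instance such as `new EdgeSpeechTTS({ locale: 'en-US' })`, with a voice name like `'en-US-GuyNeural'`, and the returned bytes could be written to an audio file with `fs.writeFileSync`. Keeping the synthesizer behind an interface also makes it possible to swap in a stub for tests without network access.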