Spaces:
Sleeping
Sleeping
| title: Command_RTC | |
| emoji: 🦀 | |
| colorFrom: yellow | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: 5.32.1 | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: Text-to-speech using Gradio, FastAPI, and Chatterbox TTS | |
| tags: | |
| - chatterbox-tts | |
| - text-to-speech | |
| - voice-cloning | |
| - gradio | |
| - fastapi | |
| # Voice Chat Assistant | |
| A conversational voice assistant powered by AI that responds to your spoken queries with natural-sounding speech. | |
| ## Features | |
| - Speech Recognition: Uses OpenAI's Whisper model to accurately transcribe your voice | |
| - Natural Language Understanding: Leverages Cohere's LLM API for intelligent responses | |
| - Text-to-Speech: Generates natural speech using Chatterbox-TTS | |
| - Reply on Pause: Automatically responds when you finish speaking | |
| - Conversation History: Maintains context throughout your dialogue | |
| ## Demo | |
| Speak into your microphone and the assistant will respond with voice! | |
| ## How It Works | |
| - Your voice is transcribed to text using Whisper | |
| - The text is processed by Cohere's LLM to generate a response | |
| - The response is converted to speech using Chatterbox-TTS | |
| - The conversation continues with full context retention | |
| ## Technical Details | |
| This project utilizes: | |
| - Zero-GPU: Efficient GPU memory usage with Hugging Face's Zero-GPU technology | |
| - FastRTC: Real-time communication for seamless voice interaction | |
| - Gradio: Simple and intuitive user interface | |
| ## Setup | |
| To run this locally, you'll need a Cohere API key and Python 3.8+. | |
| ## Acknowledgements | |
| - OpenAI for the Whisper speech recognition model | |
| - Cohere for the language model API | |
| - Tortoise-TTS for the text-to-speech capabilities | |
| - Hugging Face for the Spaces and Zero-GPU infrastructure |