An AI-powered virtual streamer that can chat with viewers, sing songs on request, and play sound effects. The virtual streamer uses a large language model for conversation, Fish Audio TTS for speech, and Replay for AI voice singing.
- 🗣️ Text-to-Speech using Fish Audio voices
- 🎵 AI singing with Replay voice conversion
- 💬 Natural conversation using GPT-4o or a compatible model
- 🔊 Sound effect playback
- 🎬 OBS integration with subtitles and "Now Playing" widgets
- 📱 Twitch chat integration
- 🗃️ Message history with MongoDB
- ⚡ Auto-talk feature for spontaneous AI chatter
- Node.js (version 18+ recommended)
- MongoDB (or a MongoDB connection URI)
- Replay for AI singing voice conversion
- Fish Audio API key for text-to-speech
- Clone the repository:
    git clone https://github.com/yourusername/ai-sings-and-speaks.git
    cd ai-sings-and-speaks
- Install dependencies:
    npm install
- Create required directories. The following directories are created automatically when needed, but you can also create them manually:
    mkdir -p audio-cache downloads models outputs song-cache sound-effects weights
- Set up configuration:
    cp .env.example .env
  Edit the .env file and add your API keys and configuration.
- Install Replay for AI singing: download and install Replay from weights.gg/replay
- Download voice models:
  - Download voice models for singing from weights.gg, or use your own
  - Place voice models in the models directory, or configure a custom path in your .env
- Configure voice models:
  - Update VOICE_MODEL_ID in your .env file to match the name of your voice model
- Update the
- Start the server:
    npm start
- audio-cache/: Cached TTS audio files
- downloads/: Temporary song downloads
- models/: Voice models for singing
- outputs/: Output files from Replay
- public/: Web pages for OBS browser sources
- song-cache/: Cached converted songs
- sound-effects/: Custom sound effects (add .mp3 files here)
- weights/: Model weights (copy from Replay installation)
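The working directories listed above can also be created from Node.js instead of the shell. The following is a minimal sketch, not the project's actual startup code: `REQUIRED_DIRS` and `ensureDirectories` are hypothetical names, and `public/` is left out on the assumption that it ships with the repository.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Directory names taken from the list above (public/ is assumed to ship with the repo).
const REQUIRED_DIRS = [
  "audio-cache",
  "downloads",
  "models",
  "outputs",
  "song-cache",
  "sound-effects",
  "weights",
];

// Create any missing directories under `root` and return the names that were created.
function ensureDirectories(root = process.cwd()) {
  const created = [];
  for (const name of REQUIRED_DIRS) {
    const dir = path.join(root, name);
    if (!fs.existsSync(dir)) {
      fs.mkdirSync(dir, { recursive: true });
      created.push(name);
    }
  }
  return created;
}
```

Calling `ensureDirectories()` a second time returns an empty array, so it is safe to run on every start.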
Edit the .env file to customize your setup:
- OPENAI_API_KEY: Your OpenAI API key (or compatible API)
- OPENAI_BASE_URL: API endpoint URL
- OPENAI_MODEL: Model to use (e.g., "gpt-4o")
- FISHAUDIO_KEY: Your Fish Audio API key
- VOICE_ID: Voice ID to use for TTS
- FISH_AUDIO_MODEL: TTS model version
- VOICE_MODEL_ID: Voice model name for singing
- SONG_API_URL: URL of the Replay API (default: http://localhost:62362)
- MONGODB_URI: MongoDB connection string
- TWITCH_OAUTH_TOKEN: Twitch OAuth token for chat integration
- PORT: Web server port (default: 3000)
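To illustrate how these settings and their documented defaults fit together, here is a rough sketch of a config loader. The function name `loadConfig` and the shape of the returned object are assumptions made for illustration; the project may read its environment differently.

```javascript
// Hypothetical helper: read settings from an environment-like object,
// falling back to the defaults documented above when a value is unset or empty.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT || "3000", 10);
  if (Number.isNaN(port)) {
    throw new Error(`PORT must be a number, got "${env.PORT}"`);
  }
  return {
    openai: {
      apiKey: env.OPENAI_API_KEY,
      baseUrl: env.OPENAI_BASE_URL,
      model: env.OPENAI_MODEL,
    },
    fishAudio: {
      apiKey: env.FISHAUDIO_KEY,
      voiceId: env.VOICE_ID,
      model: env.FISH_AUDIO_MODEL,
    },
    singing: {
      voiceModelId: env.VOICE_MODEL_ID,
      songApiUrl: env.SONG_API_URL || "http://localhost:62362",
    },
    mongodbUri: env.MONGODB_URI,
    twitchOauthToken: env.TWITCH_OAUTH_TOKEN,
    port,
  };
}
```

Passing the environment in as an argument, rather than reading `process.env` directly inside the function, keeps the loader easy to try out with a plain object.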
Once the server is running:
- View subtitles: http://localhost:3000/subtitles
- Now playing widget: http://localhost:3000/now-playing
- Test interface: http://localhost:3000 (for testing without Twitch)
Add browser sources in OBS:
- Add http://localhost:3000/subtitles as a browser source to show speech subtitles
- Add http://localhost:3000/now-playing as a browser source to show the currently playing song
Add MP3 files to the sound-effects directory. They will be automatically detected and can be triggered with:
@sound("sound-name", "times")
where "times" is optional (defaults to playing once).
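As an illustration of the syntax above, the sketch below pulls `@sound(...)` directives out of a piece of text. It is not the project's actual parser: `parseSoundDirectives` is a hypothetical name, and it assumes the repeat count is written as a quoted whole number, as in the example.

```javascript
// Matches @sound("name") and @sound("name", "3").
// Assumption: the optional repeat count is a quoted whole number.
const SOUND_DIRECTIVE = /@sound\(\s*"([^"]+)"\s*(?:,\s*"(\d+)"\s*)?\)/g;

// Return every sound directive found in `text`, in order of appearance.
function parseSoundDirectives(text) {
  const directives = [];
  for (const match of text.matchAll(SOUND_DIRECTIVE)) {
    directives.push({
      name: match[1],
      // A missing count means "play once", as described above.
      times: match[2] === undefined ? 1 : Number.parseInt(match[2], 10),
    });
  }
  return directives;
}
```

For example, `parseSoundDirectives('Nice! @sound("airhorn", "3")')` yields one entry with `name` set to `"airhorn"` and `times` set to `3`.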
- Support for additional TTS providers
- Voice model fine-tuning options
- More customization options
- Expanded widget collection
- Issues with TTS: Check your FISHAUDIO_KEY and VOICE_ID settings
- Singing problems: Ensure Replay is running and properly configured
- Missing voice models: Download models from weights.gg and place them in the models directory
- MongoDB errors: Verify your MongoDB connection string
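Several of the items above come down to a missing setting, so a small preflight check can narrow things down before starting the server. This is a sketch under assumptions: `CHECKS` and `findConfigProblems` are hypothetical names, and the check only tests whether a value is present, not whether the key, model name, or connection string is actually valid.

```javascript
// Each check pairs the settings an area needs with the matching hint from the list above.
const CHECKS = [
  {
    area: "TTS",
    keys: ["FISHAUDIO_KEY", "VOICE_ID"],
    hint: "Check your FISHAUDIO_KEY and VOICE_ID settings",
  },
  {
    area: "Singing",
    keys: ["VOICE_MODEL_ID"],
    hint: "Ensure Replay is running and properly configured",
  },
  {
    area: "MongoDB",
    keys: ["MONGODB_URI"],
    hint: "Verify your MongoDB connection string",
  },
];

// Return one entry per area that has unset or blank settings.
function findConfigProblems(env = process.env) {
  const problems = [];
  for (const { area, keys, hint } of CHECKS) {
    const missing = keys.filter((key) => !env[key] || env[key].trim() === "");
    if (missing.length > 0) {
      problems.push({ area, missing, hint });
    }
  }
  return problems;
}
```

An empty result only means the settings are filled in; Replay still has to be running and MongoDB reachable for the streamer to work.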