BentoVoiceAgent
Build a phone-calling voice agent fully powered by open-source models.
About BentoVoiceAgent
This repository demonstrates how to build a voice agent using open-source large language models (LLMs), text-to-speech (TTS) models, and speech-to-text (STT) models. It uses the Pipecat voice pipeline and is deployed with BentoML. The voice agent is reachable through a phone number, with Twilio as the communication transport. The example can be extended with additional voice agent features and functionality.
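As a rough illustration of the STT, LLM, and TTS flow described above, the following is a minimal conceptual sketch of a single conversational turn. It does not use the Pipecat or BentoML APIs; the `VoiceTurn` class and the stub stages are hypothetical names chosen for illustration only, and real stages would call the deployed models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """Hypothetical helper: chains the three stages of one voice turn."""

    stt: Callable[[bytes], str]   # speech-to-text: caller audio -> transcript
    llm: Callable[[str], str]     # language model: transcript -> reply text
    tts: Callable[[str], bytes]   # text-to-speech: reply text -> audio

    def respond(self, caller_audio: bytes) -> bytes:
        transcript = self.stt(caller_audio)
        reply_text = self.llm(transcript)
        return self.tts(reply_text)


# Stub stages stand in for real models so the sketch runs on its own.
turn = VoiceTurn(
    stt=lambda audio: audio.decode("utf-8"),
    llm=lambda text: f"You said: {text}",
    tts=lambda text: text.encode("utf-8"),
)

reply_audio = turn.respond(b"hello")
```

In the actual project this chaining is handled by the voice pipeline framework, which also streams audio rather than processing one complete buffer per turn.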
The LLM and XTTS models are deployed as separate API endpoints, as outlined in the instructions below. These API endpoints are provided to the voice agent through environment variables.
Deploy the LLM and XTTS models by following the instructions provided in their respective repositories.
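One way the agent might pick up those endpoints from environment variables is sketched below. The variable names `LLM_SERVICE_URL` and `XTTS_SERVICE_URL` and the helper `load_endpoints` are assumptions made for this example; check the repository for the names it actually expects.

```python
import os
from typing import Mapping

# Hypothetical variable names, used here only for illustration.
REQUIRED_VARS = ("LLM_SERVICE_URL", "XTTS_SERVICE_URL")


def load_endpoints(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the model endpoint URLs, failing early if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    # Strip trailing slashes so paths can be appended consistently.
    return {name: env[name].rstrip("/") for name in REQUIRED_VARS}


# Example with an explicit mapping instead of the real environment.
example = load_endpoints(
    {
        "LLM_SERVICE_URL": "http://localhost:3000/",
        "XTTS_SERVICE_URL": "http://localhost:3001",
    }
)
```

Failing at startup when a variable is missing tends to be easier to debug than a connection error in the middle of a phone call.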
BentoVoiceAgent is an open-source project written primarily in Python, with 124 stars on GitHub. It was last updated in April 2025.
pip install -U bentoml
BentoVoiceAgent vs. the alternatives
| Agent | Stars | Language | License | Pricing |
|---|---|---|---|---|
| BentoVoiceAgent | 124 | Python | — | Open source |
| xiaozhi-esp32-server | 10.0k | JavaScript | MIT | Open source |
| ten-vad | 2.2k | C | — | Open source |
| bailing | 1.7k | Python | MIT | Open source |
| RCLI | 1.5k | C++ | MIT | Open source |
| CyberVerse | 1.4k | Python | GPL-3.0 | Open source |
