OmniVoice Studio
A tier
new this week
A fully local, open-source desktop app for voice cloning, video dubbing, and real-time dictation: zero cloud, zero subscription, 646 languages.
Kai's verdict
The feature surface is genuinely impressive for a solo open-source project — 646-language TTS, video dubbing, MCP integration, and a real desktop UI — but active beta status and fuzzy commercial licensing mean you should treat it as a powerful experiment, not a production workhorse yet. (Verdict pending Phi's full review.)
Strengths
- Completely local processing — no API keys, no cloud account, no data sent to third parties
- 646 languages for TTS (vs. ElevenLabs' 32) and 99 languages for transcription via WhisperX
- Zero-shot voice cloning from a 3-second clip, plus parametric voice design without any reference audio
- Full video dubbing pipeline (transcribe → translate → re-voice → MP4) with Demucs background audio preservation
- Built-in MCP Server exposes capabilities to Claude, Cursor, or any MCP client for automation
Weaknesses
- Active beta — things break between releases; macOS builds aren't notarized yet, requiring a manual Gatekeeper workaround
- Meaningful hardware requirements: CPU-only runs work but TTS is ~3× slower; 16 GB RAM and 8+ GB VRAM recommended for production use
- Commercial licensing terms aren't finalized, creating uncertainty for business use cases
Best for
Privacy-conscious creators, developers, and localizers who want ElevenLabs-tier voice features without the cloud dependency, usage limits, or monthly bill.
Pricing
Free (FSL, personal/non-commercial use); commercial pricing TBD
Released under the Functional Source License (FSL-1.1-ALv2): free for personal, educational, and non-commercial use. Each release converts to Apache 2.0 two years after publication. Commercial license pricing is still being finalized; quote requests go through an in-app page.