VibeVoice

GitHub Repo · Pretty sure · TTS removal is honest, not spin
https://github.com/microsoft/VibeVoice

Microsoft's frontier voice AI project that actually ships working models: the ASR model handles 60 minutes of audio in a single pass with speaker diarization, TTS was disabled after misuse concerns, and Realtime-0.5B is genuinely useful. The bar for "open-source" got lower when they pulled the TTS code.

Slop 15% · Signal 20% · Science 65%

VibeVoice-ASR is real signal: 7.5 Hz acoustic tokenizers plus LLM diffusion handling a 60-minute context with speaker tracking, timestamps, and hotwords is technically non-trivial. Papers exist (the arXiv links are valid). The Hugging Face integration shipped. Realtime-0.5B is a production-grade small model. But the TTS code was pulled mid-2025 as "inconsistent with stated intent," corporate speak for "people made deepfakes." The remaining work is solid, but the framework's original ambition (both TTS and ASR) got guillotined. That's not m...

24,646 stars · Python · 2026-03-27 · 214 days old
