No point unless you try it on a phone lol
You can now generate real-time speech that sounds conversational.
Microsoft just open-sourced VibeVoice, a real-time text-to-speech system with ~300 ms first audio latency and streaming input.
https://x.com/lioronai/status/2013220214217879931?s=46