Open (Apache 2.0) TTS model for streaming conversational audio in realtime (github.com)
12 points by SweetSoftPillow 4 days ago
woodson 7 minutes ago
Looks very similar to Kyutai’s models, given that it uses the same neural audio codec (Mimi) and Depformer module etc.
ks2048 an hour ago
> Our work was heavily inspired by KyutaiTTS and Sesame
I wish they’d describe the technical details of the differences between this and the other TTS models they were “inspired by”.
With so many projects like this, I will just have to assume they are vibe-coded clones made to get some publicity unless there are more technical details.