News
Meta defines the system as “a non-autoregressive flow-matching model trained to infill speech, given audio context and text.” It’s been trained on more than 50,000 hours of unfiltered audio.
With a focus on expressive quality, reproducibility, and open access, Dia adds a distinctive new voice to the landscape of text-to-speech.
Decagon, which builds AI-powered voice experiences, saw a 30% improvement in transcription accuracy using OpenAI’s speech recognition model.
Meta’s open-source speech AI recognizes over 4,000 spoken languages It can also produce text-to-speech in over 1,100 languages.
Results that may be inaccessible to you are currently showing.
Hide inaccessible results