Join the event trusted by enterprise leaders for nearly two decades. VB Transform brings together the people building real enterprise AI strategy. Learn more A two-person startup by the name of Nari ...
Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.
To coincide with the rollout of the ChatGPT API, OpenAI today launched the Whisper API, a hosted version of the open source Whisper speech-to-text model that the company released in September. Priced ...
ElevenLabs, an AI startup that just raised a $180 million mega-funding round, has been primarily known for its audio-generation prowess. The company took a step in another technological direction by ...
On Tuesday, Meta announced SeamlessM4T, a multimodal AI model for speech and text translations. As a neural network that can process both text and audio, it can perform text-to-speech, speech-to-text, ...
On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results
Feedback