News

Meta’s open-source speech AI recognizes over 4,000 spoken languages. It can also produce text-to-speech in over 1,100 languages.
Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.
With a focus on expressive quality, reproducibility, and open access, Dia adds a distinctive new voice to the landscape of text-to-speech.
The giant program uses multiple kinds of neural nets to train on speech and text at the same time, an example of the increasing importance of multi-modality.