August 22, 2023

Meta, formerly known as Facebook, today unveiled SeamlessM4T, an all-in-one multimodal and multilingual AI translation model for communication through speech and text across many languages.

The model supports automatic speech recognition, speech-to-text translation, and text-to-text translation for nearly 100 languages. It also supports speech-to-speech translation from nearly 100 input languages into 36 output languages (including English).

The model also offers text-to-speech translation from nearly 100 input languages into 35 output languages plus English, enabling users to listen to translated content.

Meta is making the model publicly available to researchers and developers. "In keeping with our approach to open science, we're publicly releasing SeamlessM4T under a research license to allow researchers and developers to build on this work. We're also releasing the metadata of SeamlessAlign, the biggest open multimodal translation dataset to date, totalling 270,000 hours of mined speech and text alignments," Meta said on Tuesday.

Traditional approaches chain separate models for each stage of translation. SeamlessM4T instead handles the whole task in a single system, which Meta says reduces errors and delays and improves the efficiency and quality of translation.

SeamlessM4T draws on findings from Meta's previous projects, including No Language Left Behind (NLLB), Universal Speech Translator, and Massively Multilingual Speech, to enable a single model to perform speech-to-text, speech-to-speech, text-to-speech, and text-to-text translation.

"SeamlessM4T builds on advancements we and others have made over the years in the quest to create a universal translator. This is only the latest step in our ongoing effort to build AI-powered technology that helps connect people across languages. In the future, we want to explore how this foundational model can enable new communication capabilities — ultimately bringing us closer to a world where everyone can be understood," the company said.