Meta, the conglomerate that counts giants such as Facebook among its businesses, has spent the past few days boasting. And with good reason.
With the Irish authorities once again tightening the screws on big technology companies to ensure that the US government cannot access the data of their European users, the public unveiling of the No Language Left Behind (NLLB) project is exactly the kind of story Meta wants to highlight to repair its tarnished reputation.
But NLLB is much more than a marketing campaign. It is an AI model that, pending more samples by which to judge the quality of its output, is for the moment the most complete machine translator known.
With 200 languages at its disposal, it surpasses the slightly more than 130 that Google Translate supports. Naturally, the AI is far from covering the roughly 6,500 languages estimated to exist worldwide, but its 200 still represent an interesting leap forward.
Meta's idea, as the project's name indicates, is that no language should be left behind. This, of course, with the metaverse in mind.
Zuckerberg's company fantasizes about the day when, inside its virtual reality, people from any corner of the world will be able to understand one another while speaking their mother tongues, however widely or narrowly spoken.
But that day, some experts warn, is still far, far away. It may never come, in fact.
"Getting a reliable universal translator is impossible. In Catalonia, for example, there are projects trying to collect all the dialectal varieties [ways of pronouncing a language] that exist, and even that is impossible, because the same word changes from one village to the next. Worldwide, pursuing something like this is interesting, but impossible," Cecilio Angulo, professor of AI and robotics at the Polytechnic University of Catalonia and president of the Catalan Association for AI, explains by phone to Business Insider Spain.
NLLB, a revolutionary model but with room for improvement
There is no doubt, however, that Meta's project, detailed at length in a scientific paper also published this week, is ambitious. And therein lies its main strength.
As Mark Zuckerberg himself recounted in a post on his Facebook account, to reach 200 languages the AI relies on 50 billion parameters, processed on the Research SuperCluster, one of the most powerful supercomputers in the world.
The objective, says the company's CEO, is to reach 25 billion translations a day.
But this AI has not been fed on cold data alone. To assess the quality of the translations, 3,001 sentence pairs were taken for each language (from English into the target language) and evaluated by expert translators who were native speakers of the language under examination.
They did so using the BLEU (Bilingual Evaluation Understudy) system as a reference, an internationally used method that assigns a numerical score to a translation based on its quality.
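As a rough illustration of the metric the article mentions, here is a minimal, standard-library-only sketch of sentence-level BLEU: clipped n-gram precision combined via a geometric mean, multiplied by a brevity penalty. Real evaluations such as Meta's use standardized tooling with smoothing and careful tokenization; the function below is a simplified sketch, not the implementation the researchers used.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU of a candidate translation against one reference."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped counts: a candidate n-gram is credited at most as many
        # times as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty discourages translations that are too short.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean

print(bleu("the cat is on the mat", "the cat is on the mat"))  # identical → 1.0
```

A perfect match scores 1.0; a candidate sharing no n-grams with the reference scores 0.0, which is why published BLEU comparisons are usually averaged over thousands of sentence pairs rather than read off a single example.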
The results of these tests could hardly have been better: the testers found that Meta's AI produces translations scoring 44% better than those of the best existing machine translators.
These figures should, however, be treated with caution. To begin with, consider that the examiners were human, which means their assessments may be biased.
The most obvious bias, some experts point out, is that speakers of minority languages tend to be less demanding. Excited at the very idea of seeing their language translated from English, they tend to overlook grammatical errors that they judge to be minor.
In languages, however, every word, every expression, and every turn of phrase counts.
They know this well at Facebook itself, which in 2017 saw a Palestinian user of its social network arrested by the Israeli police because of an error by its robots: the man had written "Good morning" on his wall, and the platform's translation software rendered it as "Attack them".
Second, it should be made clear that 3,001 sentences do not define a language. A translator can handle them successfully without necessarily being infallible, and vice versa.
Another aspect to take into account is the roots of the languages covered. Two hundred languages, explains Angulo, may seem like a lot, but they are not necessarily a challenge for an AI.
"The trick lies in the rarity of the languages. For example, if the majority share a Latin root, once you have covered 2 of them you have practically covered 7. If your languages are Spanish, Italian, Portuguese, Catalan, Galician and Romanian, the words change, but the structures will always be very similar," comments the AI expert.
On the other hand, the real difficulty of translation lies not so much in grammar as in sayings and popular expressions, which depend on context.
"The richness of a language lies more in the logic of its context. The phrase 'It has many balls' depends entirely on the context. A good translator is one who knows that the English expression 'Put yourself in my shoes' becomes 'Put yourself in my skin' [Ponte en mi piel] in Spanish."
The ethical problems of AI, big business and minority languages
To these problems must be added an inherent mistrust between communities of minority language speakers and large corporations.
Many of these speakers fear that easier access to languages such as English will leave their communities with few incentives to keep producing documents in their native tongue.
For now, however, the research has been received positively among academics.
"Overall, I'm glad Meta has gotten on board with this. I wish there were more work like this from companies like Google, Meta, and Microsoft. They all have substantial work ahead of them in machine translation of languages with few speakers," Alexander Fraser, professor of computational linguistics at the University of Munich, tells The Verge.
"In the project we have worked with linguists, sociologists and ethicists. This type of interdisciplinary approach allows us to really focus on human problems," says Angela Fan, a Meta researcher involved in developing the AI.
Fan also stresses the importance of Meta's having made some elements of the project freely available, so that anyone who wants to can use them in their own research.
For Angulo, meanwhile, the future of translators lies more in the spoken word than in written translation.
"In the major languages, text translation has come a long way and works more or less well. The great development will come with voice, and with machines' ability to detect certain inflections, intonations and ways of pronouncing."