On June 27, Google announced that support for 110 new languages is added to the Translate feature. In this case, special thanks should go to its PaLM 2 large language model. A quarter of the total languages that have been added already originated in Africa. Besides, the company added support for five Philippine languages, including—Bikol, Hiligaynon, Kapampangan, Pangasinan, and Waray. Additionally, it supported translations for Cebuano, Ilocano, and Tagalog, which is known as Filipino on the service.
With the wide availability of languages, Google Translate helps to break down the language barriers while allowing people to connect as well as understand the world in a better way. The company is applying the most recent technologies to permit more people to access the tool. Google already added 24 new languages with the help of Zero-Shot Machine Translation in 2022. In this technology, an ML model learns how to translate into another language without viewing an instance. The company even took an initiative to generate AI models which will be compatible with 1000 languages which are spoken mostly across the globe.
Google Translate support for more than half a billion people:
These new languages, from Qʼeqchiʼ to Cantonese, enable over 614 million speakers (8% of the entire population) to communicate with each other. A few languages have more than 100 million speakers. Besides, there are other languages which small communities of Indigenous people speak. In addition, a few languages do not have any native speakers. But they do have active revitalization efforts. A lot of new languages originated from Africa and represent the largest expansion of the African Language to date; Fon, Kikongo, Luo, Ga, Swati, Venda and Wolof are some of these languages.
Google Translate: 110 New Languages Are Added
We have given here some languages which are newly supported in Google Translate:
Afar: It is a tonal language which is spoken in Eritrea, Djibouti, and Ethiopia. This language had the most volunteer community contribution among all languages that were launched.
Manx: It is the Celtic language of the Isle of Man. In 1974, after the death of its last native speaker, this language had almost gone extinct. However, all credits go to the island-wide revival movement, because of which a lot of speakers now use this language.
Cantonese: Most of the people have requested this language for Google Translate. In this case, you have to know that the language often overlaps with Mandarin in writing. So, it may be difficult to find data and training models.
NKo: This is actually a standardized form of the West African Manding languages. NKo is capable of unifying multiple dialects into a common language. In 1949, people invented the unique alphabet of this language. This newly supported language has an active research community. This community helps to develop technology and resources for it at present.
Tamazight (Amazigh): North African people used to speak the Berber language. You can find multiple dialects of this language. But still the written form can be mutually understandable. Tifinagh and Latin script are used to write this language. Google Translate supports both of these scripts.
Tok Pisin: This is an English-based creole. Besides, it is the lingua franca of Papua New Guinea. Suppose you speak the English language. In that case, you can try to translate it into Tok Pisin — it can help you to understand the meaning.
How Does Google Translate Work?
Translating all these languages is not as simple as it looks. In order to power the Translate feature, the company uses AI which requires enough written data like text from books, articles, and websites.
After that the data feed algorithms that are inspired by the human brain known as neural networks. It lets the system analyze context. Besides, the system needs to analyze linguistic structures and patterns to offer natural-sounding translations.
How Language Varieties Are Chosen:
While adding new languages to the Google Translate, the company needed to consider a lot of things. Languages are available in a huge variation, including- Regional varieties, dialects, and different spelling standards. You can even find multiple languages without one standard form. As a result, it becomes challenging to go with the right one. Google aims to put importance to those varieties of languages which are most commonly used. For instance, there are a lot of dialects available of the Romani language throughout theEurope. Google Translate creates the text closest to Southern Vlax Romani.
PaLM 2 is a language model used by Google Translate to learn languages which are closely related to each other. Technology is getting new advancements regularly. Tuning with the updating technology, Google is partnering with native speakers and expert linguists. In this way, the company is trying to offer support for more language varieties and spelling conventions.
Benefits Of Incorporating These New Languages:
The company emphasized how the volunteer community helps to identify new languages and add them, especially in regions with limited digital resources. By adding support for 110 new languages to Google Translate, the company is trying to break down the language barriers, while making information accessible to a wide range of audiences.
In order to learn more about the newly supported languages, you need to visit the Help Center. Then, you can start to translate any language at the translate.google.com website. Otherwise, it is possible for Android and iOS users to use the Google Translate app.
The Bottom Line:
Artificial intelligence can deliver better performance in the English language because of the wide availability of training data. However, apart from Google, Microsoft is also interested in rare languages and launched Microsoft Language Bank in 2022, highlighting that 40% of all available languages worldwide are endangered. The Living Tongues Institute for Endangered Languages made a partnership with Shure (an audio company) for a campaign that is known as "No Voice Left Behind", which aims to spread awareness of preserving languages used in remote regions.