This AI Startup Is Better At Translating Ethiopian Languages Than Google And Microsoft – Here’s Why
Asmelash Teka Hadgu, the founder of the startup Lesan, is changing the game for language translation, especially for Ethiopian languages.
Acknowledging the limitations of popular translation systems like Google Translate and Facebook, Lesan is on a mission to create language-specific technologies.
Translating ‘low-resource’ Languages
In an interview with Rest of World, Hadgu explained that it has become urgent to create language-specific technologies because Google Translate and ChatGPT miss the mark.
AI translation models are known to be more accurate for ‘high-resource’ languages like English and Mandarin, and it was recently found that machine translations of languages such as Dari and Pashto have led to confusion and to the rejection of asylum claims made by Afghans.
Errors are also common with AI translation into African languages. Shamsuddeen Hassan Muhammad, a computer science Ph.D. student at the University of Porto, told Science that a friend whose native language is Hausa announced on Facebook that his wife had given birth. Facebook automatically translated the news as “my prostitute gave birth.”
Lesan: Language-specific Technologies
Hadgu founded Lesan in 2019 to develop machine translation products.
In an exclusive interview with Rest of World, he said, “If you put Google and Facebook on its face and compare it with smaller startups, the quality is low.”
“At Lesan, we don’t believe that you create just one model that solves these problems. If you have the technical know-how around machine learning, that unique advantage and connection to the community can carry you further.”
He claims that chatbots such as ChatGPT are broken or useless for these languages because most of the data that powers them comes from the internet, and there currently isn’t enough data online for those languages.
By contrast, Lesan uses offline print resources to train its language translation model.
In a research paper, Hadgu and colleagues explained that millions of people worldwide cannot access content on the web because it is not readily available in their language.
Machine translation (MT) systems, such as Lesan’s, can potentially change this.
“We present Lesan, an MT system for the low-resource language, and we perform extensive human evaluation and show that Lesan outperforms state-of-the-art methods such as Google Translate and Microsoft Translator,” the authors concluded.