Today’s language translation apps are like self-driving cars: incredibly useful, promising, nearing maturity, and almost entirely powered by machines. It's astonishing that the technology even exists.
Even so, machine translation is still clunky at times, if not awkward.
Consider a recent conversation I had with my neighbor, Andre, who immigrated from Russia last year. Speaking little to no English, Andre is navigating the American Dream almost entirely through Google Translate, the most popular speech-to-speech translation app, first launched 10 years ago.
Through his phone, Andre and I can hold surprisingly deep conversations about where he’s from, how he thinks, how we can help each other, and what he hopes for. But on more than one occasion, Google Translate failed to communicate what Andre was trying to express, which forced us both to shrug and smile through the breakdown.
As computers get smarter, however, Google, Apple, Microsoft, and others hope to fully remove the language barrier Andre and I shared that day. But it’ll take faster neural machine learning for that to happen, which “might be a few years out,” one developer I spoke to admitted.
Not that the wait matters. In fact, many consumers are surprised to learn just how good today’s translation apps already are. For example, this video shows three Microsoft researchers using the company's live translation software to hold a conversation across multiple languages. The video is seven years old. But when I showed it to some friends, they reacted as if they'd seen the future.
“The technology surrounding translation has come a long way in a very short time,” says Erica Richter, a spokesperson for DeepL, an award-winning machine-translation service that licenses its technology to Zendesk, Coursera, Hitachi, and other businesses. “But this hasn’t happened in parallel with consumer awareness.”
I am a case in point. Although I’ve written about technology for nearly 20 years, I had no idea how deft Google Translate, Apple Translate, Microsoft Translator, and Amazon Alexa were until I started researching this story after my fateful encounter with Andre. The technology still isn’t capable of instant translation like you expect from a live human translator. But the turn-based speech-to-speech, text-to-speech, or photo-to-text translation is incredibly powerful.
And it’s getting better by the year. “Translate is one of the products we built that’s entirely using artificial intelligence,” a Google spokesperson says. “Since launching Google’s Neural Machine in 2016, we’ve seen the largest improvements in accuracy to translate entire sentences rather than just phrases.”
At the same time, half of the six apps I tried for this story sometimes botch even basic greetings. For instance, when I asked Siri and Microsoft Translator to convert “Olá, tudo bem?” from Portuguese to English, both correctly replied, “Hi, how are you?” Google Translate and Amazon Alexa, on the other hand, returned a more literal and awkward, “Hi, everything is fine?” or “Hi, is everything OK?” Not a total fail. But awkward enough to cause hesitancy or confusion on the part of the listener.
In other words, translation technology is similar to the impressive but often clumsy writing that ChatGPT churns out. It works. It’s encouraging. It’s a sign of the times. But the result often feels inhuman, if not disorienting.
It’s still good enough to change the world, though. “We process over a billion translations every day on Translate,” says the same Google rep. “And we’ve recently launched more AI-powered features to provide contextual awareness, including the ability to translate images with Lens, which enables you to search what you see with your camera app.”
For its part, Microsoft, which includes a helpful split screen for people facing each other on its highly rated translation app, boasts similar numbers. “We now have thousands of businesses using our technology to do batch, real-time, and document translation across 141 languages, as well as millions of active users taking advantage of live conversation through Microsoft Translator,” says Marco Casalaina, VP of product for Microsoft’s Azure AI.
When it comes to machine translation, there are basically two toolkits for converting tongues: small language models, like the open-source kind Microsoft uses “to be nimble, iterate faster, and scale effectively on important user devices,” and large language models, like the proprietary kind DeepL sells to 100,000 customers.
Some say the latter approach is more accurate and faster, but there are trade-offs: fewer supported languages (only a quarter of the 140 total for small language models) and no offline access, chief among them. But as DeepL’s Richter spins it, “We don’t offer offline translation, since end devices don’t provide the quality we want when working in the cloud.”
What’s next, then, for translation apps? Big Tech is mum for now.
"We don't speculate,” says a tight-lipped publicist from Apple, which first introduced its Siri-powered Translation app in 2020. “Soon, we will expand our web service to give users more options for translating image-based content, regardless of how you search for it,” says Google’s rep. For its part, DeepL is developing significant speech improvements “launching later this year.”
But none of this would even be possible without artificial intelligence, according to every developer I spoke to. “As AI continues to unlock new translation possibilities, we will remove the remaining language barriers,” says Microsoft’s Casalaina. “The tech just needs a few years to evolve,” adds DeepL’s Richter.
As my sometimes clumsy exchanges with Andre prove, today’s translation technology is mostly awesome but still confusing at times. Given that machines have been “speaking” for only 10 to 20 years, however, it’s hard to believe how good they’ve become at understanding and translating what our species has been doing for 200,000 years.
It might not be miraculous, but it’s pretty close.
Capisce?