Monday, October 11, 2021

Microsoft taps AI techniques to bring Translator to 100 languages - VentureBeat - Translation

Today, Microsoft announced that Microsoft Translator, its AI-powered text translation service, now supports more than 100 different languages and dialects. With the addition of 12 new languages including Georgian, Macedonian, Tibetan, and Uyghur, Microsoft claims that Translator can now make text and information in documents accessible to 5.66 billion people worldwide.

Translator isn’t the first service to support more than 100 languages; Google Translate reached that milestone in February 2016. (Amazon Translate only supports 71.) But Microsoft says that the new languages are underpinned by unique advances in AI and will be available in the Translator apps, Office, and Translator for Bing, as well as Azure Cognitive Services Translator and Azure Cognitive Services Speech.

“One hundred languages is a good milestone for us to achieve our ambition for everyone to be able to communicate regardless of the language they speak,” Microsoft Azure AI chief technology officer Xuedong Huang said in a statement. “We can leverage [commonalities between languages] and use that … to improve whole language famil[ies].”

Z-code

As of today, Translator supports the following new languages, which Microsoft says are natively spoken by 84.6 million people collectively:

  • Bashkir
  • Dhivehi
  • Georgian
  • Kyrgyz
  • Macedonian
  • Mongolian (Cyrillic)
  • Mongolian (Traditional)
  • Tatar
  • Tibetan
  • Turkmen
  • Uyghur
  • Uzbek (Latin)

Powering Translator’s upgrades is Z-code, a part of Microsoft’s larger XYZ-code initiative to combine AI models for text, vision, audio, and language in order to create AI systems that can speak, see, hear, and understand. The team comprises a group of scientists and engineers who are part of Azure AI and the Project Turing research group, focusing on building multilingual, large-scale language models that support various production teams.

Z-code provides the framework, architecture, and models for text-based, multilingual AI language translation for whole families of languages. Because of the sharing of linguistic elements across similar languages and transfer learning, which applies knowledge from one task to another related task, Microsoft claims it managed to dramatically improve the quality and reduce costs for its machine translation capabilities.

With Z-code, Microsoft is using transfer learning to move beyond the most common languages and improve translation accuracy for “low-resource” languages, meaning languages with under 1 million sentences of training data. (Like all models, Microsoft’s learn from examples in large datasets sourced from a mixture of public and private archives.) Approximately 1,500 known languages fit this criterion, which is why Microsoft developed a multilingual translation training process that marries language families and language models.

Techniques like neural machine translation, rewriting-based paradigms, and on-device processing have led to quantifiable leaps in machine translation accuracy. But until recently, even the state-of-the-art algorithms lagged behind human performance. Efforts beyond Microsoft illustrate the magnitude of the problem — the Masakhane project, which aims to render thousands of languages on the African continent automatically translatable, has yet to move beyond the data-gathering and transcription phase. Additionally, Common Voice, Mozilla’s effort to build an open source collection of transcribed speech data, has vetted only dozens of languages since its 2017 launch.

Z-code language models are trained multilingually across many languages, and that knowledge is transferred between languages. Another round of training transfers knowledge between translation tasks. For example, the models’ translation skills (“machine translation”) are used to help improve their ability to understand natural language (“natural language understanding”).

In August, Microsoft said that a Z-code model with 10 billion parameters could achieve state-of-the-art results on machine translation and cross-lingual summarization tasks. In machine learning, parameters are internal configuration variables that a model uses when making predictions, and their values essentially — but not always — define the model’s skill on a problem.
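
To make the notion of a parameter count concrete, here is a minimal sketch that tallies the weights and biases of a small fully connected network. The function name and the layer sizes are hypothetical, chosen only for illustration; this is not a description of Z-code's architecture, which the article does not detail.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network.

    layer_sizes is a hypothetical list of layer widths, e.g. [512, 2048, 512].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input/output pair
        total += n_out         # one bias per output unit
    return total


if __name__ == "__main__":
    # Illustrative sizes only; production models stack many such layers.
    print(count_parameters([512, 2048, 512]))
```

Every one of those numbers is adjusted during training. A 10-billion-parameter model has 10 billion such adjustable values.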

Microsoft is also working to train a 200-billion-parameter version of the aforementioned benchmark-beating model. For reference, OpenAI’s GPT-3, one of the world’s largest language models, has 175 billion parameters.

Market momentum

Chief rival Google is also using emerging AI techniques to improve the language-translation quality across its service. Not to be outdone, Facebook recently revealed a model that uses a combination of word-for-word translations and back-translations to outperform systems for more than 100 language pairings. And in academia, MIT CSAIL researchers have presented an unsupervised model (i.e., a model that learns from data that hasn’t been explicitly labeled or categorized) that can translate between texts in two languages without direct translational data between the two.

Of course, no machine translation system is perfect. Some researchers claim that AI-translated text is less “lexically” rich than human translations, and there’s ample evidence that language models amplify biases present in the datasets they’re trained on. AI researchers from MIT, Intel, and the Canadian initiative CIFAR have found high levels of bias from language models including BERT, XLNet, OpenAI’s GPT-2, and RoBERTa. Beyond this, Google identified (and claims to have addressed) gender bias in the translation models underpinning Google Translate, particularly with regard to resource-poor languages like Turkish, Finnish, Persian, and Hungarian.

Microsoft, for its part, points to Translator’s traction as evidence of the platform’s sophistication. In a blog post, the company notes that thousands of organizations around the world use Translator for their translation needs, including Volkswagen.

“The Volkswagen Group is using the machine translation technology to serve customers in more than 60 languages — translating more than 1 billion words each year,” Microsoft’s John Roach writes. “The reduced data requirements … enable the Translator team to build models for languages with limited resources or that are endangered due to dwindling populations of native speakers.”

Sunday, October 10, 2021

26 Korean Words Added to English Dictionary | HYPEBAE - HYPEBAE - Dictionary

With the rise of South Korea’s influence on music, entertainment, food and more, the Oxford English Dictionary (OED) has now been updated with 26 Korean words.

The country’s popular culture has risen to global fame thanks to Bong Joon-Ho’s award-winning Parasite, K-pop groups BTS and BLACKPINK and most recently, Netflix’s Squid Game. The fashion industry is looking to Korea for some of the most exciting up-and-coming designers, while beauty fanatics are stocking their vanities with K-beauty products. Recognizing the Korean wave (also known as hallyu), the OED has added dozens of entries to its vocabulary.

Standouts include K-drama, which is defined as “a television series in the Korean language and produced in South Korea.” A batch of dishes have also been added, including chimaek (fried chicken served with beer), galbi (beef short ribs, usually marinated in soy sauce, garlic, and sugar, and sometimes cooked on a grill) and bulgogi (thin slices of beef or pork which are marinated then grilled or stir-fried).

Elsewhere, hanbok — the traditional Korean costume typically worn on formal or ceremonial occasions — has been introduced, as well as aegyo, defined as a kind of “cuteness or charm, esp. of a sort considered characteristic of Korean popular culture.” Mukbang, which has become a significant category in the world of YouTube, has also made it to the list.

Scroll down to see the full list of newly added and updated Korean words in the OED.

aegyo
banchan
bulgogi
chimaek
daebak
dongchimi
fighting
galbi
hallyu
hanbok
Hangul
japchae
K-drama
kimbap
Kono
manhwa
mukbang
noona
oppa
PC bang
samgyeopsal
sijo
skinship
taekwondo
Tang Soo Do
unni

'Squid Game' is the latest example of when subtitles are a little off - NPR - Translation

Netflix's Squid Game is a huge hit, but some say its subtitles are inaccurate. Podcast host Youngmi Mayer and translation professor Denise Kripper explain why things got lost in translation.

LULU GARCIA-NAVARRO, HOST:

If you haven't already watched "Squid Game," you have probably heard about it. Netflix's new survival drama is set in South Korea, and its premise is not a happy one.

(SOUNDBITE OF MUSIC)

GARCIA-NAVARRO: Each of its players are deep in debt, but if they win, they'll have enough prize money to pay those loans off. The catch - losing costs you your life.

(SOUNDBITE OF TV SHOW, "SQUID GAME")

UNIDENTIFIED ACTOR #1: (As character, non-English language spoken).

GARCIA-NAVARRO: "Squid Game" is yet another example of how Korean media is dominating the global market, but some viewers have noticed its English subtitles are a little off.

YOUNGMI MAYER: I'm Youngmi Mayer. I am a comedian based in New York City, and I'm also the co-host of "Feeling Asian" podcast.

GARCIA-NAVARRO: Youngmi Mayer is fluent in Korean. And while watching "Squid Game," she noticed that the show's English captions didn't quite reflect what the characters were actually saying. She took her thoughts to TikTok - where else would you take this? - along with a scene from the show.

(SOUNDBITE OF TV SHOW, "SQUID GAME")

UNIDENTIFIED ACTOR #2: (As character, non-English language spoken).

MAYER: Translation says, oh, I'm not a genius, but I can work it out. What she actually said was, I am very smart. I just never got a chance to study.

GARCIA-NAVARRO: And because of those inaccuracies, Youngmi Mayer says that audiences may not understand the show's cultural references.

MAYER: That is a huge trope in Korean media - the poor person that's smart and clever and just isn't wealthy.

GARCIA-NAVARRO: Now, translating subtitles for TV can be tricky. There are rules.

DENISE KRIPPER: There's space limitation that you have to keep in mind.

GARCIA-NAVARRO: Denise Kripper is a translation scholar and assistant professor at Lake Forest College in Illinois. She also has experience translating TV shows from English into Spanish.

KRIPPER: Translation in subtitles is usually two lines, and there's a certain number of characters that you cannot pass.

GARCIA-NAVARRO: There's also trying to fit it all within the constraints of character limits and scene speed. But Kripper says there's another challenge, one that's far trickier. Languages have different structures and different metaphors, so it can be really hard to accurately convey meaning. Jokes can be especially difficult, like this scene she had to translate from the sitcom "Friends."

KRIPPER: Chandler is waiting for the phone to ring, to hear from some woman, I think.

GARCIA-NAVARRO: Meanwhile, Ross and Phoebe are doing a crossword.

(SOUNDBITE OF TV SHOW, "FRIENDS")

DAVID SCHWIMMER: (As Ross Geller) Four letters - circle or hoop.

MATTHEW PERRY: (As Chandler Bing) Ring, damn it, ring.

SCHWIMMER: (As Ross Geller) Thanks.

(LAUGHTER)

GARCIA-NAVARRO: Denise Kripper says an exact translation of the scene doesn't really work in Spanish.

KRIPPER: To ring - the phone to ring is one word in Spanish, and a ring that you can wear on your fingers - a totally different one. So, again, this is a lot to work with for such a short amount of time, right?

GARCIA-NAVARRO: Kripper says in cases like this, the translator may have to change the dialogue in a scene rather than translate word for word and leave viewers confused. Youngmi Mayer says she knows translators are limited in what they can do but worries viewers who rely on subtitles when they watch fast-paced shows like "Squid Game" are getting short-changed.

MAYER: It just seems like maybe this is a time for us to just pause and rethink and restructure the old way of translations. Those metaphors and deep, like, very intelligently written ideas and ideologies that the writer's trying to express to us - they're getting literally just taken out of the script because the translation can't translate that in real time.

GARCIA-NAVARRO: In the meantime, "Squid Game" fans who want to have a fuller understanding of the context of the show can do like Youngmi Mayer did when she watched "Breaking Bad" - or, really, any of us who enjoy parsing episodes of any series - and just pick up a smartphone, Google and never miss a pop culture reference again.

(SOUNDBITE OF JUNG JAEIL'S "UNFOLDED...")

Copyright © 2021 NPR. All rights reserved. Visit our website terms of use and permissions pages at www.npr.org for further information.

NPR transcripts are created on a rush deadline by Verb8tm, Inc., an NPR contractor, and produced using a proprietary transcription process developed with NPR. This text may not be in its final form and may be updated or revised in the future. Accuracy and availability may vary. The authoritative record of NPR’s programming is the audio record.

English translation of iconic Bangla kids' collection Thakurmar Jhuli - The Tribune - Translation

Sutapa Basu’s translation of Thakurmar Jhuli, an iconic work of children’s literature written more than 100 years ago by Dakshinaranjan Mitra Majumdar, has been released recently. The book has been published under Readomania’s children’s imprint, Reado Junior.

Dakshinaranjan Mitra Majumdar collected folktales from villages and towns across Bengal and rendered them into a unique Bengali collection of children’s fiction, titled Thakurmar Jhuli. Enjoyed by children over the ages, the anthology became synonymous with the cultural heritage of the region.

Basu’s translation promises to take readers to an enchanting land sprinkled with flying horses, speaking birds, cunning foxes, indestructible monsters, bold princes, and even bolder, beautiful princesses.

The book reflects the region’s cultural heritage in its semi-realistic illustrations and icons reminiscent of the rice-paste alpona patterns, a familiar sight at all auspicious occasions in Bengal.

The translator says, "This edition subtitled Princesses, Monsters and Magical Creatures is a translation and not a transliteration. I intended to and have adhered strictly to the original narrative. Nevertheless, a few adjustments have been unavoidable, primarily due to differences in linguistic nuances between the two languages."

Sutapa Basu is a best-selling author. Her latest book, The Curse of Nader Shah, won the Best Fiction Award at the AutHer Awards, 2020, instituted by JK Papers and The Times of India.

Saturday, October 9, 2021

Korean Experts Break Down The Translation Issues From Squid Game - LADbible - Translation

If you are one of the millions of people worldwide who have been enjoying Squid Game recently, you might be interested to know that some people believe that you've not been getting the whole message correctly. You can see their take on what viewers are missing out on below:

So obviously, the show is from South Korea, and therefore most people are watching it with subtitles, or with dubbed audio.

However, you're very much at the mercy of the translators by that stage and some of the people who actually speak Korean say that there are some parts of the show that are lost.

One such person, who goes by the name of Youngmi Mayer, explained what's going on over on Twitter.

She said: "I want to do a scene breakdown on TikTok to show you what they could've translated to I might work on it today just so you can see what I mean and see what you missed.

"Such a shame. Translation is extremely important."

In a video shared about the show, she explained how certain characters - in this case, a 'low-class gangster' character - are represented differently in the translation.

For example, at one point the subtitles read 'I'm not a genius but I can work it out', whereas the actual Korean was 'I'm very smart I just didn't get a chance to study'.

That's important, apparently, because it's the 'entire purpose' of the character, and represents a trope of Korean culture, according to Youngmi.

Anyway, it's been a wildly successful show, so perhaps they'll be able to update it with different translations at a later date.

Not everyone thinks it's so rubbish, though.

Euijin Seo, a Korean language teacher, told Buzzfeed that you can't exactly call it 'bad', as: "All dialogues in the show are extremely Korean-ish, reflecting Korean culture."

He continued: "The process of translation must have been tough because there are tons of terms in Korean that cannot be directly translated into English."

So, there are obviously nuances of the dialogue that we're not getting, but essentially the show is the same regardless of the translation.

Why would Netflix deliberately make the show different for different cultures, after all?

Either way, it's the biggest show in the world just now, so they're probably not too worried about the issues that some folks have with the translation.

Friday, October 8, 2021

This Korean American Woman Pointed Out Inaccurate Translations In The English Version Of "Squid Game," And It's Starting A Big Online Debate - BuzzFeed - Translation

"Our job as the dubbing actor is to faithfully maintain the spirit and tone and most importantly, the mouth flaps [the way the mouth moves up and down] of the character, so that it matches to the original as close as possible. A lot of times, a translator will revise the script on the spot, as they try to figure out synonyms that will still match with the meaning of what the original dialogue is, while still matching the flaps of the language."

New Nevada Law Protects Limited-English Proficiency Consumers by Requiring Translation of Certain Financial Legal Documents - JD Supra - Translation

How to provide financial services to limited-English proficiency (“LEP”) consumers has become a pressing legal issue. Both federal and state laws provide requirements and limitations regarding translations of financial documents. Earlier this year, the Consumer Financial Protection Bureau (“CFPB”) published a comprehensive statement encouraging financial institutions to provide services to LEP consumers. The CFPB also took enforcement action against a company for, among other things, deceptively marketing to Spanish-speaking consumers. Following the trend to protect LEP consumers, a new Nevada law, effective October 1, 2021, makes it a deceptive practice to not provide translations for certain financial contracts, agreements and disclosures (“Nevada Law”).

Under the Nevada Law, enacted as Assembly Bill No. 359, any person who, in the course of business, advertises and negotiates certain transactions in a language other than English must provide a translation of the contract or agreement that results from the advertising and negotiations. The translation must include every term and condition of the contract or agreement.

Which transactions are covered?

Subsection 3 of Section 4 of the Nevada Law requires that translations be provided for a contract or an agreement with an LEP consumer that results from any of the following:

  • A loan or extension of credit secured by property, other than real property, used for personal or household purposes
  • A lease, sublease, rental contract or other contract or agreement for a term of at least one month and that involves a dwelling, apartment, mobile home or dwelling unit used as a residence
  • An unsecured loan used for personal, family, or household purposes

Who is exempted from the translation requirements?

According to subsection 4 of Section 4, the translation requirements do not apply to banks or savings and loan associations that have physical locations and engage in a transaction other than the issuance of a credit card or automobile loan.

Under Section 5, a financial institution that is required to make disclosures under Regulation M (Consumer Leasing) or Regulation Z (Truth-in-Lending) will be deemed in compliance with the Nevada Law if it provides translations for those disclosures in the same language as the contract and provides the translated disclosure to the contracting parties before execution.

What is not required to be translated?

As detailed above, the law requires that every term and condition be translated. However, under Section 8 of the Nevada Law, the following text does not have to be translated:

  • Names and titles of persons
  • Addresses
  • Brand names, trade names, trademarks, and registered service marks
  • Make and model of goods or services
  • Numerals, dollar amounts expressed in numerals and dates
  • Individual words or expressions that do not have a generally accepted non-English translation

What are the consequences for not complying with the translation requirements?

Section 9 of the law provides the aggrieved party (e.g., LEP consumer) the right to rescind the contract or agreement if the financial institution fails to comply with the translation requirements.

Takeaways

As state attorneys general and legislatures take more measures to protect LEP consumers, financial institutions should revise their LEP policies and guidelines to ensure compliance with state and federal laws and regulations. The consequences for failure to comply with these regulations can be substantial as the new Nevada Law and recent CFPB enforcement activity show.
