Tuesday, January 18, 2022

Google Research Brings 'Massively Multilingual' Machine Translation to 200+ Languages - Slator - Translation

Everything old is new again — including Google’s latest machine translation (MT) research. Co-authors Ankur Bapna, Orhan Firat, Yuan Cao, and Mia Xu Chen, who collaborated on a July 2019 paper presenting the culmination of five years’ work on a “massively multilingual” MT model, were joined this time around by Aditya Siddhant, Isaac Caswell, and Xavier Garcia.

Google’s January 2022 paper, Towards the Next 1,000 Languages in Multilingual Machine Translation, again takes up the cause of universal translation, addressing the challenge of scaling a massively multilingual model by training on more parallel data. In addition to the prohibitive cost involved in collecting and curating parallel data for so many language pairs, this solution is typically unhelpful for many low-resource languages with limited data.

“Beyond the highest resourced 100 languages, bilingual data is a scarce resource often limited only to narrow domain religious texts,” the authors wrote. To build and train an MT model that covers more than 200 languages, Google researchers employed a mix of supervised and self-supervised objectives, depending on the data available for languages.

This “pragmatic approach,” as described by the authors, can enable a multilingual model to learn to translate effectively, even for severely under-resourced language pairs with no parallel data and little monolingual data. Moreover, they wrote, the results of their experiments “demonstrate the feasibility of scaling up to hundreds of languages without the need for parallel data annotations.”

Conceptually, the researchers explained, “one could think of this as monolingual data and self-supervised objectives […] helping the model learn the language and the supervised translation in other language pairs teaching the model how to translate by transfer learning.”

Pragmatic though it may be, the design is not new, ModelFront CEO and co-founder Adam Bittlingmayer told Slator, with “almost all competitive systems” now using some target-side monolingual data, even for major language pairs.

However, Bittlingmayer added, “it is in contrast to the recent publications from Facebook on this front.” For Facebook’s M2M-100, designed to avoid English as an intermediary between source and target languages, researchers manually created data for all pairs, while the social networking company snagged a November 2021 WMT win by focusing exclusively on translation to and from English.

Parallel or Monolingual Data?

The Google team performed two experiments, the first using parallel and monolingual data from the WMT corpus to train 15 different multilingual models, for 15 languages to and from English. Each model omitted parallel data for one language, simulating a realistic scenario in which parallel data is not available for every language pair.

For each language, researchers then compared the performance of the “zero-resource model” (i.e., without parallel data) to a multilingual baseline trained on all language pairs using all parallel data available via the WMT corpus.
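
The leave-one-out setup described above can be sketched roughly as follows. This is an illustration only, not the authors' code: the `TrainingConfig` structure, the `build_training_configs` helper, and the language codes are hypothetical, and the article does not say whether the supervised baseline also used monolingual data (the sketch assumes it did).

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TrainingConfig:
    """Data sources for one multilingual model (hypothetical structure)."""
    held_out: Optional[str]        # language trained WITHOUT parallel data
    parallel: Tuple[str, ...]      # languages with parallel data to/from English
    monolingual: Tuple[str, ...]   # languages with monolingual data


def build_training_configs(languages: Sequence[str]):
    """Return one zero-resource config per language, plus a supervised baseline."""
    langs = tuple(languages)
    zero_resource = [
        TrainingConfig(
            held_out=lang,
            # Drop parallel data for exactly one language per model.
            parallel=tuple(other for other in langs if other != lang),
            # The held-out language is still seen through monolingual data.
            monolingual=langs,
        )
        for lang in langs
    ]
    # Baseline: parallel data for every language (monolingual use is assumed).
    baseline = TrainingConfig(held_out=None, parallel=langs, monolingual=langs)
    return zero_resource, baseline
```

Calling `build_training_configs` with 15 language codes would yield 15 zero-resource configurations, each compared against the single baseline for its held-out language.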

For high-resource languages, this setup was able to match the performance of fully supervised multilingual baselines, but it was not enough to help the lowest-resource languages in the study (e.g., Kazakh and Gujarati) achieve high-quality translation. Adding monolingual data for those languages had a significant positive impact, improving translation quality above that of a supervised model.

“Even for high-resource languages, the method can achieve similar translation quality by leaving out parallel data entirely (for the language under evaluation) and throwing in 3–4 times monolingual examples, which would be easier to obtain,” the researchers wrote.

The team found that adding zero-resource languages in the same model diminishes performance across languages, while adding more languages with parallel data helps in all cases, since an unsupervised language learns something from each supervised pair. In the same vein, a lack of parallel data seems to be slightly more detrimental to translation quality, compared to a lack of monolingual data. 

Kenneth Heafield, Reader in MT at the University of Edinburgh, told Slator that these findings are not particularly surprising. “Using all the available data, parallel and monolingual, is usually best, provided it is clean,” he said, adding that of course, there are exceptions, such as extreme cases of domain mismatch: “Trying to translate software manuals when your only parallel data is the King James Bible is difficult.”

Beyond WMT

While high-quality, the WMT dataset is relatively small and covers a limited number of languages. To scale the model to cover more than 200 languages, the researchers conducted a second experiment, starting with a highly multilingual crawl of the web for monolingual and parallel data. 

They cleaned up the noisy dataset for the 100 lowest-resource languages to use for back translation. The cleaner version of the monolingual data was then translated into English, generating synthetic data for the zero-resource language pairs.
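
As a rough sketch of the back-translation step just described: each real sentence in the low-resource language is paired with a machine-generated English version, and the resulting pairs serve as synthetic training data. The `back_translate` function and the `toy_translator` stand-in are hypothetical names for illustration; a real pipeline would call an actual into-English MT model, and the filtering shown here is far simpler than the cleaning the researchers describe.

```python
from typing import Callable, Iterable, List, Tuple


def back_translate(
    monolingual_xx: Iterable[str],
    translate_to_en: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each real xx sentence with a machine-generated English source.

    Returns (synthetic_english, real_xx) tuples, usable as en->xx training pairs.
    """
    pairs = []
    for sentence in monolingual_xx:
        cleaned = sentence.strip()
        if not cleaned:
            continue  # skip empty lines left over from a noisy crawl
        synthetic_en = translate_to_en(cleaned)
        pairs.append((synthetic_en, cleaned))
    return pairs


def toy_translator(sentence: str) -> str:
    """Stand-in for an xx->en model (assumption: any str -> str callable works)."""
    return f"<en>{sentence}</en>"


synthetic_pairs = back_translate(["  salem  ", "", "rahmet"], toy_translator)
```

The design point is that only the English side is synthetic; the target side remains genuine text in the low-resource language.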

In this scenario, the authors wrote, “We find that xx→en and en→xx translation quality exhibit different trends.” 

Translation quality into English did not correlate well with the amount of monolingual data available for the non-English language; rather, the languages that performed well were typically those with similar languages in the supervised set. (In this context, the languages are not necessarily similar from a linguistic perspective, but have similar representations and labels learned within a massively multilingual MT model.)

BLEU scores for English translation into other languages were high only for languages with high into-English translation quality, as well as relatively large amounts of monolingual data.

While the paper did not provide a timeline for when Google Translate users might benefit from this research, there is certainly widespread demand.

“On the product side, at this point, our median fellow human — more than four billion of us — is an Internet user and does not understand English. And there is a content explosion,” Bittlingmayer said. “So there is just a strong pull from the market, even if spend lags views.”

Column: Translating UNC's COVID-19 communications - The Daily Tar Heel - Translation

Editor's Note: This article is satire.

We all know that as eloquent and articulate as his words can be, Chancellor Kevin Guskiewicz’s emails can be, well, hard to read. Our generation’s attention spans are rapidly decreasing, leaving in their wake a trail of impatience, boredom and a general distaste toward any written work that can’t be found on SparkNotes. 

But worry no more, there is no longer a need to put in an Adderall prescription every time another headache (but hopefully not cough, congestion or fever)-inducing COVID-19 update email is sent to the students and faculty of UNC. At The Daily Tar Heel, we have taken it upon ourselves to simplify these Shakespearean-esque soliloquies into a mere couple of sentences:

Rajee Ganesan
Snippet from general notice from UNC

Translation: Testing will be inaccessible so that the number of positive cases looks lower. If you do have COVID-19, you should probably figure it out yourself elsewhere.

Rajee Ganesan
Snippet from general notice from UNC

Translation: You need to isolate yourself if you are exposed to COVID-19, but we have gotten rid of quarantine dorms and have nowhere to send you. If you live anywhere that’s not North Carolina, I guess your roommate is in for a stressful next five to 10 days!

Rajee Ganesan
Snippet from general notice from UNC

Translation: You will very likely get COVID-19 this semester and have to miss class, but professors are not required to stream or record their lectures. Let’s hope you have friends in your classes, or else your GPA is going to drop faster than you can say the word “omicron.”

Rajee Ganesan
Snippet from general notice from UNC

Translation: Once again, it is your job to find a friend in your classes rather than the professor taking it upon themselves to give you a virtual version of the lesson. Oh, you don’t know anyone in your class? Why don’t you try speaking up in a breakout room of 15 strangers with their cameras off? Have you ever considered taking advantage of Zoom’s private messaging feature?

Rajee Ganesan
Snippet from general notice from UNC

Translation: Here, we are not so subtly flexing that we are the nation’s leader in infectious disease research, but are also likely soon to be the nation’s leader in infectious disease outbreaks on a college campus.

Rajee Ganesan
Snippet from general notice from UNC

Translation: Your compliance means everything to us.

Behind all of the three-syllable words and complex syntax is a complete lack of care for the students and faculty during one of the worst COVID-19 outbreaks yet.

I wish I could decipher even more from these emails, but after six hours of nonstop reading, I have a headache like a 7.0 magnitude earthquake. Unless ... I hope it’s not … I'd better get tested.

Has anyone read the emails well enough to know where I can do that?

@_hannahkaufman

opinion@dailytarheel.com

Monday, January 17, 2022

Navi uses SharePlay to bring live subtitles and translation to FaceTime - iMore - Translation

Apple's addition of SharePlay with iOS 15.1 is one that didn't get as much attention as it perhaps deserved. Sure, it can be used to listen to music and watch movies with friends over FaceTime, but that's just the beginning. Developers can build on that and the result is some great apps — like Navi, an app that adds subtitles and translation to FaceTime.

Yes, you read that right. You can have a FaceTime call with someone who speaks another language and then have Navi automatically translate and provide subtitles on the fly. It's like magic but backed by APIs and hard work.

Just check out the promo video to see what makes Navi so cool!

Enable subtitles and see them on top of the FaceTime video window. The app opens FaceTime up to people with hearing impairments and other disabilities that prevent them from engaging easily in a video call environment.

All of Navi's processing is done on-device and then transmitted via Apple's SharePlay connection, ensuring privacy across all conversations whether you're using iPhone, iPad, or Mac.

Sounds pretty cool, right? This is sure to be the best iPhone app for people who are learning a new language, for example. Or if you're someone whose hearing sometimes needs a little help — enable subtitles and chat away! Whatever the reason for using it, you can download Navi from the App Store right now. It's free with in-app purchases available.

Her Translation Agency Uses Real Human Translators – Not Problematic AI - The Story Exchange - Translation

Mariona Bolohan Lotuly

Mariona Bolohan and her husband got into the translation industry by chance. While living in Spain and selling antiques, they began translating documents that accompanied certain pieces. Eventually they started picking up more and more translation gigs and became full-time translators. But Bolohan noticed some issues in the industry: many large translation agencies rely on AI, which creates flawed content – or they expect real human translators to work at a pace that is unrealistic. She felt there had to be a better approach, so she started her own translation agency, Lotuly. Today the London, England-based entrepreneur and her husband are focused on providing high-quality human translation services while also building a team of translators who are well paid and able to do work they are passionate about.

Bolohan’s story, as told to The Story Exchange 1,000+ Stories Project:

What was your reason for starting your business?

Before my husband and I moved to England we were selling antiques in several markets in Spain. It was here that we translated our first documents, explaining the specifics of the items we were selling to our buyers. This led to a series of further enquiries for translations. Eventually, it got to a point where we were making more money from translating than from selling. And Robert and I realized how much we enjoyed the feeling of breaking through the language barrier and helping people to understand each other. That was our ‘aha’ moment and this is essentially where our business idea came from.

When I did freelance work I noticed that prospective clients would usually list something like the following: I’m trying to reach a new market, my copy is in English and well, it’s also translated with Google Translate but I’m not getting the ROI that I was expecting.

Boom! That’s what made us come up with a solution. You can argue there were other giant translation agencies and we didn’t stand a chance, but they operate on a business model that undermines freelancers, and they pester translators with mass emails trying to see who will do it cheaper and faster – that’s where we come in. We decided enough is enough. We’ve experienced that atrocious way of doing business, and the only ones benefiting were agencies with deep pockets – not the clients. Clients would get poor-quality, often machine-generated, translation, losing money in the long run and suffering from a damaged reputation at the same time.

Big agencies would charge a lot of money and pay very little to their translators, or sometimes not pay anything at all until 90 days after invoicing. We’ve scrapped that and decided to operate our translation agency by putting our translators first. We pay them upfront, they choose their place of work and the time they want to do the work, and they can do it remotely. We vet them, interview them and make sure they understand what we value about them. We make sure we offer human translation done by qualified experts in the subject matter. We love tech, but machine translation and all these AI services like Google Translate are not sufficiently advanced to take over human translation. And a lot of B2B companies have seen a huge improvement in their sales by having their content translated by a human expert translator.

How do you define success?

For me, having started a business from scratch in a country where you were not born and still being profitable whilst making an impact on people’s lives and businesses and also the environment is success. Doing all that with your better half – it’s the best definition of success.

Tell us about your biggest success to date

Neither of us attended University and from the moment we stepped foot in the United Kingdom we focused on building our startup by solving a communication problem through translation. Then, we expanded little by little into building a team of language experts, and our services like localization, keyword research and even SEO translation to help other startups and big companies reach global markets in a sustainable way. We are not afraid to mention that we started small but now we’ve been on all sides, as clients, as translators and also as agency owners.

What is your top challenge and how have you addressed it?

Our business was and still is 100% bootstrapped; we’ve always reinvested what we’ve made back into the business. We’ve not been successful at securing any funding from the government or any schemes offered because we couldn’t find any suitable ones for what we do. The industry ‘Translation’ doesn’t exist when you try to choose what category your business is, so it always ends up being the odd ‘other.’ Despite all that we’ve managed to stay afloat and, to be fair, at the beginning of the pandemic we turned over in just 1 month the same amount we did in the previous 6 months.

Another problem is that people think that just because you speak two languages that alone automatically makes you a translator, and that is not true. You may be able to translate some basic stuff but to fully immerse yourself and convey the meaning of specific content from one language to another is an art, and people study and train themselves for years in order to do that.

For us it’s a no brainer, if your copy is not in the language of your customer’s heart it will not appeal to them as much. If your copy is but it’s done with Google Translate how will your customer feel? What impression does that give them of your company? Perhaps they will think that you did not put the effort or the funds in to reach them? You see, your customers want to feel appreciated and one way you can achieve that is by allocating a budget and investing in translation, otherwise how are you going to sell your business to them?

Have you experienced any significant personal situations that have affected your business decisions?

Definitely mental health, because before COVID-19, even though we worked from home we would still go outside, go buy something from the store, meet with friends, go out for a drink, etc. When COVID-19 hit and we were unable to go outside or meet people that meant that you would be in the house/office 24/7. As an entrepreneur launching their business that meant us being in front of a screen from morning till night because we would feel guilty for stopping working when we had so much to do and we couldn’t go out so what was the alternative?

I think for a lot of business owners it made it even more difficult to focus on their mental health during the pandemic; at least this is one of the main challenges we faced. There was also the uncertainty: a lot of companies went out of business because of COVID-19, so that put a lot of pressure on us to find new clients, especially in the translation industry, as translation is viewed as a commodity and not a necessity – so some businesses stopped translating their content.

What is your biggest tip for other startup entrepreneurs?

Don’t wait until everything is perfect because as a business owner it’ll never be. We delayed our launch almost a year just because we wanted to have everything perfect. In the end, there are always so many things to do, but your advantage is that your customers do not know that. So get your business out there as soon as it’s presentable and gather feedback.

Ask as many people as you can for feedback and tell them to please tell you what they like but, most importantly, what they don’t like. Some feedback will sting a bit, but it may help you to push your business forward, as they may see something that you did not. That’s the raw and unfiltered feedback you are looking for in order to build a successful and robust product, service or business.

Prioritize your mental and physical health. If you are not well no one will be able to care for your business, or what is a business if you are not able to enjoy it? Always trust your gut feeling, if something doesn’t feel right it probably isn’t. Be able to detach yourself from rejections, don’t take them personally, learn from them and move on, there is no point in thinking about it. You can’t force someone to understand the importance of your product or the good you are doing to their business and eventually the ROI. If they don’t see the value, move on to the next one.

How do you find inspiration on your darkest days?

Going for a walk through Black Park, reading the Bible, listening to the rain drops pouring on the concrete. There’s something about the sound of rain that relaxes you, and exercising also helps with inspiration.

Who is your most important role model?

My role model is myself in 5 years, then myself in 10 and so forth. I strive to look up to myself and trust my skills and passion for the language industry, and I want to be able to say I’m finally proud of myself and keep looking up to myself every few years.

Column: What's new in the dictionary? Turns out, a lot - Cody Enterprise - Dictionary

Green Pulse Podcast: Climate dictionary - What does adapting to climate change mean? - The Straits Times - Dictionary

In this episode, The Straits Times environment correspondent Audrey Tan and climate change editor David Fogarty discuss this with Dr Arjuna Dibley, a researcher at the Oxford Sustainable Law Programme and a co-author of a recent UN report on adaptation. 

Highlights:

00:59 What is adaptation, and why is it important? 

03:19 What are some examples of adaptation? 

05:11 How much would it cost to adapt to climate change? 

09:53 What are the key points of contention when it comes to global discussions on adaptation? 

Climate change discussion at COP26: https://ift.tt/34S5xUp

UN report on adaptation: https://ift.tt/3BM0fES

Produced by: Audrey Tan (audreyt@sph.com.sg), David Fogarty (dfogarty@sph.com.sg), Ernest Luis and Hadyu Rahim

Edited by: Hadyu Rahim

Subscribe to Green Pulse Podcast series and rate us on your favourite audio apps:

Channel: https://str.sg/JWaf

Apple Podcasts: https://str.sg/JWaY

Spotify: https://str.sg/JWag

Google Podcasts: https://str.sg/J6EV 

Website: https://ift.tt/3r0NiSd

Feedback to: podcast@sph.com.sg

Follow Audrey Tan on Twitter: https://str.sg/JLMB

Read her stories: https://str.sg/JLM2

Follow David Fogarty on Twitter: https://str.sg/JLM6

Read his stories: https://str.sg/JLMu

Read ST's Climate Code Red site: https://str.sg/3pSz

Sunday, January 16, 2022

Works in translation: A round-up of the best in fiction and non-fiction - The Irish Times - Translation

English-language debuts don’t come much more lacerating than Pola Oloixarac’s Mona (Serpent’s Tail, £12.99, translated by Adam Morris). The eponymous anti-heroine of her own book, Mona is a rising star of Peruvian literature who ends up on a Californian university campus smoking marijuana and popping prescription pills. Her acerbic, outrageous observations underline her conspicuous status as a woman and a writer of colour, regarded by others on campus as a curio.

Mona’s escape route from this somnolent, sun-drenched existence comes through a nomination for a significant European literary award: €2,000; 13 finalists, one winner. Mona heads off to an isolated spot in Sweden for the Grand Meeting, where she is claustrophobically immured with an international array of hipster competitors.
