Saturday, October 23, 2021

Foreign Language Translation for the IC Gets a Machine Learning Boost From IARPA - HSToday - Translation

Some of the hottest, trending languages are Kazakh, Swahili and Pashto. Well, at least for the U.S. Intelligence Community (IC).

It’s probably safe to say that no organization is more interested in what foreign nationals are saying and writing than the IC. This is especially true for what’s being said in widely spoken languages of U.S. adversaries, like China and Russia. However, it’s also the case for “low resource” languages that are spoken by much smaller populations around the globe, like Kazakh, Swahili and Pashto. 

The perennial challenge the IC has faced is how to quickly and accurately interpret those lesser-used languages, or indeed any language.

Using human beings to translate the quadrillions of words written and spoken by people around the world every day would be an incredibly time-intensive and expensive endeavor. Fortunately, with its Machine Translation for English Retrieval of Information in Any Language (MATERIAL) program, IARPA is revolutionizing the way the IC consumes foreign language information.

By using machine learning to turn multilingual text and speech media into usable intelligence information for analysts, regardless of their language expertise, MATERIAL is substantially reducing the need for human translation.

“The MATERIAL program has really altered the landscape by making it possible for anyone to efficiently find information in low resource languages,” said MATERIAL Program Manager Dr. Carl Rubino. “This is a game-changer for the IC, revolutionizing the way we access important foreign language data.”    

When the program launched in October 2017, MATERIAL performers, including Johns Hopkins University, Raytheon BBN Technologies, Columbia University and the University of Southern California Information Sciences Institute, were charged with building robust, automated language capabilities over a four-year period. MATERIAL’s ultimate goal was to build Cross-Language Information Retrieval (CLIR) systems that would find speech and text content in diverse lower-resource languages, using only English search queries, and succinctly relay the retrieved relevant foreign language information in English. Performers exceeded expectations and have successfully done just that.
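To make the CLIR idea concrete, here is a toy sketch in Python of the general pattern: an English query is mapped into the foreign language, matching documents are ranked, and an English gist is returned. It is not the MATERIAL performers' method. The glossary, the sample documents, their English gists and the function names are all invented for illustration, and a real system would rely on trained speech recognition and machine translation models rather than a lookup table.

```python
from collections import Counter

# Hypothetical English-to-Swahili glossary (illustration only). A real CLIR
# system would use trained translation models, not a hand-written table.
GLOSSARY = {
    "water": ["maji"],
    "school": ["shule"],
    "market": ["soko"],
}

# Invented toy collection: (foreign-language text, short English gist).
# The texts and gists are placeholders, not verified translations.
DOCUMENTS = [
    ("maji safi yanahitajika shule", "Clean water is needed at the school."),
    ("soko jipya limefunguliwa", "A new market has opened."),
    ("shule mpya imejengwa", "A new school has been built."),
]


def translate_query(english_query):
    """Map each English query word to its glossary translations."""
    terms = []
    for word in english_query.lower().split():
        terms.extend(GLOSSARY.get(word, []))
    return terms


def search(english_query):
    """Rank documents by how many translated query terms they contain."""
    terms = translate_query(english_query)
    scored = []
    for text, gist in DOCUMENTS:
        counts = Counter(text.split())
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((score, gist))
    # Highest-scoring documents first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [gist for _, gist in scored]
```

Calling search("water school") on this toy data ranks the document that mentions both terms ahead of the one that mentions only one, and the analyst sees only the English gists.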

In addition to Kazakh, Swahili and Pashto, the CLIR systems performers developed include state-of-the-art automatic speech recognition and machine translation systems and models for other languages such as Tagalog, Somali, Lithuanian, Georgian, Bulgarian and Farsi.  

MATERIAL technologies were recently deployed in SCALE 2021, a multinational Summer Workshop at Johns Hopkins University that is devoted to exploring topics in human language technology. This summer’s topic was Cross-Language Information Retrieval. Using lessons learned and baseline models from the program, SCALE scientists were able to develop customized CLIR capabilities for Chinese, Russian and Farsi.

“I’m thrilled this technology is taking root,” Dr. Rubino said. “With continued IC investment and championship, this relatively novel approach for data discovery should soon be a standard and reliable tool for our analysts.”

Read the announcement at IARPA

Searching for Meaning in ‘The Critical Dictionary of Southeast Asia' at the Crow Museum of Asian Art - NBC 5 Dallas-Fort Worth - Dictionary

The moving image installation by Ho Tzu Nyen is on view through January 30 in the Dallas Arts District.

Crow Museum of Asian Art The Critical Dictionary of Southeast Asia tiger film still
Courtesy the artist and Edouard Malingue Gallery.

Not all dictionaries are books. Ho Tzu Nyen’s The Critical Dictionary of Southeast Asia is a moving image installation that questions generalized ideas about a complicated part of the world. The U.S. premiere of the work is now on view through January 30 at the Crow Museum of Asian Art of The University of Texas at Dallas.

The Critical Dictionary of Southeast Asia is an ongoing project Ho has been developing over several years. “It really just stems from one question: what do we think of when we think of Southeast Asia? How do we define Southeast Asia?” said Dr. Jacqueline Chao, the museum’s chief curator and curator of this exhibition.

A quick glance at a map shows the diversity of the region. Southeast Asia includes 11 countries: Brunei, Cambodia, East Timor, Indonesia, Laos, Malaysia, Myanmar, Philippines, Singapore, Thailand, and Vietnam. “Southeast Asia is an umbrella term to group a large cluster of countries and cultures and regions that are not necessarily unified at all by language, by politics, by religion. They’re all very different,” Chao said.

The Critical Dictionary of Southeast Asia investigates the region’s distinctive qualities. “The beauty of Ho’s work is his thoughtfulness to take this on, deconstructing a term like Southeast Asia is an invitation to see the individual rather than the habitual action of grouping individuals, cultures and meanings,” said Amy Lewis Hofland, Senior Director of the Crow Museum of Asian Art.

Crow Museum of Asian Art The Critical Dictionary of Southeast Asia film set-up
Chad Redmon
Viewers sit on socially distanced benches to watch the constantly shifting video.

The constantly changing video features a series of images representative of specific keywords and concepts significant to Southeast Asian culture. “It’s a mixture of found footage from the internet, movies, pop culture. He’s also looking at terms from mythology, legends, mythical creatures like the weretiger or other things that have an interesting twist in a Southeast Asian context,” Chao said.

The keywords, including anarchism, buffalo, corruption, decay, epidemics, forest, ghosts, and humidity, are organized alphabetically. An algorithm created by the artist with software developer Jan Gerber and media artist Sebastian Lütgert generates different permutations with every screening.

The video is narrated in English with texts and notes Ho accumulated through extensive research. The narration varies from a whisper to a quick-tempo repetition. Singaporean musician and vocalist Bani Haykal sings a series of Ho’s notes, combining vivid imagery, music and spoken text.

The video is accompanied by an LED light installation. “The computer itself will trigger flashes behind the screen that will wash out the image periodically. It’s randomized when the lights do that,” Chao said.

The random flashes ensure the viewer never becomes a passive observer while sitting on socially distanced benches. “Some of that is to think about our desensitization to media and how we absorb a lot of information,” Chao said.

The video runs on an infinite loop with the alphabet beginning again just when it ends. The concept of a dictionary as a video defies the construct of a tangible dictionary and the definitions within it. “I think what he’s trying to do is mess with that a little bit,” Chao said. “It’s the questioning of truth and fact, what we know and what we think we know.”

Crow Museum of Asian Art The Critical Dictionary of Southeast Asia index
Chad Redmon
Visitors can examine texts related to the work as well as an index of the dictionary terms.

The exhibition features additional texts related to the work, including an index of dictionary terms and a selection from Ho’s research notes.

Ho continues to develop this project, creating more nuanced definitions of a multifaceted region. “Southeast Asia is a region of dazzling heterogeneity, characterized by an unruly plurality of languages, ethnicities and belief systems, and this project can, in some sense, be regarded as an attempt to find, or to create, a form for this region which is not one,” Ho said.

The Crow Museum of Asian Art’s exhibition is the fullest iteration of the project, enveloping the viewer in an immersive reading of a dictionary. “It’s a full sensory experience,” Chao said.

Learn more: https://ift.tt/2XBsasW


Word for word: A dictionary to take Sanskrit to the world - Hindustan Times - Dictionary

What does it take to condense an ancient language into a modern dictionary? Wknd sits down with Shashi Bala, who is heading a unique effort.
The English-to-Sanskrit dictionary Bala is working on will be published by the Bharatiya Vidya Bhavan foundation.
Updated on Oct 23, 2021 03:28 PM IST
By Dipanjan Sinha

Shashi Bala, 65, has been steeped in Sanskrit for decades, and soon she’ll have a unique dictionary to show for her efforts.

The multi-volume work — one volume is being printed, two are nearly done, and there will be a total of seven or eight volumes and 3,200 pages in all — is perhaps the most extensive of her projects. But Bala has been fascinated with Sanskrit since she was 15, and began studying the language and its cultural influence across Asia, as well as contributing scholarship herself, soon after.

“I set out wanting to be a doctor, like my aunt,” Bala says, laughing. It was her father who introduced her to the work of linguist and scholar Raghu Vira (1902-1963). Her father had a printing press in Delhi, and many of Vira’s books were published there.

Bala first read Vira’s work as a 15-year-old, in 1965. “It was just fascinating to know that the Sanskrit language had spread to countries like China, Japan and Mongolia so many centuries ago. That was how the seed of scholarship was sown in my mind by my father,” Bala says.

In college, she opted to study Sanskrit. In 1977, she began her life in research, with an MPhil project on Sanskrit grammatical texts from Indonesia.

“The research opened up a fantastic world. There is Sanskrit poetry from Indonesia, Buddhist literature, and their versions of the Ramayana and Mahabharata,” she says. “According to the Vietnamese Ramayana, Rama after his victory over Ravana does not go to Ayodhya but goes to Vietnam via Thailand. The geography, flora and fauna described in this Ramayana is of these countries, not of India,” she says.

Tracking Sanskrit across the region involved some fascinating journeys. As part of the research for her PhD in Vedic deities in Japanese Buddhist art, Bala travelled to Japan. Tracing the Sanskrit and Brahmi scripts along the Silk Route in 2016, she traversed the Taklamakan desert and ended up in Shanghai. Revisiting the route that took Buddhism to Japan, she made her way through China. “I must say that the Chinese government has done stellar work in the upkeep of this history,” Bala says.

Through it all, the dictionary was waiting its turn. The entries were first compiled by Vira in the 1940s. His son Lokesh Chandra, also a Vedic scholar, was then closely associated with the project. In 2013, on Chandra’s encouragement, Bala dedicated herself to making the hundreds of thousands of entries print-ready and seeing the dictionary into publication. This involved checking, editing and reorganising the material.

Complexity was one challenge; sheer volume, another. In terms of complexity, for each English verb, for instance, Sanskrit has numerous equivalents, each with its own nuance. In terms of volume, “the material fills 114 boxes and about 300,000 index cards,” Bala says.

So far, there has been one known English-to-Sanskrit dictionary, created by the 19th-century British scholar Monier Monier-Williams. “But his purpose, his way of studying and understanding the culture that this language comes from, is very different from Raghu Vira’s and ours,” Bala says. “This project could dramatically change the way the world and India reads its ancient history and culture.” It could also make that history and culture more accessible to its inheritors and scholars in India and around the world.

The dictionary is being published by the Bharatiya Vidya Bhavan foundation and will be priced at 1,800 per volume. Once the first print run is completed, a digital version will also be available online.

“I am excited to be able to bring to the world the work of a fabulous scholar like Raghu Vira. This dictionary should help people trying to understand Sanskrit and its influence immensely,” Bala says.


Friday, October 22, 2021

Vatican Issues Decree Clarifying Responsibilities for Translation of Latin Liturgical Texts - National Catholic Register - Translation

Archbishop Roche underlined that the translation of liturgical texts is “a great responsibility” because “the revealed word can be proclaimed and the prayer of the Church can be expressed in a language which the people of God can understand.”

VATICAN CITY — The Vatican issued a decree on Friday guiding bishops’ conferences on the proper protocol for the translation of liturgical texts from Latin into vernacular languages.

Published on Oct. 22, the feast of St. John Paul II, the decree, called Postquam Summus Pontifex, clarifies changes already made by Pope Francis to the process of translating liturgical texts.

The decree from the Congregation for Divine Worship builds on a motu proprio Pope Francis issued in September 2017 shifting responsibility for the revision of liturgical texts toward bishops’ conferences. 

The motu proprio, Magnum Principium, modified Canon 838 of the Code of Canon Law, which addresses the authority of the Vatican and national bishops’ conferences in preparing liturgical texts in vernacular languages.

The decree implementing this change to canon law comes four years after Pope Francis’ motu proprio was first published and a few months after the appointment of Archbishop Arthur Roche as the prefect of the Congregation for Divine Worship, succeeding Cardinal Robert Sarah.

“Fundamentally the aim is to make collaboration between the Holy See and the bishops’ conferences easier and more fruitful,” the 71-year-old English archbishop said in an interview with Vatican News.

“The great task of translation, especially translating into their own languages what we find in the liturgical books of the Roman Rite, falls to the bishops.”

Archbishop Roche, who also published a commentary on the new decree, underlined that the translation of liturgical texts is “a great responsibility” because “the revealed word can be proclaimed and the prayer of the Church can be expressed in a language which the people of God can understand.”

With the 2017 motu proprio, the text of Canon 838 changed to read: “It is for the Apostolic See to order the sacred liturgy of the universal Church, publish liturgical books, recognize adaptations approved by the episcopal conference according to the norm of law, and exercise vigilance that liturgical regulations are observed faithfully everywhere.”

The text of the following paragraph added that it was the responsibility of bishops’ conferences “to approve and publish the liturgical books for the regions for which they are responsible after the confirmation of the Apostolic See.”

The new decree from the Congregation for Divine Worship presents the norms and procedures to be taken into account when publishing liturgical books. 

It says that the Holy See remains responsible for reviewing the adaptations approved by bishops’ conferences and confirming the translations that are made.

“This reform of Pope Francis aims to underline the responsibility and competence of the bishops’ conferences, both in assessing and approving liturgical adaptations for the territory for which they are responsible, and in preparing and approving translations of liturgical texts,” Archbishop Roche said.

“The bishops, as moderators, promoters, and custodians of liturgical life in their particular church, have a great sensitivity, due to their theological and cultural formation, which enables them to translate the texts of Revelation and the Liturgy into a language that responds to the nature of the People of God entrusted to them,” he said.


New edition of landmark English-Yiddish Dictionary includes “lockdown” and “breakout room” - Forward - Dictionary

The world has changed massively in five years – from new political movements to a global pandemic, and now your Yiddish can keep up with it.

Five years after the Comprehensive English-Yiddish Dictionary was first published, a revised and expanded second edition has arrived. Both editions, edited by Gitl Schaechter-Viswanath and Dr. Paul Glazer, were published by the League for Yiddish.

The source for many of the terms listed in the dictionary was 87 card catalogs and shoeboxes of Yiddish words and phrases compiled by the late lexicographer Dr. Mordkhe Schaechter, with the intention of publishing the first English-Yiddish dictionary since Uriel Weinreich’s classic one was published more than fifty years ago. But Dr. Schaechter passed away before completing his life’s work, so his daughter, Schaechter-Viswanath, and Yiddish linguist Glazer took on the challenge of finishing the task.

Following the first edition’s widely-hailed release in 2016, which included a glowing review by the New York Times, the dictionary became the new standard for anyone searching for the answer to the question: “How do you say that in Yiddish?”

The second edition enables Yiddish speakers to update their vocabulary to stay current in today’s changing world. It comprises more than 84,000 entries, with nearly 1,000 additional words and expressions, including new contemporary terms from the fields of technology, science, and politics.

Among the new terms are Yiddish translations for “to be in lockdown” – zayn farshpart and “breakout room” – der baytzimer.

To order the dictionary, click here.


‘Climate crisis’ has made it into the Oxford English Dictionary - Grist - Dictionary

As formerly green forests turn into charred remains and glaciers melt away to reveal bare mountainsides, the effects of climate change on the landscape are hard to miss. But there are less obvious results, too, as our conversations adapt to a rapidly changing climate, ushering in new words.

In a special update this month, the Oxford English Dictionary reviewed the scope of this “rapidly changing area of vocabulary” encompassing words and phrases like eco-anxiety, net-zero, and climate strikes. The dictionary’s editors updated old entries and added new ones ahead of the U.N. climate summit in Glasgow, Scotland next week, where world leaders will meet to hash out their climate pledges. Among the new entrants: global heating, food insecurity, and climate crisis.

The update reflects the urgency and the often complicated emotions that people feel when confronted by rising seas, worsening floods, and hotter temperatures. The editors picked eco-anxiety — “apprehension about current and future harm to the environment” — to make its dictionary debut, a signal of climate change’s psychological toll. According to Google Trends, search interest for climate anxiety has gone up 565 percent over the past year.

Even the name for climate change itself has undergone some adjustment as people have begun to use more intense language to describe what they see happening. The phrase climate crisis, which appeared in the dictionary for the first time this month, became 20 times more popular from 2018 to 2020, and use of climate emergency increased 76-fold, the OED found. The phrase greenhouse effect, popular back in the ’90s, has dropped by the wayside; the once-common global warming has also gradually fallen out of favor.

Language nerds love the Oxford English Dictionary because it attempts to trace words back to their origins and documents how their meanings have changed over time. Today, the phrase climate refugee refers to someone who has been forced to relocate in response to rising seas, wildfires, drought, or other environmental disasters. But the OED places climate refugee’s entrance into the lexicon back in 1889, when the phrase was a disparaging name for someone who moved somewhere for a milder or more pleasant climate. (“He is a climate refugee from the frigid east, and is looking for a home under genial skies of Southern California,” read an Indiana newspaper article in 1911.)

While the dictionary update includes some downers — including mass extinction — it also reflects a growth spurt in solutions. Words related to electric vehicles are gaining ground as drivers talk about smart charging their vehicles to optimize their battery life and report range anxiety that the battery will run out before they finish their journey. 

The phrases renewable energy and fossil fuels are both increasing in use, according to the OED. However, the words used alongside fossil fuels are becoming more negative in tone (divestment, phasing out, and transition), reflecting the need to cut greenhouse gas emissions.

In what might cause a chemistry class flashback for some, the OED decided that CO2 — aka carbon dioxide, the main greenhouse gas heating up the planet — merited its own entry, since people have started to throw it around in the same casual way they talk about H2O.

