Wednesday, May 25, 2022

Translate scanned PDF documents with Document translation - Microsoft - Translation

Today, the Document translation feature of Translator, a Microsoft Azure Cognitive Service, adds the ability to translate PDF documents containing scanned image content, eliminating the need for customers to preprocess them through an OCR engine before translation.

Document translation was made generally available last year, May 25, 2021, allowing customers to translate entire documents and batches of documents into more than 110 languages and dialects while preserving the layout and formatting of the original file. Document translation supports a variety of file types, including Word, PowerPoint and PDF, and customers can use either pre-built or custom machine translation models. Document translation is enterprise-ready with Azure Active Directory authentication, providing secured access between the service and storage through Managed Identity.

Translating PDFs with scanned image content is a highly requested feature from Document translation customers. Customers find it difficult to automatically separate PDF documents that contain regular text from those that contain scanned image content. This creates workflow issues, as customers have to route PDF documents with scanned image content to an OCR engine before sending them to Document translation.

Document translation now has the intelligence:

  • to identify whether the PDF document contains scanned image content or not,
  • to route PDFs containing scanned image content to an OCR engine internally to extract text,
  • to reconstruct the translated content as regular text PDF while retaining the original layout and structure.

Font formatting such as bold, italics, underline, and highlights is not retained for scanned PDF content, as OCR technology does not currently capture it. However, font formatting is preserved when translating regular text PDF documents.

Document translation currently supports PDF documents containing scanned image content from 68 source languages into 87 target languages. Support for additional source and target languages will be added in due course.

Now it’s easier for customers to send all PDF documents to Document translation directly and let it decide when and how to use the OCR engine efficiently.

For customers already using Document translation, no code change is required to use this new feature. PDF documents with scanned content can be submitted for translation like any other supported document format.

We are also pleased to announce that Document translation adds support for scanned PDF document content at no additional charge to customers. Two pricing plans are available for Document translation through Azure: the Pay-as-you-go plan and the D3 volume discount plan for higher volumes of document translation. Pricing details can be found at aka.ms/TranslatorPricing.

Learn how to get started with Document translation at aka.ms/DocumentTranslationDocs.
Send your feedback to mtfb@microsoft.com.

Tuesday, May 24, 2022

Research translation, innovation updates top BOT's May meeting - The Well : The Well - The Well - Translation

News about University research — how it’s being done and how it’s applied to solve problems — dominated the Board of Trustees meeting May 18-19.

The Office of Undergraduate Research showed how important undergraduates are in making new discoveries in its May 19 presentation. Gabriella Hesse ’22, now a School of Medicine student, presented her research about sex-related tendencies in the development of certain brain diseases. Lauren McShea, a rising sophomore majoring in environmental health sciences, showed how she helped develop low-cost ways to monitor well water for harmful bacteria.

Because so many Carolina faculty are involved in research, “our undergraduates get to be in proximity to that, to take part in that,” said Troy Blackburn, associate dean for undergraduate research. “It’s learning by doing. It’s taking classroom knowledge and using it to solve problems.”

With the full implementation of the new IDEAs in Action curriculum this fall, about 19,000 undergraduates will be required to engage in original research to meet the new research and discovery requirement, Blackburn said.

Student Gabriella Hesse, left, and teaching associate professor Sabrina Robertson share research done on Parkinson’s disease. (Jon Gardiner/UNC-Chapel Hill)

Next steps in research

Provost and Chief Academic Officer J. Christopher Clemens spoke about helping researchers develop their work when asking that the Institute for Convergent Science, based in the College of Arts & Sciences since 2017, become a pan-University, interdisciplinary center.

“Our faculty are very good at sponsored research,” Clemens, the institute’s founding director, told the University Affairs committee. But basic research skills are very different from those required for building a company. “We see ICS as a bridge that helps faculty navigate the pathways they must go through if they’re going to take research from the lab and out into the world.”

The institute, located in the Genome Sciences Building, operates in a three-lane research-to-market process called “Ready, Set, Go.” The middle lane is the newest to the University. “It’s called pre-commercial development,” Clemens said. “It awards money based on proposals,” allowing researchers to continue to develop their ideas without having to take entrepreneurial risks.

“It needed to be elevated. It will help us recruit faculty who will grow the research infrastructure and support other initiatives,” said Chancellor Kevin M. Guskiewicz.

The board approved the institute, which will be led by Gregory P. Copenhaver, Chancellor’s Eminent Professor of Convergent Science and associate dean of research and innovation in the College of Arts & Sciences.

Trustee Vinay B. Patel called the proposed downtown Innovation District “very exciting news, not just for the University but for the entire region.” (Jon Gardiner/UNC-Chapel Hill)

On to innovation and commercialization

Researchers who are ready to be entrepreneurs can call upon the many resources of Innovate Carolina.

“The gap between research and discovery and impact is wide, long, resource-intensive and risky. And this is the place where Innovate Carolina sits,” Michelle Bolas, the University’s chief innovation officer and executive director of Innovate Carolina, told the Strategic Initiatives committee.

One of the department’s recent successes is approval of the 20,000-square-foot Innovation Hub at 136 E. Rosemary St. The downtown Chapel Hill space is being renovated as a new home for Innovate Carolina and a startup accelerator, with co-working and meeting spaces.

The Innovation Hub, scheduled to open in April 2023, and the redevelopment of Porthole Alley will be key components of a proposed downtown Innovation District. “We will be one of the only leading universities with an Innovation District at the edge of our campus, on our front door,” Bolas said.

Trustee Vinay B. Patel called the development “very exciting news, not just for the University but for the entire region,” when he presented an update to the full board.

Board of Trustees Chair David L. Boliek Jr. responded that the new district shows “this board’s commitment to economic development and the vibrancy of Chapel Hill and the 100 block of Franklin Street.”

Guskiewicz announced another tangible result of innovative research attracting funding in his meeting remarks — a $65 million award from the National Institute of Allergy and Infectious Diseases to the UNC Gillings School of Global Public Health. The grant will establish the Antiviral Drug Discovery Center to develop oral antivirals that can combat pandemic-level viruses like COVID-19. The center builds upon UNC’s Rapidly Emerging Antiviral Drug Development Initiative.

Incoming chief of UNC Police Brian James addresses the board during the UNC Board of Trustees full board meeting May 19. (Jon Gardiner/UNC-Chapel Hill)

A new slate of campus leaders

Guskiewicz introduced trustees to four recently hired members of his leadership team:

  • Janet Guthmiller, new dean of Adams School of Dentistry and Claude A. Adams Distinguished Professor, effective Oct. 15.
  • James W.C. White, new dean of the College of Arts & Sciences, effective July 1.
  • Brian James, new chief of UNC Police, effective July 1.
  • Amy McConkey, new director of state affairs.

Not in attendance but also mentioned in the chancellor’s remarks was Valerie Howard, new dean of School of Nursing, effective Aug. 1.

Another new leader at the May meeting was Student Body President Taliajah Vann, who took the oath of office to become an ex officio member of the board for the next year. “I am excited to work within this space,” Vann said.

In addition to the Institute for Convergent Science, the trustees voted to approve:

Trustees also received the following reports:

  • A budget update and a plan to implement OneStream software as the new campus-wide budget tool, from Nathan Knuffman, vice chancellor for finance and operations.
  • An overview of the Office of Institutional Integrity and Risk Management, from George Battle, vice chancellor for institutional integrity and risk management.
  • Remarks from Katie Musgrove, Employee Forum chair, who reminded trustees that staff are struggling because of the Great Resignation and a “plague of lingering vacancies” that have left them “overtasked and burned out.”

Monday, May 23, 2022

Meta Tries Making Human Evaluation of Machine Translation More Consistent - Slator - Translation

Although automatic evaluation metrics, such as BLEU, have been widely used in industry and academia to evaluate machine translation (MT) systems, human evaluators are still considered the gold standard in quality assessment.

Human evaluators use quite different criteria when evaluating MT output. These are determined by their linguistic skills and translation-quality expectations, exposure to MT output, presentation of source or reference translation, and unclear descriptions of the evaluation categories, among others.

“This is especially [problematic] when the goal is to obtain meaningful scores across language pairs,” according to a recent study by a multidisciplinary team from Meta AI that includes Daniel Licht, Cynthia Gao, Janice Lam, Francisco Guzman, Mona Diab, and Philipp Koehn.

To address this challenge, the authors proposed in their May 2022 paper, Consistent Human Evaluation of Machine Translation across Language Pairs, a novel metric. Called XSTS, it is more focused on meaning (semantic) equivalence and cross-lingual calibration, which enables more consistent assessment.

Adequacy Over Fluency

XSTS — a cross-lingual variant of STS (Semantic Textual Similarity) — estimates the degree of similarity in meaning between source sentence and MT output. The researchers used a five-point scale, where 1 represents no semantic equivalence and 5 represents exact semantic equivalence.

The new metric emphasizes adequacy rather than fluency, mainly because assessing fluency is much more subjective. The study noted that subjectivity leads to higher variability and that the preservation of meaning is a pressing challenge in many low-resource language pairs.

The authors compared XSTS to Direct Assessment (i.e., the expression of a judgment on the quality of MT output using a continuous rating scale) as well as some variants of XSTS, such as Monolingual Semantic Textual Similarity (MSTS), Back-translated Monolingual Semantic Textual Similarity (BT+MSTS), and Post-Editing with critical errors (PE).

They found that “XSTS yields higher inter-annotator agreement compared [to] the more commonly used Direct Assessment.”

Cross-Lingual Consistency

“Even after providing evaluators with instruction and training, they still show a large degree of variance in how they apply scores to actual examples of machine translation output,” wrote the authors. “This is especially the case, when different language pairs are evaluated, which necessarily requires different evaluators assessing different output.”

To address this issue, the authors proposed using a calibration set that is common across all languages and consists of MT output and corresponding reference translation. The sentence pairs of the calibration set should be carefully selected to cover a wide quality range, based on consistent assessments from previous evaluations. These scores can then be used as the “consensus quality score.”

Evaluators should assess this fixed calibration set in addition to the actual evaluation task. Then the average score each evaluator gives to the calibration set should be calculated.

According to the authors, “The goal of calibration is to adjust raw human evaluation scores so that they reflect meaningful assessment of the quality of the machine translation system for a given language pair.”

Because the calibration set is fixed, its quality is fixed, and every evaluator should ideally assign it the same average score. Hence, the difference between the average score an evaluator assigns to the set and the official fixed score can be used to adjust that evaluator’s scores.

“If this evaluator-specific calibration score is too high, then we conclude that the evaluator is generally too lenient and their scores for the actual task need to be adjusted downward, and vice versa,” explained the authors.

For example, if the consensus quality score for the calibration set is 3.0 but an evaluator assigned it an average score of 3.2, then 0.2 should be deducted from all of that evaluator’s scores for the actual evaluation task.
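
The adjustment described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example, not the authors' implementation: the function names `calibration_offset` and `calibrate` and the sample scores are hypothetical, and the sketch assumes a simple additive shift with no clipping to the 1-to-5 scale.

```python
from statistics import mean


def calibration_offset(calibration_scores, consensus_score):
    """Return how far an evaluator's mean calibration score sits above the consensus.

    A positive value suggests a lenient evaluator, a negative value a strict one.
    """
    return mean(calibration_scores) - consensus_score


def calibrate(task_scores, calibration_scores, consensus_score):
    """Shift every task score by the evaluator's calibration offset (assumed additive)."""
    offset = calibration_offset(calibration_scores, consensus_score)
    return [score - offset for score in task_scores]


# Hypothetical evaluator: averages 3.2 on a calibration set whose consensus score is 3.0.
evaluator_calibration = [3, 4, 3, 3, 3]
task_scores = [4, 2, 5]

adjusted = calibrate(task_scores, evaluator_calibration, consensus_score=3.0)
print([round(score, 2) for score in adjusted])  # [3.8, 1.8, 4.8]
```

A strict evaluator is handled by the same code: an average of 2.8 on the calibration set gives an offset of -0.2, so 0.2 is added to each of that evaluator's task scores.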

The authors concluded that the calibration leads to improved correlation of system scores to subjective expectations of quality based on linguistic and resource aspects, as well as to improved correlation with automatic scores, such as BLEU.

Sunday, May 22, 2022

Translation efforts of Jehovah's Witnesses reach Alabama residents in the language of their Hearts - Elmore Autauga News - Translation

From Jehovah’s Witnesses of the United States of America Organization

‘Straight to the Heart’: Unprecedented Translation Work brings Words to Life for Millions

Jin Gim (pronounced Kim) joined his family in the United States when he was 28 years old. Montgomery resident Gim was born and raised in Seoul, South Korea and did not speak English. However, prior to his arrival, he had been baptized as one of Jehovah’s Witnesses. The organization first began publishing the Watchtower and Awake! magazines in the Korean language in 1952. Today, in the Republic of Korea there are 1,270 congregations. However, in Alabama, there is only one.

Says Gim, “I knew very few words of English when I first came to the U.S., and it’s still hard for me. It’s really awesome and fantastic that I can attend meetings and study Christian publications in my own language.”

Why do Jehovah’s Witnesses put so much effort into translation, including for some smaller language groups?

“We understand that a region’s official language may not be the language of a person’s heart,” said Robert Hendriks, the U.S. spokesperson for Jehovah’s Witnesses.

In the United States alone, some 67 million residents speak a language other than English at home.

According to UNESCO, education based on the language one speaks at home results in better quality learning, fosters respect and helps preserve cultural and traditional heritage. “The inclusion of languages in the digital world and the creation of inclusive learning content is vital,” according to its website.

That’s true for all ages and for all types of education.

“Translating spiritually uplifting material into over 1,000 languages takes a considerable amount of time and resources,” said Hendriks, “but we know that reaching a person’s heart with the Bible’s comforting message can only be accomplished if they fully understand it.”

Gim and others in the Korean-language congregation regularly reach out to their Korean-speaking neighbors in Montgomery. Although the door-to-door work of Jehovah’s Witnesses has been suspended since March 2020, they continue their ministry by writing letters and making telephone calls. Gim recalls that before the pandemic, while he was engaged in the door-to-door work, one resident who had recently arrived from Korea remarked, “Oh, my! The Witnesses are here, too!”

Before moving to Alabama, Gim had also assisted Korean-speaking Witnesses in California, North Carolina, and Georgia.

“No matter where I live, my fellow Christian friends are always there to help me,” says Gim. “My family and I have always been welcomed and we know that as Jehovah’s Witnesses, we are all one.”

To learn more about the translation work of Jehovah’s Witnesses, visit https://ift.tt/lHR8MSK

“Turnovers” was the only word in the Celtics’ Game 3 dictionary - Celtics Blog - Dictionary

The Boston Celtics gave away Game 3 against the Miami Heat. Literally.

Their 24 turnovers tied the most in a playoff game this season by any team. The only other squad to record that number of giveaways in a game was the Philadelphia 76ers in Round 1 against the Toronto Raptors. The big difference is, the Sixers won their game.

Boston recorded at least five turnovers in every single quarter of the contest. They had five in the first, six in the second, seven in the third, and five in the fourth. It was a truly incredible feat that prevented them from mounting one of the most improbable comeback victories in postseason franchise history.

Head coach Ime Udoka spoke about the turnovers and said that while the Celtics were able to climb all the way back, they had dug themselves too big a hole to overcome.

“You turn the ball over 24 times and gift them 33 points out of that, you dig yourself a hole. Credit, we fought back and got into a one-point game, they made some mistakes and some more turnovers, but you dig yourself in that big of a hole due to playing in the crowd.”

While the Heat managed to produce 33 points off of Boston’s turnovers, the Celtics only scrounged up nine points on nine Miami turnovers. That’s a 24-point difference in a game that was decided by just six points. Big man Al Horford commented on this as well, stating that “it seemed like every time we put ourselves in a position, we turned it over.”

The Celtics’ carelessness also helped the Heat set a new franchise record. Their 19 steals were the most ever in a playoff game for Miami. It was a team effort, too, as three different players had four steals (Kyle Lowry, Bam Adebayo, and Victor Oladipo), and four others had at least one.

Notching 24 turnovers took a team effort, but Jayson Tatum and Jaylen Brown led the way. The star duo combined for 13 giveaways, as Brown had seven and Tatum six. Brown owned up to his ball-handling issues and explained what he needs to do better.

“Did a s*** job today of taking care of the basketball. But, just being stronger, you know. Driving, I’m gonna keep being aggressive, I’m gonna keep getting to the basket, I’m gonna keep doing what I do, but be stronger when I get in there.”

Brown has had issues dribbling in traffic for most of the postseason, and those problems caught up with him again on Saturday night. Of his seven turnovers, six of them came while he was trying to make a move toward the hoop. The other was an errant pass where he attempted to get the ball to Grant Williams down low.

This play is emblematic of Brown’s struggles. His handle is extremely loose, and even when he gets past an initial defender, he loses control and gives up possession, allowing Miami to get an easy bucket on the other end.

Tatum also talked about the turnovers during his post-game interview. He said that his performance was “unacceptable” and that he left the team hanging with how poorly he played.

“Obviously, they played well from the beginning. But you know, six turnovers and no field goals in the second half, that is unacceptable. I gotta play better. I feel like I left the guys hanging tonight. That’s on me. I acknowledge that.”

While Brown mainly struggled with his handle, Tatum’s turnovers usually occur in a wider variety of ways. However, they can be boiled down to three problems: bad passes, losing the ball in traffic at the rim, and offensive fouls. Below is an example of each one.

The poor passing decisions were a carry-over from Game 1 when Tatum recorded six turnovers by himself in the third quarter. He’s improved as a playmaker a lot this season, but his lazy passes have killed the Celtics in this series.

During his drives into traffic, Tatum just has to make decisions quicker. Miami’s defense is tough, and once he gets too far into the trees, he’s trapped. And as far as his offensive fouls go, Tatum just has to stop extending his arm. He gets called for that constantly, and defenders are able to look for it now, especially smart defenders like P.J. Tucker.

After a rocky start, Boston turned things around in a big way, but the turnovers persisted. As Horford stated, they just gave the ball away whenever they got close. They’ll have an opportunity to bounce back in Game 4, as they have all season, and taking care of the basketball will undoubtedly be the top priority.

Game 4 is set to tip off on Monday night at 8:30 p.m. EST.

Saturday, May 21, 2022

Matthew McConaughey believes a word should be removed from dictionary - KFOR Oklahoma City - Dictionary

Matthew McConaughey Wants His 'Least Favourite Word' Wiped From The Dictionary - LADbible - Dictionary

Matthew McConaughey is a mesmerising storyteller, and he's made a living captivating audiences down the decades. But there's one word you won't catch the actor uttering.

In a video posted on social media, the Magic Mike star decried the word 'unbelievable'.

Now, for a lot of people that's a pretty inoffensive word. In fact, many people - such as Chris Kamara and Gary Neville - seem to like it quite a lot. Not McConaughey, though.

"Unbelievable - it's my least favourite word," he said.

"I think we should wipe it out of the dictionary."

So, why on Earth does the 52-year-old find this word so offensive that he wants it expunged from the English language?

"What's so unbelievable about tragedy, about triumph, about people that raise us up or let us down?" he asked. "It happens every single day.

"We shouldn't think that the most beautiful sunset, or the greatest play, or the greatest love of our life, or the greatest moment of euphoria is unbelievable - believe it. It's happening right in front of you in you.

"We shouldn't feel like the greatest tragedy, or death, or earthquakes, or natural disasters, or loss is unbelievable. It's part of life too, believe it. We see it happen every day.

"So, unbelievable, I don't buy. Awesome, horrible, incredible. I believe those. That's a good way to explain things, but unbelievable. Nah, it just happened. Believe it."

McConaughey doesn't find anything unbelievable. Credit: Alamy

Fair dos Matthew, you've convinced me - though with that charisma he could convince me of just about anything.

In fact, he recently did convince me of a pretty unbelievable story, or hard to believe story, I should say.

As you can see, for a bloke in his fifties McConaughey sports an enviable head of hair, even though he'd started thinning before the turn of the millennium.

Indeed, as he recounts in his memoir Greenlights, the hair loss got so bad that he decided to shave his head.

But somehow - without hair transplants - it grew back.

"I get this topical ointment and I rub it into my scalp, once a day for 10 minutes," he told LADbible.

"I was fully committed, I was fully committed to it - no Propecia, no nothing, it was just manual labour.

"All I can tell you is it came back. I have more hair now than I had in 1999."

Astonishing. Dumbfounding. Unreal... But not unbelievable.
