Saturday, May 20, 2023

Machine translations fail because we talk about violence through silence, says writer Meena Kandasamy - The Hindu - Translation

Since the research release of ChatGPT in late November 2022, the number of jobs on the AI hit list has grown exponentially, at least in the public imagination. Writers, artists, and poets have been forced to reflect upon their replaceability; even Google Search cannot escape scrutiny.

But as users discovered that ChatGPT also produced quick and relatively accurate translations (though it prefers the romanised version of the source language), the human translator joined the list of professionals who may one day be threatened by evolving models like ChatGPT.

Translator, writer, and poet Dr. Meena Kandasamy spoke to The Hindu’s Sahana Venugopal about translating the Kamattu-p-pal couplets (kurals) of the roughly 2,000-year-old Tirukkural, the ChatGPT-human paradigm, a place for AI in the world of linguistics, and those words that no machine can translate.

Edited excerpts from the conversation:

In The Book of Desire (Galley Beggar Press, 2023) you translated the kurals of the Kamattu-p-pal, exploring love, romance, heartbreak, and human connection. We asked ChatGPT to translate a few of these kurals to see what it did differently from you. How would you rate the result?

Tamil original [Kural 1081]: 

அணங்குகொல் ஆய்மயில் கொல்லோ கனங்குழை 

மாதர்கொல் மாலும்என் நெஞ்சு.  

Romanised:

aṇaṅkukol āymayil kollō kaṉaṅkuḻai

mātarkol mālum eṉ neñcu

Dr. Kandasamy’s translation:

My heart is tossed about:

Is she the lusty she-devil

A flamboyant peacock

Lady of heavy earrings?

ChatGPT’s translation:

“Will the wild elephant, caught in a pit, befriend

The man who dug it, and who knows not how to mend?”

Meena Kandasamy: It’s wrong. How did it translate this? Why the ‘wild elephant’.... Because ‘Anangu’ actually still means a dangerous woman, a dangerous goddess. There’s no word for ‘caught in a pit’ there. There’s no idea about the man who dug it. I think it’s possibly just written a couplet by itself. So this one is really wrong.

Tamil original [Kural 1082]:

நோக்கினாள் நோக்கெதிர் நோக்குதல் தாக்கணங்கு 

தானைக்கொண் டன்ன துடைத்து.  

Romanised:

nōkkiṉāḷ nōkketir nōkkutal tākkaṇaṅku

tāṉaikkoṇ ṭaṉṉa tuṭaittu

Dr. Kandasamy’s translation:

She looks, her look

A face-off to mine—

Looks like she has brought along

a shocktroop of terrifying goddesses.

ChatGPT’s translation:

“Looking at what is seen, looking at the viewer and looking at the act of seeing, the seer himself is confused.”

MK: I’m sorry, but this one is way off the mark. There’s nothing that refers to the ‘seer’ here, and the translation doesn’t catch anything about either the goddess or the attacking army she’s brought with her, or the fact that there’s a woman involved—because ‘nōkkiṉāḷ’ characterises the female gender—or about ‘nōkketir’, which is ‘direct,’ ‘eye-to-eye.’ If this was submitted by my student, it would be getting fail marks.

Translating gender in Tamil
The letters ள் [l] or ன் [n], added to the end of words, help listeners understand whether the speaker is talking about a woman or a man carrying out an action. Tamil also uses the gender-neutral ர் [r] to show respect.
Here, Dr. Kandasamy notes that ChatGPT has taken the original word ‘நோக்கினாள்,’ (nōkkiṉāḷ) which refers to a woman gazing at something, and erased her gender in the AI-translated version.

ChatGPT’s translation of a Kural from ‘The Book of Desire’ | Photo Credit: Screenshot from ChatGPT

Tamil original [Kural 1083]:

பண்டறியேன் கூற்றென் பதனை இனியறிந்தேன் 

பெண்டகையால் பேரமர்க் கட்டு.  

Romanised:

paṇṭaṟiyēṉ kūṟṟeṉ pataṉai iṉiyaṟintēṉ

peṇṭakaiyāl pēramark kaṭṭu

Dr. Kandasamy’s translation:

Once, I never knew of that

which is called Death. Now

I know. It wages war with

A woman’s striking eyes.

ChatGPT’s translation:

“I have searched and found the treasure, a beloved who is a gem among women.”

MK: And this one is my favourite kural, the third one. [after reading] So, ChatGPT has written a kural all by itself.

SV: Can you perhaps see or offer a suggestion as to where these mistakes are coming from?

MK: I think what it has done is that it understands how kurals work because it’s possibly got within its data set, ‘how kurals work.’ So it understands one or two words and goes on its own trajectory. This is a colossal failure, but at the same time, it’s made up these translations that look almost like they have some meaning. So all of these could pass off as kurals, except they are not translations of the kurals.

SV: Is this specifically because of the Tamil language? I have seen people using ChatGPT for French-to-English or Spanish-to-English translations and they come out a lot better.

MK: That’s interesting because my partner is both a translator and an interpreter, and he translates his [French] articles on Google Translate and then fixes the mistakes. And he’s been doing this for ages before ChatGPT because it’s so good, it’s almost ready to be used—but it’s not the same for us, right? It’s not the same at all for people using Indian languages. 

But I think it might be interesting because at some point, if ChatGPT can recognise Tamil and get as much data as it has for other languages in the Roman script, for Indian languages—not just of contemporary writers but of scholarly or ancient sangam language writing—if it gets that kind of data, I think it would just do a better job.

It would know what it’s talking about, because it’s not like Tamil isn’t accessible. There are wonderful lexicons online [and] all of these words that Valluvar used are alive. I would think that 60-70% are still in daily use, 10% have possibly fallen out of usage, and 20% are words we don’t commonly use. Because if someone like me can translate them, it means it’s very accessible.

SV: So you believe that as long as ChatGPT accesses more Tamil data, its translations will get better?

MK: We cannot get carried away by this, but at some point, automation will keep this language alive. I think it’s going to help languages, even though there are examples like this [the kural translations], but this is something that can be rectified. I don’t think this is a ChatGPT problem as much as a lack of data and some design construct within ChatGPT that wants to mimic structures. The content doesn’t match, but the style is mimicry of the Tirukkural.

SV: Moving from Tiruvalluvar’s time to our own, would you ever be willing to edit AI-translated copy in the future?

MK: Badly written, badly translated stuff exists as much by humans as by machines. So I don’t have any natural bias against machines.

SV: On the other hand, would you use ChatGPT to help you in your own translation work and save time?

MK: I do translation all the time, with my own writing. At some point there are things I feel very strongly about, that exist in the space between Tamil and English, and I translate myself all the time, hunting for this English word because I write in English but I’m a Tamil woman at the heart of it. So for me, that translation is a natural process, it’s a constant process. So why would I use ChatGPT? The need wouldn’t arise.

SV: You have translated works that cover gender-based violence and trauma. Are there some translations that you would never allow a tool like ChatGPT to ever touch?

MK: War crimes testimony. . . no, I wouldn’t use something like that [ChatGPT]. I wouldn’t let anybody else translate because these are issues about violence, trauma, rape. I think it’s very cultural. For example, women who have been through rape or women who have been asked to recount violence hardly use the words for these. They wouldn’t call the rape “rape.” And it’s very re-traumatising for them so they wouldn’t describe what happened, but they would talk about it in a non-existential sense. I remember meeting a woman who just used the phrase “it happened.” So “it” meant what should not have happened: rape. But to understand this, it takes somebody from the culture, somebody who knows what has happened. A machine cannot guess that. A machine cannot guess what that silence is.

I think you either have to be a victim or you have to be a witness, or you have to have some woman’s empathy to know what’s been talked about, so I think a machine would be a failure there, because we talk about violence through silence. Because that’s how women have been trained and also that’s what I think violence is trained to do. It’s trained to break down your language, to break down your power over language. . . to make you not access these aspects of language. So much of language is unspoken.

SV: If a creator or writer you admired went on to publish a novel with the help of ChatGPT, can you see yourself reading it or supporting them?

MK: I think the thing with being a writer is that you’re so full of yourself and you’re so in love with your own individuality that you would never do that, right? Next to politicians, or possibly a little more than politicians, writers are the most narcissistic people in the world. There’s no way they are going to put a byline with a ChatGPT story—like, what kind of a writer are you? I don’t think any writer worth her salt would do that, because we are so full of ourselves.

ChatGPT’s translation of a Kural from ‘The Book of Desire’ | Photo Credit: Screenshot from ChatGPT

The romanisation and translation of the Kurals are taken from a public excerpt of ‘The Book of Desire’ (2023), published by Galley Beggar Press.

*Disclaimer: AI-powered chatbots are prone to a phenomenon known as “hallucination,” where they generate logical-sounding yet completely false answers. For this reason, a response generated by an AI chatbot cannot be taken at face value as fact. This report was researched using the February and March versions of ChatGPT.
