As the press and the internet buzz about ChatGPT and many experts express everything from amazement to dismay, the translation industry ponders these advances with a degree of pragmatism, since AI already changed the game in this field a decade ago. The transformation of the profession happened over time, and the industry has progressively adopted the advances of machine translation.
While translation agencies such as Version internationale cannot deny the huge potential of AI, they note that its capacities remain limited, as demonstrated by the fact that some of their clients who are most aware of this issue still order traditional translation/localization services.
A brief history
Machine translation has a longer history than one might assume. Its roots are often traced to World War II, when research was conducted to help the secret services decipher coded radio communications from Nazi Germany (encrypted with the Enigma machine). Alan Turing, widely regarded as a founder of computer science, played a key role. The mathematician also paved the way for research later conducted in the US.
The first public demonstration of a machine translation system took place in 1954: the Georgetown-IBM experiment, which translated a small set of Russian sentences into English. Yehoshua Bar-Hillel, then at MIT, was among the first researchers devoted to the field. While the system seemed promising, its output was disappointing. It relied essentially on word-for-word translation, ignoring syntactic structure. In the 1950s and 60s, new avenues were explored, but the results remained unsatisfactory. As machine translation lost its appeal, research was largely set aside, and significant efforts only resumed in the early 90s. The new approaches that emerged, based first on statistical models and later on artificial neural networks and deep learning, proved useful to the translation industry.
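To see why word-for-word translation disappoints, here is a minimal sketch in Python. It is a toy illustration, not a reconstruction of any historical system: the `GLOSSARY` entries and the `translate_word_for_word` function are hypothetical names invented for this example.

```python
# Toy French-English glossary: hypothetical entries chosen for illustration only.
GLOSSARY = {
    "le": "the",
    "chat": "cat",
    "noir": "black",
    "dort": "sleeps",
}


def translate_word_for_word(sentence: str, glossary: dict[str, str]) -> str:
    """Replace each word with its glossary entry, keeping the source word order."""
    words = sentence.lower().split()
    # Words missing from the glossary are passed through unchanged.
    return " ".join(glossary.get(word, word) for word in words)


print(translate_word_for_word("Le chat noir dort", GLOSSARY))
# → the cat black sleeps
```

Every word is translated correctly, yet the result keeps the French word order ("cat black" instead of "black cat"), because nothing in the lookup knows about syntax.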
State of the art and the future of machine translation
While early machine translation systems were built on statistical models using small datasets, large text corpora and increasingly powerful computers led to the development of three essential techniques:
- Neural machine translation: technique based on networks of artificial neurones. This approach can understand more nuances and subtleties than statistical model-based machine translation. Neural machine translation is currently one of the most promising techniques and further progress is expected in coming years.
- Artificial Intelligence: the use of AI in machine translation can solve more complex problems by harnessing Machine Learning algorithms. Systems based on AI can learn to translate by analysing millions of sentences and texts, which improves translation quality over time.
- Deep Learning: this technique, a branch of AI, helps computers learn to recognise complex patterns using networks of artificial neurones. It is very useful, since it helps the computer ‘understand’ linguistic nuances and subtleties and learn from its mistakes. The machine becomes better over time as it is trained.
Together, these systems are revolutionising the world of translation, as well as many industries beyond our profession.
Take ChatGPT, for instance, which is currently making headlines. We can already see how it could change our work habits and even become a daily companion in some fields. Linguists have already adopted machine translation in similar ways in recent years.
How it is used is the fundamental question, and it all comes back to how we value quality. While AI can save time, it is not (yet) able to provide the best quality outputs.
We know that currently, no machine translation can possibly be delivered without at least one stage of editing (at Version internationale, we generally submit projects to two stages of editing). We also know that some types of content cannot be processed by machine translation. In cases where context, cultural aspects, tone, style or the spirit of a text cannot be conveyed by a machine, only humans have the ability to meet the goals.
Why do some tech companies not use machine translation?
For five years now, we have been working for one of the Big Five, and this client has not once asked us to use machine translation, despite having itself originated the OPT-175B language model, which can be used to automate chatbots, write product descriptions and translate text.
Why?
Because for this client, quality and consistent content are of the utmost importance. Because terminology is queen and the linguists working on this account have been trained in the company’s internal jargon. Because these same linguists work together to build the best possible version of the language needed to fit the client’s intentions and be relevant. A machine cannot possibly render these subtleties.
Machine translations are often imprecise and inconsistent. They also generate grammatical and syntactic errors that may compromise the reader’s ability to understand the text. It is easy to understand why companies with the highest standards would choose full human translation when they want to maintain control over the creative process of translation while adhering to strict pre-established rules. Machine translations can to a certain extent bias or corrupt content.
Other reasons also explain this reluctance to use machine translation:
Companies need to adapt their content to specific cultures in order to effectively communicate with their clients and partners abroad. Unlike a native linguist, machine translation cannot grasp the cultural or contextual nuances needed to achieve this adaptation.
Most idiomatic expressions and unusual linguistic constructs require a detailed understanding of the language and context in order to be properly translated.
Machine translation is also ill-suited to more creative writing, such as marketing or literary texts, which is built upon flourishes, humour and metaphors. Losing sight of the author’s intention when using machine translation is out of the question. The art of the linguist is to play upon the delicate nuances of their own language. Pragmatically, your brand image and ability to reach your target audiences with finesse and respect are at stake.
Machine translation and confidentiality
Another obstacle is confidentiality: it is impossible to have complete transparency on the level of data protection that machine translation tools can guarantee. For confidential projects, sensitive information and specific know-how, we do not know exactly how well the content sent to the machine is protected. It enters a ‘great void’, and there is a real risk of a breach of confidentiality for content translated by the machine.
This issue leads many companies to choose not to submit their source material to machine translation and to check what protective rules translation companies follow. Given the large volume of content passing through their systems, translation companies often have a solid data protection system in place.
Cost: machine translation vs. human translation
Of course, issues of cost and turnaround time are a solid argument in favour of machine translation. The main thing is to know what you want to prioritise.
In most cases, the output from a post-edited machine translation that is further edited by two additional linguists is satisfactory. But keep in mind that even with high-quality texts, the source document was not fully translated by a linguist, so their personal touch, sensitivity and experience are absent. The machine will produce a text based on an algorithm which lacks the subtlety of a human brain. There is undeniably a loss there – the machine translation skews the style of the work, and the resulting translation is not necessarily what the linguist would otherwise have chosen.
The same can be said for copywriters who use ChatGPT for a first draft. Since they did not write it, the resulting text is very different to the one they would otherwise have written from scratch.
In terms of pricing, although machine translation is used to create the first version of the text, which is more or less half of the work, this does not mean the price will be halved, since linguists will still need to work on it in order to deliver a high-quality text.
For polished results, the first linguist post-edits the content by comparing source and target, checks that the translation is suitable and consistent, makes stylistic improvements and applies the required terminology. A second linguist will then edit the text by comparing the source and target texts, making sure nothing has been left out, that the instructions have been followed and that the meaning remains intact. Sometimes it is judicious to have a third linguist work on the text. In this case, they only edit the target text.
Post-editing is a fundamentally different task for a linguist. Some enjoy it, others not so much. This is understandable. But in the current context, it is difficult to avoid post-editing work altogether.
Conclusion
While the room for improvement in machine translation narrows as quality gets better over time, many challenges remain. Linguists’ roles are evolving but remain as vital as ever. The age of high-quality full machine translation remains in the distant future.
There may come a time when it becomes difficult to tell a human translation from a machine translation. The ability to distinguish them might even become a paid service. Recent controversies surrounding the use of ChatGPT in education have caused unease. Will we soon stop producing anything new and rely solely on AI? This would mean losing faith in what individuals bring to the table: their singular creativity, the expression of their unique personality, what ultimately makes us who we are and gives us our value.
What would become of opinionated content, intellectual flourishes, artistic endeavours, sparks of genius and unconditional freedom?
Added value always comes from people. In matters of both translation and writing, the role of artificial intelligence cannot cross the boundaries of the mind, which, for now at least, remains our birthright.