Human and Machine Translation: Advantages and Limitations

Abstract

This paper compares machine and human translations of a technical text from English into Brazilian Portuguese. Our main objective is to identify the major challenges that current machine translation (MT) systems must overcome in order to produce better translations. We also measure the efficiency of the machine by determining the incidence of its errors.

Introduction

Since the process of globalization began, there has been considerable investment in Machine Translation (MT) systems. In a globalized business world, the demand for translations has increased significantly, and companies require them in the shortest time possible. The only way to meet this demand has been to use new tools such as MT, which automatically translates a text from one language into another and therefore saves a great deal of time compared with human translation. It also avoids some human errors, such as misspellings or skipping a sentence or word during the translation process.

However, these automatic systems also have many limitations. They constantly make mistakes that would hardly ever occur in a human’s work, because the choices make no sense in the specific context of the text. The human translator is much more aware of context than the machine, although context-based MT systems are already being developed. In addition, errors in transforming sentence structure from one language into another are very common. The result is a text that does not seem “human” but rather unusual.

All of these limitations vary according to the language pair with which the MT software is working. When the languages are very different from each other, the errors are more critical. That is the case, for instance, when translating Arabic into English. Aberdeen (2010) studied the performance of MT on Arabic dialogues translated into English. A similar approach was adopted in the study by Bandyopadhyay (2005), who examined the status of MT software for translating English into Indian languages. Ågren (1997) also compared machine and human translations, using texts written in Finnish and translated into English.

Nevertheless, we found no studies that analyzed the performance of MT systems in translating from English into Brazilian Portuguese. That is the aim of the present study: to compare machine and human translations of the same text in order to discover the limitations of the system and how efficient it is.

This paper is organized as follows. The Methodology section describes the material (text) and the tool (MT system) selected for the translation process. The Results and Discussion section presents the significant errors made by the machine. Finally, the Conclusion evaluates the current state of MT for translating from English into Brazilian Portuguese.

Methodology

To investigate the limitations of MT for translations from English into Brazilian Portuguese, we selected three excerpts (sections) of technical text in English from the Quick Reference Guide for a notebook computer, together with the corresponding human-translated version in Brazilian Portuguese (see Appendices A and B for these excerpts). These three passages were chosen because they have more sentences per paragraph than the rest of the guide.

Technical texts are known to present fewer difficulties for the machine than literary ones because of their less complex sentence structure. Most of the sentences are instructions, which should not pose much of a challenge for the software. In addition, many translators work mostly with this kind of text, especially those who specialize in informatics. We therefore assumed it was better to start with this type of text and discover the machine’s basic problems first.

The MT system chosen to translate the excerpts was Google Translate, which is commonly used by internet users around the world. Because it is online software and is widely used, it has one of the richest databases among current MT systems. Any user can contribute by suggesting a correct translation. It is therefore probably one of the finest MT systems available. The excerpts were translated with this tool (see Appendix C for the MT version), and each sentence was then evaluated against the human-translated version, which served as the benchmark for a good-quality translation.

We searched for a) grammatical errors (classified as type 1); b) translation errors (misinterpretation, omission and inaccuracy of terms, classified as type 2); and c) inadequacies related to the writing style of Brazilian Portuguese (classified as type 3). The last category covers sentences that can be understood with some effort but are atypical. Each error was counted only once.

Results and Discussion

A total of 26 errors were counted across 19 sentences, an average of about 1.4 errors per sentence. We consider a reasonable error incidence to be at most 1 error per sentence. Hence, this figure shows that machine translation from English into Brazilian Portuguese still faces many challenges.
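
The arithmetic behind this average can be checked with a short Python sketch. The counts (26 errors, 19 sentences) and the limit of 1 error per sentence come from the text above; the function and variable names are illustrative assumptions and are not part of the study.

```python
def errors_per_sentence(total_errors: int, total_sentences: int) -> float:
    """Return the mean number of errors per sentence."""
    if total_sentences <= 0:
        raise ValueError("total_sentences must be positive")
    return total_errors / total_sentences


# Figures reported in the Results section.
TOTAL_ERRORS = 26
TOTAL_SENTENCES = 19
# Threshold the authors consider reasonable (1 error per sentence).
MAX_ACCEPTABLE_RATE = 1.0

rate = errors_per_sentence(TOTAL_ERRORS, TOTAL_SENTENCES)
exceeds_threshold = rate > MAX_ACCEPTABLE_RATE

print(f"{rate:.2f} errors per sentence; exceeds threshold: {exceeds_threshold}")
```

Since 26 / 19 is roughly 1.37, the observed rate is above the stated limit of 1 error per sentence.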

However, most of the errors in the excerpts were classified as type 3, which are not really errors but inadequacies of writing style. In fact, they represent half of the total incidence, as shown in the table in Figure 1.

Error type                        Incidence of errors   Percentage of errors
Type 1 – Grammatical errors       6                     23.1%
Type 2 – Translation errors       7                     26.9%
Type 3 – Inadequacies of style    13                    50.0%
Figure 1 – Incidence and percentage of errors
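
The percentages in Figure 1 follow directly from the counts. The sketch below recomputes them in Python; the counts are taken from the table, while the dictionary layout and the function name `percentage_breakdown` are assumptions made only for illustration.

```python
# Error counts per category, as reported in Figure 1.
error_counts = {
    "Type 1 - Grammatical errors": 6,
    "Type 2 - Translation errors": 7,
    "Type 3 - Inadequacies of style": 13,
}


def percentage_breakdown(counts: dict) -> dict:
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no errors to distribute")
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


shares = percentage_breakdown(error_counts)
for label, share in shares.items():
    print(f"{label}: {error_counts[label]} ({share}%)")
```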

These style inadequacies include unnecessary repetition that the machine does not eliminate, so that it writes short sentences using the same word three times, as in this sentence: “Quando você conecta o computador a uma tomada elétrica ou instala uma bateria enquanto o computador estiver conectado a uma tomada elétrica, o computador verifica a carga da bateria e temperatura”.
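
This kind of repetition can be made visible with a simple word count. The Python sketch below is only an illustration and not the procedure used in this study, where errors were identified by manual comparison. The function name `repeated_words` and both cut-offs (at least three occurrences, at least four letters, a crude way of skipping articles and prepositions) are assumptions. The two sentences are the MT output quoted above and its human-translated counterpart from Appendix B.

```python
import re
from collections import Counter


def repeated_words(sentence: str, min_count: int = 3, min_length: int = 4) -> dict:
    """Return the words of at least min_length letters that occur min_count times or more."""
    # \w is Unicode-aware for str patterns, so accented words stay whole.
    words = re.findall(r"\w+", sentence.lower())
    counts = Counter(w for w in words if len(w) >= min_length)
    return {w: n for w, n in counts.items() if n >= min_count}


mt_sentence = (
    "Quando você conecta o computador a uma tomada elétrica ou instala uma "
    "bateria enquanto o computador estiver conectado a uma tomada elétrica, "
    "o computador verifica a carga da bateria e temperatura"
)
human_sentence = (
    "Quando você conecta o computador a uma tomada elétrica ou instala uma "
    "bateria em um computador conectado a uma tomada elétrica, ele verifica "
    "a carga e a temperatura da bateria."
)

print(repeated_words(mt_sentence))
print(repeated_words(human_sentence))
```

With these cut-offs, only “computador” is flagged in the MT sentence, and nothing is flagged in the human version, where the third occurrence is replaced by the pronoun “ele”.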

Another example is the use of words that are not appropriate in the context. They have the right meaning but are not commonly used in that particular context, which makes the text seem unnatural, as in “a carga da bateria é de aproximadamente 90% empobrecido” versus “a carga da bateria estiver esgotada em aproximadamente 90%”. Although the meaning of the sentence can be understood, and some people consider these minor errors, they can make a text very unusual and often difficult to read.

There were seven translation errors, related to omission, inaccuracy or misinterpretation. Some words of the original were simply ignored and left untranslated by the machine. The word “largely” in the first sentence of the first excerpt, “The battery operating time is largely determined by…”, was not translated. The omission is not noticeable at first because the word only adds information that complements the meaning of the verb “determined”.

Nevertheless, other omissions made the text nonsensical. The machine omitted the verb “remains” twice in the first excerpt, which leaves the text incomprehensible. The system does recognize the word “remains” as “permanece”, but, probably because of its position in the sentence, the verb was not identified.

The cases of inaccuracy involve words whose meaning can be ambiguous, which makes the text less precise, a very undesirable characteristic in the translation of a reference guide. For instance, “low battery warning” was translated as “aviso de bateria fraca”, which is ambiguous because we do not know whether the battery’s charge capacity is impaired or the charge itself is low. In this case the battery is operating well but the charge is at a low level, so it would be better translated as “bateria com pouca carga”.

Among the grammatical mistakes, which were the least frequent, errors related to verbs had the highest incidence. We believe this is because of the complex structure of verbs in Brazilian Portuguese (BP). Some errors involve BP verb tense and others involve person and number (singular or plural). The machine had difficulty choosing suitable verb forms, as we can see in “Desligue o computador da tomada elétrica e permitem que o computador e a bateria para esfriar a temperatura ambiente.” (“Disconnect the computer from the electrical outlet and allow the computer and the battery to cool to room temperature.”)

Conclusion

Through the evaluation of these three excerpts we discovered a variety of errors made by the machine when translating a text from English into Brazilian Portuguese. However, the major challenge for the machine is not grammar but a much more subtle aspect: context awareness, which demands judgment of coherence and of writing style.

Because of these atypical mistakes, a machine translation is easily identified as the product of an automatic process. The errors involve words used in the wrong context, a kind of mistake that human translators hardly ever make. This is probably going to be the harder problem for computational linguists to deal with: how can a machine be taught to write a more “human-like” text and to take the implications of context into account?

Due to time constraints we evaluated only excerpts from technical texts, but it would also be important to evaluate the performance of the machine on literary texts and compare the difficulties it faces.

In conclusion, there is much more to be examined in MT, and this automatic process has to be improved in many respects before it can produce a good-quality translation. Until that is possible, the human translator remains fundamental to correct the machine’s mistakes and improve the text. For now, MT is just a tool to help the translator.

References

Aberdeen, J. et al. (2010). Evaluation of machine translation errors in English and Iraqi Arabic. LREC 2010: proceedings of the seventh international conference on Language Resources and Evaluation, 17-23 May 2010, Valletta, Malta. Available at: http://www.mt-archive.info/LREC-2010-Condon.pdf

Ågren, M. (1997). The strong and the weak points of texts translated by machine in comparison with texts translated by humans. Unpublished. Available at: http://www.mt-archive.info/Agren-1997.pdf

Bandyopadhyay, S. (2005). Use of machine translation in India: current status. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit. Available at: http://www.mt-archive.info/MTS-2005-Naskar-2.pdf

Hutchins, W. J. and Somers, H. L. (1992). An Introduction to Machine Translation. London: Academic Press.

Appendix A
Quick Reference Guide (English Version)

Health Gauge

The battery operating time is largely determined by the number of times it is charged. After hundreds of charge and discharge cycles, batteries lose some charge capacity, or battery health. To check the battery health, press and hold the status button on the battery charge gauge for at least 3 seconds. If no lights appear, the battery is in good condition, and more than 80 percent of its original charge capacity remains. Each light represents incremental degradation. If five lights appear, less than 60 percent of the charge capacity remains, and you should consider replacing the battery. See Specifications for more information about the battery operating time.

Low Battery Warning

A low-battery warning occurs when the battery charge is approximately 90 percent depleted. The computer beeps once, indicating that minimal battery operating time remains. During that time, the speaker beeps periodically. If two batteries are installed, the low-battery warning means that the combined charge of both batteries is approximately 90 percent depleted. The computer enters hibernate mode when the battery charge is at a critically low level. For more information about low-battery alarms, see “configuring Power Management Settings” in User’s Guide.

Charging the Battery

When you connect the computer to an electrical outlet or install a battery while the computer is connected to an electrical outlet, the computer checks the battery charge and temperature. If necessary, the AC adapter then charges the battery and maintains the battery charge. If the battery is hot from being used in your computer or being in a hot environment, the battery may not charge when you connect the computer to an electrical outlet. The battery is too hot to start charging if the light flashes alternately green and orange. Disconnect the computer from the electrical outlet and allow the computer and the battery to cool to room temperature. Then connect the computer to an electrical outlet to continue charging the battery.

Appendix B
Guia de Referência Rápida (BP Version – Human Translated)

Indicador de saúde
O tempo de operação da bateria é determinado, em grande parte, pelo número de vezes em que ela é carregada. Após centenas de ciclos de carga e descarga, as baterias perdem um pouco de capacidade de carga ou da saúde. Para verificar a saúde da bateria, pressione e mantenha pressionado o botão de status no indicador de carga de bateria durante pelo menos três segundos. Se nenhuma luz acender, a bateria está em boas condições, e restam mais de 80% da capacidade de carga original. Cada luz representa uma degradação incremental. Se aparecerem cinco luzes, menos 60% da capacidade de carga estará disponível e você deverá começar a pensar em trocar a bateria. Consulte “Especificações” no Guia do usuário para obter mais informações sobre o tempo de operação da bateria.

Advertência sobre bateria com pouca carga
Uma advertência de bateria com carga baixa ocorre quando a carga da bateria estiver esgotada em aproximadamente 90%. O computador emitirá um bipe uma vez, indicando que resta um tempo mínimo de operação. Durante esse período, o alto falante emitirá bipes periodicamente. Se houver duas baterias instaladas, a advertência de bateria com pouca carga indicara que a carga combinada das duas baterias está esgotada em cerca de 90%. O computador entrará no modo de hibernação quando a carga da bateria atingir um nível crítico. Para obter mais informações sobre alarmes de bateria com pouca carga, consulte “Como configurar parâmetros de gerenciamento de energia” no Guia do usuário.

Como carregar a bateria
Quando você conecta o computador a uma tomada elétrica ou instala uma bateria em um computador conectado a uma tomada elétrica, ele verifica a carga e a temperatura da bateria. Se necessário, o adaptador CA carrega a bateria e mantém sua carga.
Se a bateria estiver quente devido ao uso no computador ou porque a temperatura ambiente está elevada, talvez ela não seja carregada quando o computador for conectado a uma tomada elétrica.
A bateria estará quente demais para começar a carregar se a luz piscar alternadamente em verde e laranja. Desconecte o computador da tomada elétrica e deixe que a bateria e o computador esfriem até atingirem a temperatura ambiente. Em seguida, conecte o computador à tomada elétrica para continuar a carregar a bateria.

Appendix C
Guia de Referência Rápida (MT Version)

Indicador de saúde

O tempo de operação da bateria é determinada (largely) pelo número de vezes que ele é cobrado. Após centenas de ciclos de carga e descarga, as baterias perdem um pouco da capacidade de carga, ou da saúde. Para verificar a saúde da bateria, pressione e segure o botão de status no indicador de carga da bateria por pelo menos 3 segundos. Se nenhuma luz acender, a bateria está em bom estado, e mais de 80 por cento da sua capacidade de carga original. Cada luz representa uma degradação incremental. Se aparecerem cinco luzes, menos de 60 por cento da capacidade de carga, e você deve pensar em trocar a bateria. Consulte Especificações para obter mais informações sobre o tempo de funcionamento da bateria.

Aviso de bateria fraca

Um aviso de bateria fraca ocorre quando a carga da bateria é de aproximadamente 90 por cento empobrecido. O computador emite um sinal sonoro, indicando que o tempo mínimo de operação da bateria. Durante esse tempo, o alto-falante bipa periodicamente. Se houver duas baterias instaladas, a advertência de bateria fraca significa que a carga combinada das duas baterias é de aproximadamente 90 por cento empobrecido. O computador entra no modo de hibernação quando a carga da bateria está em um nível criticamente baixo. Para obter mais informações sobre os alarmes de bateria fraca, consulte "Configuração de gerenciamento de energia" no Guia do Usuário.

Como carregar a bateria

Quando você conecta o computador a uma tomada elétrica ou instala uma bateria enquanto o computador estiver conectado a uma tomada elétrica, o computador verifica a carga da bateria e temperatura. Se necessário, o adaptador CA carrega a bateria e mantém a carga da bateria.
Se a bateria estiver quente devido ao uso no seu computador ou estar em um ambiente quente, a bateria não pode cobrar quando você ligar o computador a uma tomada elétrica.

A bateria estará quente demais para começar a carregar se a luz estiver piscando alternadamente entre verde e laranja. Desligue o computador da tomada elétrica e permitem que o computador e a bateria para esfriar a temperatura ambiente. Em seguida, conecte o computador a uma tomada elétrica para continuar a carregar a bateria.