Globalisation continues to spread and the demand for translators is higher than ever. With it comes pressure to increase translators’ productivity, which means that more texts are expected to be translated in a shorter period of time. Automated translation, or machine translation (MT), is therefore increasingly used for translating different types of texts. However, not all text genres are suitable for MT systems, and some cannot be translated without major mistakes. This thesis will analyse which text genres could potentially be translated successfully by MT in the future. The goal is not to compare human and machine translation, but rather to analyse the quality of raw MT output. To discover how far MT has progressed in terms of quality, texts from three different genres (technical, legal and literary), machine-translated from English into Slovak, will be analysed. The main focus of this thesis will be on Google Translate (GT) in relation to the English – Slovak language pair. Slovak, a less widely spoken language with around 5.5 million native speakers, is my mother tongue, and I would like to contribute to the field of MT as little research has been conducted on machine translation from English into Slovak. However, the results may differ for languages with more native speakers, such as Spanish, French or German, since more resources are invested in MT for these languages.
Through this research, I aim to discover whether the development of machine translation technology might have an impact on the translation industry; in which sectors or genres it might have the greatest impact in the future; and whether translators will truly turn into post editors, as Pym (2013) and Gouadec (2007) predicted.
The first chapter, Literature Review, will analyse the research on MT, GT, translators as post editors and MT of various text genres. However, the technology is constantly evolving and many of the relevant articles found were written three or four years ago. The Methodology chapter will discuss the method that will be adopted to discover which text genres are currently the most suitable for GT, and what makes them the most suitable. The main body, Data Gathering and Data Analysis, will analyse the quality of selected text types translated by GT by applying the method described in the Methodology chapter. This chapter will also comment on errors found in machine-translated texts and discuss the overall results of the text analyses. The source texts (ST), Slovak translations (translated by GT) and their back translations (BT) will be included in the Appendices.
Since MT has improved and developed significantly in recent years, I believe that this topic and the overall findings might interest translators who ask themselves how MT may affect their role.
1. Literature Review
The first chapter of this thesis is a Literature Review, which is divided into several subsections. In this chapter, the research previously conducted by other researchers on MT, its development and the translation quality of various text genres will be commented on and further examined. The main focus will be on GT in relation to the English – Slovak language pair. The research reviewed on MT, its development and translation quality shows that translation technology has improved immensely.
1.1 MT Development/Improvement
Machine translation, from a technological perspective, dates back to the 1940s; since then it has been both criticised and praised, and has undergone immense development (Hutchins, 2000). It has evolved from simple mechanical dictionaries into more advanced neural translation systems such as Google Translate, which will be discussed in the following section. MT usage accelerated in the 1990s, since many agencies and large companies needed to translate their documents into different languages as time-efficiently as possible (Hutchins, 2000). The more popular MT became over the years, the more money was invested in making it available to the public.
1.1.1 Google Translate
In the last few years MT has developed considerably and the public has come to realise its significant benefits. Google Translate, launched in 2006 as a multilingual statistical machine translation system, is one example of an automated translation system; it is popular and used globally mainly because it is available to the public for free (Le and Schuster, 2016).
‘Google translates more words in one minute than all human translators in one year’, technical communication expert Stefan Gentz stated in a presentation discussed in an online article by the freelance translator Christelle Maignan (2015, my italics). It is worth noting, however, that he carefully chose the word “words” and not “sentences”. Would the statement be true if we changed “words” to “sentences”? Until 2016, GT was translating words and phrases rather than whole sentences. In 2016, however, it became a neural machine translation system (NMTS) (Eadiccico, 2016). This is regarded as a considerable improvement since GT no longer needs to break a sentence into several chunks to be able to translate it. Instead, it takes a sentence as one unit and renders it as a whole (Le and Schuster, 2016). In 2016, Google’s NMTS was introduced for eight languages; as Holič (2017) states, another eight languages, Slovak among them, have been using NMTS since April 2017. The GT product lead Barak Turovsky (2016) explains that the aim is to extend NMTS to all 103 languages that GT currently translates from and into. Although Turovsky (2016) states that NMTS now produces human-like translations with proper grammar (see Figure 1), Peris, Domingo and Casacuberta (2017), who have conducted research on NMTSs, argue that such systems still make many errors and that human post-editing is still required.
Figure 1: The change in translation – Statistical MT vs. Neural MT
(Turovsky, 2016)
This leads to another aspect of MT: quality. Quality can be measured by the number of errors encountered in the MT output; however, one can argue that it can also be measured by the time that needs to be spent on post-editing. As the Translation Model vs. Translation Quality chart (see Figure 2) shows, in some language combinations the quality of MT is relatively close to that of human translation.
Figure 2: ‘Data from side-by-side evaluations, where human raters compare the quality of translations for a given source sentence. Scores range from 0 to 6, with 0 meaning “completely nonsense translation”, and 6 meaning “perfect translation”.’
(Le and Schuster, 2016)
The language service provider Niki’s Int’l Ltd. (2016) agrees with the Model, stating that ‘there is actually a blurred line linking human and machine translations as MT systems are built directly from human translations’. This is because GT users can contribute to its translation memory. Pym (2013, p. 488) likewise holds that humans build MT systems and describes a ‘virtuous circle’ with two features that make MT better every day. Firstly, he explains that ‘the more we use them (well), the better they get’; secondly, the more MT systems are online and freely accessible, the more people can use them.
As Pym later argues, MT users who are not informed about how to use MT properly often contribute translations that they believe to be correct, but that have been neither edited nor proofread, to the translation memory. This leads to ‘recycling errors that are fed back into the very database that statistics operate’, and therefore the virtuous circle becomes ‘a vicious one’ (Pym, 2013, p. 489). The increasing number of translations produced by technology brings us to a new job role: the post editor, who is no longer a translator but rather a competent technical communicator or proofreader (Sverak, 2014).
1.2 Translator as an MT Post Editor – An Impact of MT on Translation Jobs
Reading through various online articles, blogs and journals, it becomes obvious that opinions on MT and its impact on translation jobs differ considerably. Pym (2013) is convinced that MT will have a great impact on the translation industry and may eventually replace translators, as it was intended for this purpose. His view is rather general and becomes open to discussion once different text genres are taken into consideration. Others, such as the unnamed author of an Economist article (2015), think that ‘technology, far from replacing humans, is instead a tool that helps them keep up with a surging demand for high quality translation’. Similarly, Niki’s Int’l Ltd. (2016) refers to a ‘marriage of man and machine to produce quality and consistently high volumes’, which means that MT and human translators can work alongside and help each other. Niki’s positive view of the future of MT is supported by the custom MT platform Kantan MT (2014), which states that MT post editors work at a rate of 7,000 words per day, 3.5 times the 2,000 words per day normally achieved by human translators who translate from scratch. This suggests that productivity increases when technology and human intervention are combined.
Based on the above, translation productivity has already increased through the use of MT systems. However, there have always been concerns about which types of texts are suitable for MT. Drawing on previous research, the following section will examine which text genres have been machine-translated the most accurately.
1.3 MT Output
1.3.1 MT of Different Text Genres
English, like other languages, has many words with multiple meanings, as well as many synonyms and words or phrases with a hidden meaning or a meaning above the sentence level. Although NMTS makes translations more understandable and grammatically better, it does not work equally well for all types of texts. Awareness of which text types are suitable for MT systems is important in order to know what quality of translation should be anticipated.
In order to improve the quality of MT, the MT output first needs to be evaluated. The case study carried out by Salimi (2014) analyses, evaluates and compares the accuracy of GT for fictional and non-fictional texts translated from English into Swedish. He compared translations of several non-fictional texts, such as medical, legal or technical texts, with those of fictional texts, such as children’s literature, crime or mystery, using BLEU, an automated evaluation metric for MT (Salimi, 2014). The findings show that although GT had problems with longer sentences in non-fictional texts, their translations were still more accurate than the literary translations. As Salimi (2014) later explains, the problem of MT lies in the similarity or dissimilarity of the texts: the more similar the texts are in language use or sentence structure, the more accurate the translations will be. Although literary texts normally have shorter sentences, these sentences often carry a meaning above the sentence level that MT systems are, so far, unable to translate. Further mistranslations might be found in jokes, poems, children’s rhymes or idioms; Volk (1998) showed long ago in his research on the automatic translation of idioms that MT systems are not suitable for translating non-literal meaning. With regard to children’s rhymes, Andy Martin (2018), the author of ‘Reacher Said Nothing: Lee Child and the Making of Make Me’ and a teacher of translation at the University of Cambridge, agrees with Volk (1998) and gives several reasons why GT will never be able to translate literary texts correctly. He gives children’s rhymes as an example, since Google cannot recognise that the ST is a rhyme and that the translation should also use rhyming monosyllables.
Real translators need to live the translation, not just translate it; they need to become the author of the book in order to feel the person, which will then be reflected in the translation as well. This is what GT lacks the most: its role is to translate, and therefore it will never be able to feel the happiness or sadness hidden in literary texts (Martin, 2018).
As described above, not all of the text types are suitable for MT. Salimi (2014) concluded that the most accurate translations were to be found in legal texts. The question then arises as to whether legal translators will eventually be replaced by technology due to the ever-increasing sophistication of NMTS.
1.3.2 MT Evaluation and Analysis
As mentioned above, MT quality can be measured by automatic evaluation; however, there is also manual (human) evaluation, which will be used to measure the quality of the English-Slovak translations and will be described in more detail in the following chapter. Both manual and automatic evaluation have their advantages and disadvantages. Compared to manual evaluation, automatic evaluation saves cost and time; however, it is limited as it is not precise enough (Zaretskaya, Pastor and Seghiri, 2016).
Llitjós et al. (2005) proposed an error classification scheme, part of which will be used when analysing errors in the English – Slovak translations. A simplified version of this scheme can be seen below (see Figure 3). The scheme consists of five categories and several subcategories and was proposed for English to Spanish translations. Since Slovak, a West Slavic language, differs from Spanish, a Western Romance language, several other evaluation categories will be introduced.
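To illustrate what an automatic metric such as BLEU, mentioned in Section 1.3.1, actually computes, the following is a minimal sketch only. It assumes a single reference translation, whitespace tokenisation, n-grams up to length two and no smoothing; the function names and the example sentences are invented for illustration and are not taken from Salimi (2014) or from any MT toolkit.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU against one reference (illustrative only)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalise candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


# Invented example: a hypothetical MT output and a hypothetical human reference.
mt_output = "the cat sat on the mat"
human_reference = "the cat is on the mat"
score = simple_bleu(mt_output, human_reference)
print(round(score, 3))
```

The sketch shows why such metrics are fast but imprecise: the score reflects only surface overlap with the reference, so a correct translation that uses different wording, or a Slovak sentence with a different but valid word order, would be penalised.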
Figure 3: Error analysis scheme for MT
(Llitjós et al., 2005)
1.3.3 English vs. Slovak
The Slovak language, in contrast to English, has a more complex morphological system, a more flexible word order and a different sentence structure. Another difference is the absence of articles: Slovak sometimes uses “one” instead of “a/an” and “this/that/these/those” instead of “the”.
All of the above might make MT output from English into this minor, less widely spoken language less understandable or lead to several significant errors, so that more post-editing is required. A review of existing analyses of MT output for this language combination shows that English – Slovak translations of different text genres have not been analysed from an MT perspective, which identifies an area in need of further research. Some general research has been conducted in the past on MT output quality for this language combination; however, at that time NMTS had not yet been introduced for Slovak. The next chapter will describe the chosen texts and discuss the method which will be used to analyse and evaluate three different text genres translated by GT from English into Slovak.
2. Methodology
The aims of my research project are to discover which text genres are the most suitable for machine translation from English into Slovak using GT, and to measure the translation quality of the MT output for three different text genres. Ultimately, depending on the findings, I aim to discover which types of translators might potentially be replaced by MT. The goal is not to compare human and machine translation but rather to analyse the quality of raw MT output. In this way, I would like to contribute to the field of MT, as little research has been conducted on machine translation from English into Slovak.
This chapter will justify the research and its relevance, describe the texts that were chosen, outline the possible limitations of the study, and discuss the method that has been adopted to discover which text genres are the most suitable for GT and what makes them the most suitable.
2.1 Qualitative Corpus-Based Research
A corpus-based study was chosen as the method to underpin my research. This approach is suitable because I am undertaking linguistic research which involves the analysis of multilingual corpus data, that is, different texts in multiple languages, selected according to the specific criteria described below. Furthermore, a qualitative method was employed in order to carry out an in-depth text analysis. As Silverman (2014) points out, qualitative research is more flexible in terms of interpreting errors or mistakes found in translation. This type of research is also based on grounded theory, which is theory developed from systematic research, and on subjective hypotheses generated from the analysis. This approach suits my study well, as I will be making subjective assumptions and hypotheses about which types of texts are the most suitable for MT use. As far as time is concerned, the qualitative method was considered a more time-effective way of addressing the research objectives and questions than a quantitative one. However, I believe that although a quantitative method would be more time-consuming, as it would involve an analysis of many texts, its results would be more valid.
2.2 Selected Text Genres
From a number of different text genres, the choice was limited to three main text types: manuals/technical texts, legal/EU texts and literary texts. All the selected texts were taken from publicly available websites, and therefore no ethical approval was needed. These particular texts were chosen because each of them has different features, such as sentence structure, register or style, and thus presents different translation challenges. The next step was to decide on the length of the chosen texts. The texts should be long enough to comprise a broad range of potential translation difficulties or errors, but not too long, as this is small-scale research; the length of the selected texts was therefore limited to 200-300 words (Calude, 2003). The following sub-chapters will describe each of the selected text genres in more detail.
2.2.1 Technical Texts/Manuals
One of the features of technical texts is the presence of specialised terminology, which might be considered a translation challenge. The technical genre covers a wide range of specialised texts, and a translator or reader needs a good knowledge of the specialised area to be able to translate or understand them correctly.
The first example of a technical text is a 209-word extract from the camera instruction manual for the Canon EOS 60D. The second text, Carbon fiber raises consumer performance index: Music to the ears – and more, is a 275-word extract from an online article about carbon fibre and its application in audio equipment. These two extracts were selected to see how GT handles sentence length, sentence structure, style, the rendering of the message and the specific terminology present in the extracts.
2.2.2 Legal/EU Texts
As the second text type, legal/EU texts were chosen. On the one hand, GT uses transcripts and other documents that originate in the European Parliament and, for this reason, it might be assumed that translations of this text type will be the most accurate. On the other hand, sentences in legal texts are of considerable length and contain many specialised legal terms. This might pose a translation problem, since the long sentences must first be decoded and then translated into Slovak using a different sentence structure. As the first sample of legal texts, Presentation of the Court European Reports was chosen; the extract to be translated has 266 words. The second selected text is a 237-word extract from a speech by Lord Hoffman, a member of the Appellate Committee of the House of Lords. These texts, taken from publicly available websites, were chosen because they differ from each other, yet both contain long sentences and specialised legal terminology.
2.2.3 Literary Texts
The last text genre to be machine-translated and evaluated is the literary text. Literary texts differ from the two genres described above. In most instances, they do not contain any specialised terminology, and their register may include both formal and colloquial language. Although literary texts might appear easy to translate using GT, they have other features which might make them challenging. As Hussain (2017, p.79) states, it is impossible to translate foreign features, such as slang, hidden meaning, humour, fixed expressions or cultural elements, without using ‘violence’. This means that translators often need to add, omit or change words or sentences in order to make the text understandable for the target audience and to render the message. As the first sample of literary texts, the mystery thriller novel Origin by Dan Brown was chosen. Dan Brown sometimes uses sophisticated language in his thriller novels, and the sample was chosen to see how GT deals with sophisticated language and cultural information. It was taken from the first chapter of the book and has 211 words. The book has already been translated into Slovak, and therefore the MT output can also be compared with the existing Slovak version. The second sample, which contains many culture-bound elements, is a 215-word extract from the novel Miss Garnet’s Angel by Salley Vickers, about a former teacher visiting a basilica in Venice.
2.3 Error Analysis
With the text types identified and the specific texts chosen, this section explains how the data will be analysed. Manual (human) evaluation will be used to measure the quality of the English-Slovak translations. The texts will be translated using GT and the raw output will be saved so that it can be examined methodically later on. Once all the texts have been translated, the errors will be identified and divided into two groups. The first group will include major errors, where the meaning of the text is changed. The second group will contain errors which do not necessarily change the message but make the sentence hard to understand, for example through incorrect word order or terminology. As mentioned in the Literature Review, the error analysis scheme which will be applied to the English-Slovak translations of the selected texts was inspired by the one introduced by Llitjós et al. (2005). Several of the main errors encountered in each text will be commented on and discussed in the following chapter, Data Gathering and Data Analysis. In some examples, a correct translation will be provided to make the error more comprehensible to non-Slovak speakers. The overall number of errors for each text will be stated in Table 1. The STs, the Slovak translations (produced by GT) and their back translations will be included in the Appendices. After all the texts have been analysed and all the errors identified, the results will be discussed and conclusions drawn according to the findings.
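The two-group error count described above could be recorded and totalled along the following lines. This is a sketch under stated assumptions rather than the procedure actually used in this thesis: the record fields, the text labels, the category names and all of the entries are hypothetical placeholders, not findings.

```python
from collections import Counter
from dataclasses import dataclass

# The two groups described in this section.
SEVERITIES = ("major", "minor")


@dataclass(frozen=True)
class ErrorRecord:
    text_id: str   # hypothetical label for a sample text, e.g. "technical_1"
    severity: str  # "major": meaning changed; "minor": harder to understand
    category: str  # hypothetical category label, e.g. "word order"


def tally_errors(records):
    """Count major and minor errors for each text."""
    counts = {}
    for record in records:
        if record.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {record.severity!r}")
        counts.setdefault(record.text_id, Counter())[record.severity] += 1
    return counts


# Placeholder entries for illustration only; these are not real results.
records = [
    ErrorRecord("technical_1", "major", "wrong sense"),
    ErrorRecord("technical_1", "minor", "word order"),
    ErrorRecord("literary_1", "major", "idiom"),
    ErrorRecord("literary_1", "major", "wrong sense"),
]

summary = tally_errors(records)
for text_id in sorted(summary):
    major = summary[text_id]["major"]
    minor = summary[text_id]["minor"]
    print(f"{text_id}: major={major}, minor={minor}, total={major + minor}")
```

Keeping one record per error, rather than only a running total, would allow the same data to be regrouped later by category or by genre.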
2.4 Anticipated Limitations
Before discussing the results, the anticipated limitations of my research project should be stated. Since this is a small-scale study analysing only three text types, with two texts for each type, to discover their suitability for MT use, the limited validity of the results needs to be borne in mind when final conclusions are drawn. The research might also be limited by the types of texts that were chosen, in terms of their semantic and pragmatic levels. The more pragmatic features a text has, such as presupposition, conversational implicature, context dependence and other linguistic signs, the more difficult it might be for GT to translate it. To manage the limitations, I will try to find at least two texts of each type that address similar topics in order to minimise variables.
This chapter described the selected text types and discussed the method that will be used to assess the quality of machine-translated English-Slovak texts. The following chapter, Data Gathering and Data Analysis, will analyse the quality of the selected text types translated by GT by applying the method described above. Based on the analysis, the findings of the research project, including the suitability of each text type for MT, will be further discussed and conclusions will be drawn.