Title: Increasing SMT and NMT Performance by Corpus Extension with Free Online Machine Translation Services
Abstract: In machine translation, parallel corpora for the source-target language pair are essential to improving translation performance. However, the existing parallel corpora for low-resource languages are not sufficient to improve translation quality. In this paper, we explore the role of corpus extension using three freely available online machine translation services: "Google Translate", "SYSTRAN Translate" and "Yandex Translate", for the English-Thai language pair. We compare three statistical and neural machine translation performances on the original ASEAN-MT corpus and on its extended versions, which double the size of the original ASEAN-MT. The results show that, for SMT models, the extended Thai corpus improves th-en translation performance by up to 2.6%, and the extended English corpus significantly improves en-th translation performance by up to 4.2%. For the NMT model, the extended Thai corpus improves translation performance by up to 5.5%.
Publication Year: 2020
Publication Date: 2020-11-04
Language: en
Type: article
Indexed In: Crossref
Access and Citation
Cited By Count: 1