Title: Sentence Splitting for Vietnamese-English Machine Translation
Abstract: Translation quality is often disappointing when a phrase-based machine translation system deals with long sentences. Because of the discrepancy in syntactic structure between the two languages, the translation output will not preserve the same word order as the source. When a sentence is long, it should be partitioned into several clauses, and word reordering in the translation should be done within clauses, not between clauses. In this paper, a rule-based technique is proposed to split long Vietnamese sentences based on linguistic information. We use the splitting boundaries when translating sentences, with two types of constraints: wall and zone. This method is useful for preserving word order and improving translation quality. We describe experiments on translation from Vietnamese to English, showing an improvement in BLEU and NIST scores.
Publication Year: 2012
Publication Date: 2012-08-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 14