
Academic Journal of Humanities & Social Sciences, 2023, 6(21); doi: 10.25236/AJHSS.2023.062102.

An Analysis of the Factors of Mistranslation in Statistical Machine Translation—Taking Prose Text Translation as Example


Yon Jee Kwun (Yang Yikun)1, Yon Jee Han (Yang Yihan)2

Corresponding Author:
Yon Jee Kwun (Yang Yikun)

1Foreign Language School, Gannan Normal University, Ganzhou, China

2School of Event and Communication, Suibe, Shanghai, China


This article analyzes the factors behind mistranslation in Statistical Machine Translation (SMT), taking prose translation as an example. It first reviews the basics of SMT and prose translation, then discusses the common types of mistranslation in SMT: lexical, syntactic, semantic, and pragmatic errors. Next, it identifies possible contributing factors, including the quality of the training data, the choice of translation model, and the limitations of machine learning algorithms. Finally, it proposes possible ways to reduce mistranslation in SMT, such as improving the quality of training data, selecting more appropriate translation models, and exploring new mistranslation detection techniques. The analysis offers insight into the challenges and opportunities in SMT and may help improve the accuracy and quality of machine translation.


Statistical Machine Translation; Prose Text Translation; Mistranslation

Cite This Paper

Yon Jee Kwun (Yang Yikun), Yon Jee Han (Yang Yihan). An Analysis of the Factors of Mistranslation in Statistical Machine Translation—Taking Prose Text Translation as Example. Academic Journal of Humanities & Social Sciences (2023) Vol. 6, Issue 21: 6-12. https://doi.org/10.25236/AJHSS.2023.062102.


[1] Cohen, K. Bretonnel, and Andrew Dolbey. “Foundations of Statistical Natural Language Processing (Review).” Language, vol. 78, no. 3, Jan. 2002, pp. 599–599, doi:https://doi.org/10.1353/lan.2002.0150. 

[2] Artetxe, Mikel, et al. Unsupervised Statistical Machine Translation. Sept. 2018, doi:https://doi.org/10.18653/v1/d18-1399.

[3] Niehues, Jan, and Eunah Cho. Exploiting Linguistic Resources for Neural Machine Translation Using Multi-Task Learning. Aug. 2017, doi:https://doi.org/10.18653/v1/w17-4708.

[4] Peris, Álvaro, et al. “Interactive Neural Machine Translation.” Computer Speech & Language, vol. 45, Sept. 2017, pp. 201–20, doi:https://doi.org/10.1016/j.csl.2016.12.003.

[5] Toral, Antonio, and Andy Way. “Machine-Assisted Translation of Literary Text.” Translation Spaces, vol. 4, no. 2, Jan. 2015, pp. 240–67, doi:https://doi.org/10.1075/ts.4.2.04tor.

[6] Yang Haoou. “24 Styles of Chinese Prose.” Sichuan People's Publishing House, Apr. 2023, pp. 3–5.

[7] Karageorgakis, P., et al. Towards Incorporating Language Morphology into Statistical Machine Translation Systems. Jan. 2005, doi:https://doi.org/10.1109/asru.2005.1566533.

[8] Khalilov, Maxim, and José A. R. Fonollosa. “Syntax-Based Reordering for Statistical Machine Translation.” Computer Speech & Language, vol. 25, no. 4, Oct. 2011, pp. 761–88, doi:https://doi.org/10.1016/j.csl.2011.01.001.

[9] Wong, Yuk Shan, and Raymond J. Mooney. Learning for Semantic Parsing with Statistical Machine Translation. June 2006, doi:https://doi.org/10.3115/1220835.1220891.