Constructing an ESP Bilingual Parallel Corpus Based on AntConc: Application and Assessment

Xi Lu¹, Kai Jiang²

doi:10.25236/FER.2021.040910

Frontiers in Educational Research, 2021, 4(9); doi: 10.25236/FER.2021.040910.

Author(s)

Xi Lu¹, Kai Jiang²

Corresponding Author:

Kai Jiang

Affiliation(s)

¹Department of Common Required Courses, Hubei Institute of Fine Arts, Wuhan, China

²College of Foreign Languages, Huazhong Agricultural University, Wuhan, China

Abstract

AntConc is a free, portable (no-installation) corpus analysis tool developed by Laurence Anthony of Waseda University, Japan, with three main functions: concordance, word list, and keyword list. The authors first describe the principles and process of constructing a web-based bilingual parallel corpus for translation studies and ESP research based on AntConc. The construction principles are strict linguistic standards, corpus balance, and appropriate size. Following these principles, language data are collected, processed, and entered. The CLAWS part-of-speech tagger and Wmatrix are then used for text markup, annotation, high-frequency vocabulary extraction, and corpus distribution balancing. Finally, texts, paragraphs, and sentences are aligned with the corpus tool ParaConc. After construction, the authors use the retrieval software WordSmith Tools 4.0 and the statistical software SPSS 11.5 to assess the feasibility and effectiveness of the corpus in a six-month experiment involving 16 translators, 25 teachers, and 285 students.
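As an illustration of two of the functions named above, the following is a minimal Python sketch of a word list and a keyword-in-context (KWIC) concordance. It is not AntConc's implementation and is not taken from the paper; the function names `tokenize`, `word_list`, and `concordance`, the simple regular-expression tokenizer, and the sample sentence are all assumptions made for the example. Keyword (keyness) analysis, which requires a reference corpus, is left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical, English-only)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()


def concordance(tokens, node, width=3):
    """Return (left context, node, right context) for every occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text, used only to exercise the functions.
sample = (
    "The corpus is aligned at sentence level. "
    "Each corpus text is tagged before the corpus is searched."
)
tokens = tokenize(sample)

for word, freq in word_list(tokens)[:3]:
    print(word, freq)

for left, node, right in concordance(tokens, "corpus", width=2):
    print(f"{left:>12} [{node}] {right}")
```

A real ESP corpus would need a tokenizer suited to each language (Chinese text in particular requires word segmentation), which this sketch does not attempt.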

Keywords

Computer-aided translation, corpus, AntConc, Wmatrix, computational linguistics

Cite This Paper

Xi Lu, Kai Jiang. Constructing an ESP Bilingual Parallel Corpus Based on AntConc: Application and Assessment. Frontiers in Educational Research (2021) Vol. 4, Issue 9: 53-58. https://doi.org/10.25236/FER.2021.040910.
