Application of Speech Recognition Technology on the Evaluation of English Pronunciation Teaching

Fanyu Wang

doi:10.25236/AJETS.030008

Frontiers in Educational Research, 2018, 1(2); doi: 10.25236/AJETS.030008.

Author(s)

Fanyu Wang

Corresponding Author:

Fanyu Wang

Affiliation(s)

Baotou Vocational & Technical College, Inner Mongolia, Baotou, 014035, China

Abstract

To improve the current English learning environment and teaching mode, and thereby the efficiency of spoken English learning, speech recognition technology is applied to the scoring of English pronunciation teaching. The feature score is divided into three domains: the pronunciation segment, the hyper articulation segment, and the perceptual domain. Speech recognition technology is used to identify the pronunciation features, and the recognition accuracy is used to derive the score. Measured by the correlation coefficient, the machine score synthesized from all three domains correlates more strongly with the expert score than the pronunciation segment score alone does, and also more strongly than the machine score synthesized from any two domains. The results indicate that the proposed scoring mechanism performs considerably better than the previous scoring mechanism, suggesting that it is helpful for the evaluation of English pronunciation teaching.
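The comparison described in the abstract can be illustrated with a minimal sketch. It assumes a Pearson correlation coefficient and a weighted average as the way domain scores are synthesized; the abstract specifies neither, so the function names `pearson` and `synthesize`, the weights, and all score values below are hypothetical placeholders and not the paper's method or data.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def synthesize(domain_scores: Sequence[Sequence[float]],
               weights: Sequence[float]) -> List[float]:
    """Weighted average of per-domain scores for each speaker.

    The weighted average is an assumed combination rule, chosen only
    for illustration.
    """
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, per_speaker)) / total
        for per_speaker in zip(*domain_scores)
    ]


if __name__ == "__main__":
    # Hypothetical scores for five speakers (placeholder values, not real data).
    segment = [62.0, 75.0, 81.0, 58.0, 90.0]
    hyper = [60.0, 70.0, 85.0, 65.0, 88.0]
    perceptual = [55.0, 78.0, 80.0, 60.0, 92.0]
    expert = [58.0, 74.0, 83.0, 61.0, 91.0]

    three_domain = synthesize([segment, hyper, perceptual], [1.0, 1.0, 1.0])
    print("segment only :", pearson(segment, expert))
    print("three domains:", pearson(three_domain, expert))
```

Under these assumptions, the same two calls can be repeated for each pair of domains to reproduce the kind of comparison the abstract reports between one-domain, two-domain, and three-domain machine scores.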

Keywords

Speech recognition technology; English pronunciation teaching; pronunciation section; scoring mechanism

Cite This Paper

Fanyu Wang, Application of Speech Recognition Technology on the Evaluation of English Pronunciation Teaching. Frontiers in Educational Research (2018) Vol. 1: 10-16. https://doi.org/10.25236/AJETS.030008.
