Academic Journal of Computing & Information Science, 2023, 6(6); doi: 10.25236/AJCIS.2023.060602.

Chinese Named Entity Recognition Based on Multi-Feature Fusion FLAT Model in the Medical Field

Author(s)

Ning Wang, Lin Ni

Corresponding Author

Lin Ni

Affiliation(s)

University of Science and Technology of China, Hefei, 230027, Anhui, China

Abstract

The syntactic complexity and specialized vocabulary of Chinese electronic medical records make it difficult for named entity recognition models to identify medical entities accurately. To extract complex medical vocabulary from electronic medical records more precisely, this paper proposes a multi-feature named entity recognition model that addresses the insufficient extraction of internal features in the FLAT model. First, the radical and pinyin features of Chinese characters are extracted to enrich their semantic information. These features are then combined with the word embeddings extracted by the FLAT-lattice method. Finally, the fused features and position encoding are fed into the Transformer encoder, and the encoded output is decoded with a CRF. Experimental results show that the proposed model outperforms many existing algorithms on the CCL2021 dataset, with an F1 score of up to 91.75%.
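
The first two steps of this pipeline can be illustrated with a minimal sketch. It assumes that fusion is done by simple concatenation (the abstract does not state how the features are combined) and uses tiny hand-written lookup tables in place of real radical and pinyin dictionaries. The names `RADICALS`, `PINYIN`, `LEXICON`, `build_flat_lattice`, and `fuse` are hypothetical, and the vectors are random stand-ins rather than trained embeddings. The head/tail span layout follows the flat-lattice idea of FLAT; the Transformer encoder and CRF decoder are omitted.

```python
import random

# Hypothetical toy lookup tables; a real system would use full dictionaries.
RADICALS = {"糖": "米", "尿": "尸", "病": "疒"}
PINYIN = {"糖": "tang", "尿": "niao", "病": "bing"}
LEXICON = {"糖尿", "糖尿病"}  # toy lexicon of known words

DIM = 4  # embedding size per feature, chosen arbitrarily for illustration


def embed(key, table, rng):
    """Return a fixed random vector for key, creating it on first use."""
    if key not in table:
        table[key] = [rng.uniform(-1, 1) for _ in range(DIM)]
    return table[key]


def build_flat_lattice(sentence, lexicon):
    """List (token, head, tail) spans: every character, then every lexicon match."""
    spans = [(ch, i, i) for i, ch in enumerate(sentence)]
    for start in range(len(sentence)):
        for end in range(start + 2, len(sentence) + 1):
            word = sentence[start:end]
            if word in lexicon:
                spans.append((word, start, end - 1))
    return spans


def fuse(spans, rng):
    """Concatenate token, radical, and pinyin vectors per span (assumed fusion)."""
    tok_tab, rad_tab, pin_tab = {}, {}, {}
    fused = []
    for token, head, tail in spans:
        # Multi-character words have no single radical or pinyin in this
        # sketch, so they fall back to a shared padding key.
        radical = RADICALS.get(token, "<pad>")
        pinyin = PINYIN.get(token, "<pad>")
        vec = (embed(token, tok_tab, rng)
               + embed(radical, rad_tab, rng)
               + embed(pinyin, pin_tab, rng))
        fused.append((token, head, tail, vec))
    return fused


spans = build_flat_lattice("糖尿病", LEXICON)
features = fuse(spans, random.Random(0))
for token, head, tail, vec in features:
    print(token, head, tail, len(vec))
```

Each span keeps its head and tail index, which is the information a position encoding would consume in the next stage. Concatenation is only one possible fusion choice; summation or a learned gate would fit the same interface.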

Keywords

Chinese Electronic Medical Records; Named Entity Recognition; FLAT-Lattice

Cite This Paper

Ning Wang, Lin Ni. Chinese Named Entity Recognition Based on Multi-Feature Fusion FLAT Model in the Medical Field. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 6: 8-15. https://doi.org/10.25236/AJCIS.2023.060602.
