Academic Journal of Computing & Information Science, 2024, 7(9); doi: 10.25236/AJCIS.2024.070905.
Yi Ren, Yu Liu
The Department of Computer Science and Engineering, Shenyang Jianzhu University, Shenyang, Liaoning, China
Most current methods apply existing models to named entity recognition, but these models obtain only character vectors and cannot resolve polysemy. This study proposes a new model based on multi-level semantic feature fusion and a keyword dictionary to address this problem. First, the method uses a keyword dictionary for random entity replacement to achieve data augmentation. Second, it uses the pre-trained BERT model to transfer prior knowledge to the task and obtain multi-level semantic features. Third, to obtain more comprehensive sequence information, the vectors are fed into the multi-semantic feature fusion layer to extract global information. Finally, the output is obtained after the CRF corrects the results. Compared with traditional models such as BiLSTM-CRF and BERT-CRF, the model achieves good results on news-domain datasets, with an accuracy of 94.95% and an F1 value of 94.99%.
News Domain, Named Entity Recognition, Multi-level Semantic Features Fusion, Keyword Data Augmentation
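The keyword-dictionary augmentation step named in the abstract can be illustrated with a minimal sketch. It assumes character-level tokens with BIO tags and a dictionary mapping each entity type to candidate strings; the function name, the tag scheme, the entity types, and the sample sentence are all hypothetical and are not taken from the paper.

```python
import random


def augment_by_entity_replacement(tokens, tags, keyword_dict, rng):
    """Swap each BIO-tagged entity for a random same-type dictionary entry.

    Assumption: tokens are single characters and tags follow the BIO scheme.
    Entities whose type has no dictionary entries are copied unchanged.
    """
    new_tokens, new_tags = [], []
    i = 0
    while i < len(tokens):
        tag = tags[i]
        if tag.startswith("B-"):
            etype = tag[2:]
            # Find the end of the current entity span.
            j = i + 1
            while j < len(tokens) and tags[j] == "I-" + etype:
                j += 1
            # Keep only non-empty candidates so every entity has >= 1 token.
            candidates = [c for c in keyword_dict.get(etype, []) if c]
            if candidates:
                replacement = list(rng.choice(candidates))
            else:
                replacement = tokens[i:j]
            new_tokens.extend(replacement)
            # Re-label the replacement so tags stay aligned with tokens.
            new_tags.extend(["B-" + etype] + ["I-" + etype] * (len(replacement) - 1))
            i = j
        else:
            new_tokens.append(tokens[i])
            new_tags.append(tag)
            i += 1
    return new_tokens, new_tags


# Hypothetical example sentence and keyword dictionary.
tokens = list("张三在北京工作")
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]
keyword_dict = {"PER": ["李四", "王小明"], "LOC": ["上海", "沈阳"]}

rng = random.Random(0)  # fixed seed so the augmentation is reproducible
aug_tokens, aug_tags = augment_by_entity_replacement(tokens, tags, keyword_dict, rng)
print("".join(aug_tokens), aug_tags)
```

Because the replacement is re-labeled span by span, the augmented sentence keeps one tag per character even when the substituted entity has a different length from the original.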
Yi Ren, Yu Liu. Recognition of News Named Entity Based on Multi-Level Semantic Features Fusion and Keyword Dictionary. Academic Journal of Computing & Information Science (2024), Vol. 7, Issue 9: 32-40. https://doi.org/10.25236/AJCIS.2024.070905.