Label Oriented Hierarchical Attention Neural Network for Short Text Classification

<p>Feng Xia<sup>1,2</sup></p>

doi:10.25236/AJETS.2022.050111

Academic Journal of Engineering and Technology Science, 2022, 5(1); doi: 10.25236/AJETS.2022.050111.

Author(s)

Feng Xia¹,²

Corresponding Author:

Feng Xia

Affiliation(s)

¹Ping An Healthcare and Technology Co Ltd, Shanghai, China

²Renmin University of China, Beijing, China

Abstract

Text classification is a very common task in natural language processing. Conventional text classification models often use a bag-of-words model or a representation model, but existing models are usually designed for long text and are not well suited to short text classification. Short texts provide relatively few features and often need more sophisticated feature extraction, and keywords play an important role in their classification. In this paper, we propose a label-oriented hierarchical attention network for short text classification. On the public Tiao and Weibo data sets, the model achieves better results than a convolutional neural network (CNN), a gated recurrent unit neural network (GRU), a fused GRU and convolutional neural network (GRU-CNN), and the Transformer translation model, which indicates that the model performs well. Our model has two significant advantages: (1) it has a hierarchical structure consisting of two levels of attention mechanisms, which facilitates interpretation and analysis; (2) compared with other architectures, it can be extended to multi-label text classification tasks. In addition, it can be used for term importance analysis in many industrial scenarios.
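
The abstract does not spell out the architecture, so the following is only an illustrative sketch of how two stacked attention levels driven by label embeddings could work, not the paper's implementation. The function names, the dot-product scoring, the second-level query, and the toy vectors are all assumptions made for this example. Level 1 lets each label attend over the tokens; level 2 attends over the resulting label-specific views. The attention weights can then be read as a rough term importance score, which is the kind of interpretability the abstract refers to.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, vectors):
    """Dot-product attention: return (weights, weighted sum of vectors)."""
    weights = softmax([dot(query, v) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, pooled


def label_oriented_attention(token_vecs, label_vecs, top_query):
    """Hypothetical two-level attention (an assumption, not the paper's model)."""
    # Level 1: each label embedding acts as a query over the token vectors.
    token_weights, label_views = [], []
    for label in label_vecs:
        weights, pooled = attend(label, token_vecs)
        token_weights.append(weights)
        label_views.append(pooled)
    # Level 2: a second query attends over the label-specific views.
    label_weights, text_vec = attend(top_query, label_views)
    return token_weights, label_weights, text_vec


def term_importance(token_weights, label_weights):
    """Combine both levels into one importance score per token."""
    n_tokens = len(token_weights[0])
    return [
        sum(lw * tw[i] for lw, tw in zip(label_weights, token_weights))
        for i in range(n_tokens)
    ]


# Toy inputs: three 2-dimensional token vectors and two label embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
labels = [[2.0, 0.0], [0.0, 2.0]]
top_query = [1.0, 1.0]

token_weights, label_weights, text_vec = label_oriented_attention(
    tokens, labels, top_query
)
importance = term_importance(token_weights, label_weights)
print("token weights per label:", token_weights)
print("label weights:", label_weights)
print("term importance:", importance)
```

Because both levels use softmax weights, the combined importance scores sum to 1 over the tokens, which makes them easy to compare across short texts of different lengths.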

Keywords

Short text classification; Hierarchical attention; Term importance; Convolution network; Natural language processing; Neural network

Cite This Paper

Feng Xia. Label Oriented Hierarchical Attention Neural Network for Short Text Classification. Academic Journal of Engineering and Technology Science (2022) Vol. 5, Issue 1: 53-62. https://doi.org/10.25236/AJETS.2022.050111.
