Academic Journal of Computing & Information Science, 2023, 6(6); doi: 10.25236/AJCIS.2023.060620.

Multimodal sentiment recognition based on Bi-LSTM and fusion mechanism

Author(s)

Haoxia Guo1, Ziheng Gao2

Corresponding Author:
Haoxia Guo
Affiliation(s)

1Lanzhou University of Technology, Faculty of Mechatronic Engineering, Lanzhou, China, 730000

2Guilin University of Technology, Faculty of Science, Guilin, China, 541000

Abstract

Multimodal emotion recognition has important application value in artificial intelligence, human-computer interaction, and other fields. With the development of deep learning, emotion recognition has attracted increasing attention from researchers. Existing work has addressed unimodal emotion recognition but has neglected the combination of bidirectional long short-term memory (Bi-LSTM) networks with attention mechanisms. We therefore propose an emotion recognition model based on Bi-LSTM and a multi-head attention mechanism, which combines the LSTM's capacity for long-term memory with the attention mechanism's ability to quickly select the most important information from many inputs, further improving the accuracy of multimodal emotion recognition. Experimental results show that, compared with CNN, CMN, BC-LSTM, and other models, the proposed model achieves better accuracy and F1-score.
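The paper's implementation is not reproduced here, but a minimal sketch can illustrate the kind of architecture the abstract describes: one Bi-LSTM encoder per modality, multi-head self-attention as the fusion mechanism, and a classifier head. The sketch below uses PyTorch; the layer sizes, number of modalities, pooling step, and feature dimensions are illustrative assumptions, not the authors' published configuration.

```python
# Hedged sketch (NOT the authors' code): per-modality Bi-LSTM encoders whose
# outputs are fused with multi-head self-attention before classification.
# All hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

class BiLSTMAttentionFusion(nn.Module):
    def __init__(self, input_dims, hidden_dim=128, num_heads=4, num_classes=6):
        super().__init__()
        # One Bi-LSTM encoder per modality (e.g. text, audio, video features).
        self.encoders = nn.ModuleList([
            nn.LSTM(d, hidden_dim, batch_first=True, bidirectional=True)
            for d in input_dims
        ])
        # Multi-head attention over the concatenated modality sequences lets
        # the model weight the more informative time steps and modalities.
        self.attention = nn.MultiheadAttention(
            embed_dim=2 * hidden_dim, num_heads=num_heads, batch_first=True
        )
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, inputs):
        # inputs: list of (batch, seq_len, input_dim) tensors, one per modality.
        encoded = [enc(x)[0] for enc, x in zip(self.encoders, inputs)]
        fused = torch.cat(encoded, dim=1)                  # join along time axis
        attended, _ = self.attention(fused, fused, fused)  # self-attentive fusion
        return self.classifier(attended.mean(dim=1))       # pool, then classify

# Example with assumed text (300-d), audio (74-d), and video (35-d) features.
model = BiLSTMAttentionFusion(input_dims=[300, 74, 35])
logits = model([torch.randn(8, 20, 300),
                torch.randn(8, 20, 74),
                torch.randn(8, 20, 35)])
print(logits.shape)  # torch.Size([8, 6])
```

In this reading, the "fusion mechanism" of the title is the self-attention step: each Bi-LSTM output attends over all modalities' hidden states, so cross-modal importance is learned rather than fixed by simple concatenation.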

Keywords

Attention mechanism, LSTM, multimodal emotion recognition

Cite This Paper

Haoxia Guo, Ziheng Gao. Multimodal sentiment recognition based on Bi-LSTM and fusion mechanism. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 6: 127-132. https://doi.org/10.25236/AJCIS.2023.060620.
