Academic Journal of Computing & Information Science, 2023, 6(5); doi: 10.25236/AJCIS.2023.060511.

Research of Using Deep Learning Language Model to Classify Depression by Level


Ziyang Liu

Corresponding Author:
Ziyang Liu

Cate School, Carpinteria, California, USA


This article presents a multimodal neural network method that processes audio and text data simultaneously. The method uses BiLSTM and BiGRU network architectures and has broad prospects for clinical and public application, offering significant advantages in depression screening: high accuracy, low cost, and fast speed. It can be applied across the whole population, especially to people with limited access to healthcare. Additionally, it can serve as a fast and effective tool for continuously monitoring the deterioration or improvement of depression.
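The abstract describes per-modality recurrent encoders (BiLSTM/BiGRU) whose outputs are combined for classification. The paper's exact architecture is not given on this page, so the following is only a minimal NumPy sketch of the general idea: each modality is encoded by a bidirectional GRU, the final forward and backward states are concatenated, the two modality embeddings are fused, and a linear head scores depression severity levels. All dimensions, feature choices (MFCC frames, word embeddings), and parameter shapes are illustrative assumptions, not the author's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def gru_step(x, h, W, U, b):
    """One GRU step: update gate z, reset gate r, candidate state h~."""
    z = 1 / (1 + np.exp(-(W[0] @ x + U[0] @ h + b[0])))
    r = 1 / (1 + np.exp(-(W[1] @ x + U[1] @ h + b[1])))
    h_tilde = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])
    return (1 - z) * h + z * h_tilde

def bigru_encode(seq, params):
    """Run one GRU forward and one backward over seq; concatenate final states."""
    W, U, b = params
    d = U[0].shape[0]
    h_f = np.zeros(d)
    for x in seq:                        # forward direction
        h_f = gru_step(x, h_f, W, U, b)
    h_b = np.zeros(d)
    for x in reversed(seq):              # backward direction
        h_b = gru_step(x, h_b, W, U, b)
    return np.concatenate([h_f, h_b])    # 2*d-dimensional embedding

def init_params(d_in, d_hid):
    """Random weights for the three GRU gates (illustrative initialization)."""
    W = [rng.normal(0, 0.1, (d_hid, d_in)) for _ in range(3)]
    U = [rng.normal(0, 0.1, (d_hid, d_hid)) for _ in range(3)]
    b = [np.zeros(d_hid) for _ in range(3)]
    return W, U, b

# Separate encoders per modality (hypothetical feature dimensions).
audio_params = init_params(d_in=13, d_hid=8)   # e.g. 13 MFCCs per audio frame
text_params = init_params(d_in=50, d_hid=8)    # e.g. 50-dim word embeddings

audio_seq = rng.normal(size=(40, 13))  # 40 audio frames
text_seq = rng.normal(size=(12, 50))   # 12 word vectors

# Late fusion: concatenate the two modality embeddings.
fused = np.concatenate([bigru_encode(audio_seq, audio_params),
                        bigru_encode(text_seq, text_params)])

# Linear head over the fused embedding -> depression severity levels.
n_levels = 4
W_out = rng.normal(0, 0.1, (n_levels, fused.size))
logits = W_out @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print("predicted level:", probs.argmax())
```

In a trained system the random weights would of course be learned end-to-end; the sketch only shows how bidirectional encoding and multimodal fusion fit together.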


Multimodal neural network, Depression screening, BiLSTM, BiGRU, Continuous monitoring

Cite This Paper

Ziyang Liu. Research of Using Deep Learning Language Model to Classify Depression by Level. Academic Journal of Computing & Information Science (2023), Vol. 6, Issue 5: 85-90. https://doi.org/10.25236/AJCIS.2023.060511.
