Academic Journal of Computing & Information Science, 2026, 9(1); doi: 10.25236/AJCIS.2026.090110.
Yirui Sun
The National University of Malaysia, Bangi, Selangor, 43000, Malaysia
Facial Expression Recognition (FER) has become a focal point of research in affective computing. However, traditional feature extraction algorithms and shallow neural networks often fail to capture robust semantic features in complex, unconstrained scenarios with drastic illumination changes and partial facial occlusion. To evaluate how deep neural networks perform on such recognition tasks, this study conducts a systematic comparative analysis of VGG16, DenseNet, and three ResNet variants (ResNet18, ResNet34, and ResNet50) on the large-scale public benchmark dataset FER-2013. Because the images are grayscale, we adapted the input layer of each model to accept a single channel, and we integrated Dropout and Batch Normalization into the fully connected layers to suppress overfitting. Experimental results show that ResNet50 achieves the highest validation accuracy, 85.71%, with its residual learning mechanism mitigating the vanishing gradient problem. This performance far surpasses that of VGG16 and DenseNet, both of which failed to generalize adequately because of limited representational capacity or severe overfitting. Overall, ResNet50 shows strong robustness and a clear advantage over the other baseline architectures in capturing the nuances of human emotional expression.
machine learning, deep learning, facial expression recognition, comparative study
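The abstract mentions single-channel adaptation of the input layers but does not say how it was done. The sketch below illustrates one common approach, offered purely as an assumption: collapsing an RGB first-layer kernel into a single-channel kernel by summing its weights across the channel axis. The function names, the 4x4 patch, and the 2x2 kernel are all hypothetical. The check at the end shows that the collapsed kernel applied to a grayscale image gives the same response as the original kernel applied to that image replicated across three channels.

```python
# Illustrative sketch only: the paper does not state how the input layer was adapted.
# Assumption: the first-layer kernel of an RGB model is collapsed to one input
# channel by summing its weights across the channel axis.

def collapse_rgb_kernel(kernel_rgb):
    """Sum a [channels][kh][kw] kernel over its channel axis into a [kh][kw] kernel."""
    kh, kw = len(kernel_rgb[0]), len(kernel_rgb[0][0])
    return [[sum(kernel_rgb[c][i][j] for c in range(len(kernel_rgb)))
             for j in range(kw)] for i in range(kh)]

def conv2d_valid(image, kernel):
    """Single-channel cross-correlation with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)] for r in range(out_h)]

def conv2d_multichannel(channels, kernel_rgb):
    """Multi-channel convolution: the sum of the per-channel responses."""
    per_channel = [conv2d_valid(ch, k) for ch, k in zip(channels, kernel_rgb)]
    return [[sum(resp[r][c] for resp in per_channel)
             for c in range(len(per_channel[0][0]))]
            for r in range(len(per_channel[0]))]

# Hypothetical 4x4 grayscale patch and 2x2 RGB kernel (integers keep the check exact).
gray = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel_rgb = [
    [[1, 0], [0, -1]],
    [[2, 1], [0, 0]],
    [[0, 0], [1, 3]],
]

single_channel_out = conv2d_valid(gray, collapse_rgb_kernel(kernel_rgb))
replicated_out = conv2d_multichannel([gray, gray, gray], kernel_rgb)
```

Summing the channel weights preserves the first-layer response for grayscale input, which matters mainly when starting from weights trained on RGB images. A model trained from scratch could instead declare a one-channel input layer directly.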
Yirui Sun. A Comparative Study of Facial Expression Recognition Based on Deep Residual Networks. Academic Journal of Computing & Information Science (2026), Vol. 9, Issue 1: 80-86. https://doi.org/10.25236/AJCIS.2026.090110.