YOLOv8-Plus: A Small Object Detection Model Based on Fine Feature Capture and Enhanced Attention Convolution Fusion

Hui Li, Xiaoyan Pang

doi:10.25236/AJCIS.2025.080316

Academic Journal of Computing & Information Science, 2025, 8(3); doi: 10.25236/AJCIS.2025.080316.

Author(s)

Hui Li, Xiaoyan Pang

Corresponding Author:

Hui Li

Affiliation(s)

School of Software, Henan Polytechnic University, Jiaozuo, 454000, China

Abstract

Small object detection is valuable in many practical applications. However, because small objects cover few pixels, carry weak feature information, and are easily obscured by background noise, YOLOv8 struggles to detect them, resulting in low recognition accuracy and missed detections. To address these issues, we propose an improved small object detection model, YOLOv8-Plus. First, to help YOLOv8 detect the subtle features of small objects, we add a dedicated output layer, TDLayer, alongside the three original output layers. This new layer generates larger feature maps, allowing fine details in small objects to be better differentiated. Second, to improve feature processing, we design the C2FDSC module, which adaptively adjusts its detection strategy to the shape and characteristics of small objects so that fine details are captured. Finally, to mitigate the impact of background noise, we introduce the EACF module, which combines the advantages of CNNs and attention mechanisms to reduce noise interference, improving both accuracy and robustness in small object detection. Experimental results on the VisDrone2019 dataset show that the improved YOLOv8-Plus model achieves a 6.7% and 4.7% increase in mAP50, respectively, compared to the baseline model. YOLOv8-Plus also outperforms other state-of-the-art models, demonstrating superior small object detection performance in complex scenarios.
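To illustrate why an extra output layer with a larger feature map can help with small objects, the following minimal sketch compares grid resolutions at different downsampling strides. It is not the authors' implementation: the 640×640 input size and the strides of 8, 16, and 32 are the usual YOLOv8 defaults, while the stride of 4 for TDLayer is an assumption made here for illustration, since the abstract does not state it. The helper names `grid_size` and `cells_covered` are hypothetical.

```python
# Assumptions (not taken from the paper): a 640x640 input, the standard
# YOLOv8 head strides of 8, 16 and 32, and a hypothetical stride of 4
# for the added TDLayer.
INPUT_SIZE = 640
STRIDES = {"TDLayer (assumed stride 4)": 4, "P3": 8, "P4": 16, "P5": 32}


def grid_size(input_size: int, stride: int) -> int:
    """Side length of the square feature map produced at a given stride."""
    return input_size // stride


def cells_covered(object_px: int, stride: int) -> float:
    """Approximate number of grid cells an object spans along one side."""
    return object_px / stride


if __name__ == "__main__":
    object_px = 12  # illustrative small object of 12x12 pixels
    for name, stride in STRIDES.items():
        g = grid_size(INPUT_SIZE, stride)
        span = cells_covered(object_px, stride)
        print(f"{name}: {g}x{g} map, object spans {span:.2f} cells per side")
```

Under these assumptions, a 12-pixel object spans three cells per side on the stride-4 map but less than half a cell on the stride-32 map, which is one intuition for why a higher-resolution output layer may preserve more detail for small objects.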

Keywords

Small object detection, YOLOv8, Convolutional neural network, Attention mechanism

Cite This Paper

Hui Li, Xiaoyan Pang. YOLOv8-Plus: A Small Object Detection Model Based on Fine Feature Capture and Enhanced Attention Convolution Fusion. Academic Journal of Computing & Information Science (2025), Vol. 8, Issue 3: 116-125. https://doi.org/10.25236/AJCIS.2025.080316.
