Academic Journal of Computing & Information Science, 2025, 8(11); doi: 10.25236/AJCIS.2025.081106.
Hyper-Ada: Active Data Augmentation via High-Order Feature Refinement with Adaptive Hypergraphs
Sidan Hou, Yu Xiang, Wei Wang, Min Li
School of Information Science and Technology, Yunnan Normal University, Kunming, China
Abstract: Data augmentation (DA) is a critical technique for addressing data scarcity in deep learning. However, traditional random augmentation methods are inefficient, and active learning (AL) often overlooks the high-order correlations among samples. To address these limitations, we propose the Progressive Active Data Augmentation (PADA) framework, which applies the intelligent sample-selection principles of AL to the data augmentation process. Within this framework, we design the core selection strategy, Hyper-Ada, which leverages an Adaptive Hypergraph Convolution (AdaHGConv) network to refine sample features into new representations that embed high-order relations derived from the global context of all augmented samples. As the criterion for identifying high-quality "anchor" samples, we combine the model's prediction "Certainty" with our proposed "Representation Shift", defined as the magnitude of feature change before and after refinement. Experiments on CIFAR-10, SVHN, and CIFAR-100 demonstrate that Hyper-Ada significantly outperforms supervised baselines and traditional AL methods, particularly in data-sparse scenarios, validating the efficacy of guiding data augmentation through high-order feature refinement.
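To make the selection criterion concrete, the following minimal Python sketch (our illustration, not the authors' released code) scores each augmented sample by combining prediction Certainty with Representation Shift. The function names, the min-max scaling of the shift term, and the trade-off weight alpha are assumptions not specified in the abstract.

import numpy as np

def hyper_ada_score(feats_before, feats_after, probs, alpha=0.5):
    # feats_before: (N, D) backbone features of the augmented samples.
    # feats_after:  (N, D) features after hypergraph (AdaHGConv-style) refinement.
    # probs:        (N, C) softmax predictions; alpha is a hypothetical weight.
    # Representation Shift: per-sample L2 norm of the feature change
    # induced by the high-order refinement step.
    shift = np.linalg.norm(feats_after - feats_before, axis=1)
    # Certainty: maximum softmax probability per sample.
    certainty = probs.max(axis=1)
    # Min-max scale the shift so both terms lie in [0, 1] (an assumption).
    shift = (shift - shift.min()) / (np.ptp(shift) + 1e-8)
    return alpha * certainty + (1 - alpha) * shift

def select_anchors(scores, k):
    # Keep the k highest-scoring samples as "anchor" augmentations.
    return np.argsort(scores)[::-1][:k]

Under this reading, a sample is an attractive anchor when the model is already confident about it and the hypergraph refinement moves its representation substantially, suggesting it carries useful high-order context; how the two terms are actually weighted and normalized is left to the paper itself.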
Keywords: Data Augmentation, High-order Features, Active Learning, Adaptive Hypergraph, Image Classification
Sidan Hou, Yu Xiang, Wei Wang, Min Li. Hyper-Ada: Active Data Augmentation via High-Order Feature Refinement with Adaptive Hypergraphs. Academic Journal of Computing & Information Science (2025), Vol. 8, Issue 11: 52-61. https://doi.org/10.25236/AJCIS.2025.081106.