Academic Journal of Computing & Information Science, 2026, 9(1); doi: 10.25236/AJCIS.2026.090108.
Model-Guided Test-Time Augmentation for Robust Text Classification

Yonghao Wu1, Yu Xiang1, Wei Wang1, Tongzhu Zhao1, Tiancai Zhu1
1School of Information Science and Technology, Yunnan Normal University, Kunming, China
Abstract: Test-Time Augmentation (TTA) has the potential to improve robustness in text classification, yet its effectiveness is often limited by semantic drift introduced by indiscriminate text transformations. We propose Model-Guided Test-Time Augmentation (MG-TTA), a selective and training-free algorithm that adaptively chooses augmentation strategies for each test instance. MG-TTA leverages a pre-trained labeling model to evaluate the label consistency of candidate augmented samples and selects only those most aligned with the model’s prediction. The selected augmentations are combined with the original input to form a compact test-time ensemble, and final predictions are obtained by simple probability averaging. Experiments on multiple text classification benchmarks demonstrate that MG-TTA consistently outperforms fixed and random augmentation baselines, highlighting the importance of model-guided augmentation selection at inference time.
Keywords: Test-Time Augmentation, Selective Augmentation, Text Classification, Transformer Models, Model-Guided Inference
Citation: Yonghao Wu, Yu Xiang, Wei Wang, Tongzhu Zhao, Tiancai Zhu. Model-Guided Test-Time Augmentation for Robust Text Classification. Academic Journal of Computing & Information Science (2026), Vol. 9, Issue 1: 63-69. https://doi.org/10.25236/AJCIS.2026.090108.
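To make the procedure described in the abstract concrete, the following is a minimal sketch of per-instance selection and probability averaging, assuming a Hugging Face sequence-classification model and tokenizer. The function name mg_tta_predict, the candidate_augmentations argument, and the top_k cutoff are illustrative choices, not details fixed by the paper.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mg_tta_predict(model, tokenizer, text, candidate_augmentations,
                   top_k=3, device="cpu"):
    """Model-guided TTA sketch: keep only augmented variants whose predicted
    label agrees with the prediction on the original input, then average
    class probabilities over the original plus the selected variants."""

    def probs(t):
        enc = tokenizer(t, return_tensors="pt", truncation=True).to(device)
        return F.softmax(model(**enc).logits, dim=-1).squeeze(0)

    p_orig = probs(text)
    y_orig = p_orig.argmax().item()

    # Score each label-consistent candidate by its probability mass on the
    # original label, and keep the top_k most aligned variants.
    scored = []
    for aug_text in candidate_augmentations:
        p_aug = probs(aug_text)
        if p_aug.argmax().item() == y_orig:  # label-consistency filter
            scored.append((p_aug[y_orig].item(), p_aug))
    scored.sort(key=lambda s: s[0], reverse=True)
    kept = [p for _, p in scored[:top_k]]

    # Compact test-time ensemble: simple probability averaging.
    return torch.stack([p_orig] + kept).mean(dim=0)
```

In this sketch, agreement between the augmented prediction and the original label serves as the consistency test, and the probability mass assigned to that label ranks the surviving candidates; the paper's exact selection criterion may differ. If no candidate passes the filter, the sketch falls back to the prediction on the original input alone.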