Academic Journal of Computing & Information Science, 2026, 9(1); doi: 10.25236/AJCIS.2026.090108.
Model-Guided Test-Time Augmentation for Robust Text Classification

Yonghao Wu1, Yu Xiang1, Wei Wang1, Tongzhu Zhao1, Tiancai Zhu1
1School of Information Science and Technology, Yunnan Normal University, Kunming, China
Abstract: Test-Time Augmentation (TTA) has the potential to improve robustness in text classification, yet its effectiveness is often limited by semantic drift introduced by indiscriminate text transformations. We propose Model-Guided Test-Time Augmentation (MG-TTA), a selective and training-free algorithm that adaptively chooses augmentation strategies for each test instance. MG-TTA leverages a pre-trained labeling model to evaluate the label consistency of candidate augmented samples and selects only those most aligned with the model’s prediction. The selected augmentations are combined with the original input to form a compact test-time ensemble, and final predictions are obtained by simple probability averaging. Experiments on multiple text classification benchmarks demonstrate that MG-TTA consistently outperforms fixed and random augmentation baselines, highlighting the importance of model-guided augmentation selection at inference time.
Keywords: Test-Time Augmentation, Selective Augmentation, Text Classification, Transformer Models, Model-Guided Inference
Citation: Yonghao Wu, Yu Xiang, Wei Wang, Tongzhu Zhao, Tiancai Zhu. Model-Guided Test-Time Augmentation for Robust Text Classification. Academic Journal of Computing & Information Science (2026), Vol. 9, Issue 1: 63-69. https://doi.org/10.25236/AJCIS.2026.090108.
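To make the procedure described in the abstract concrete, the following is a minimal sketch of per-instance selection and probability averaging, assuming a Hugging Face sequence-classification model and tokenizer. The function name mg_tta_predict, the candidate_augmentations argument, and the top_k cutoff are illustrative choices, not details fixed by the paper.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mg_tta_predict(model, tokenizer, text, candidate_augmentations,
                   top_k=3, device="cpu"):
    """Model-guided TTA sketch: keep only augmented variants whose predicted
    label agrees with the prediction on the original input, then average
    class probabilities over the original plus the selected variants."""

    def probs(t):
        enc = tokenizer(t, return_tensors="pt", truncation=True).to(device)
        return F.softmax(model(**enc).logits, dim=-1).squeeze(0)

    p_orig = probs(text)
    y_orig = p_orig.argmax().item()

    # Score each label-consistent candidate by its probability mass on the
    # original label, and keep the top_k most aligned variants.
    scored = []
    for aug_text in candidate_augmentations:
        p_aug = probs(aug_text)
        if p_aug.argmax().item() == y_orig:  # label-consistency filter
            scored.append((p_aug[y_orig].item(), p_aug))
    scored.sort(key=lambda s: s[0], reverse=True)
    kept = [p for _, p in scored[:top_k]]

    # Compact test-time ensemble: simple probability averaging.
    return torch.stack([p_orig] + kept).mean(dim=0)
```

In this sketch, agreement between the augmented prediction and the original label serves as the consistency test, and the probability mass assigned to that label ranks the surviving candidates; the paper's exact selection criterion may differ. If no candidate passes the filter, the sketch falls back to the prediction on the original input alone.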