Academic Journal of Computing & Information Science, 2025, 8(3); doi: 10.25236/AJCIS.2025.080303.
Mengsong Wang, Xingchen Wu
College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo City, China
To address the classification uncertainty caused by redundant information in incomplete information systems (IIS), rough sets are extended with a tolerance relation, and an instance selection algorithm (ISM) and an attribute reduction algorithm (ARM) based on the nearest tolerance relation are proposed. First, the computation of the approximation sets is vectorized using a matrix approach. Based on the resulting lower approximations, ISM determines the instance set to retain. ARM uses attribute dependency as heuristic information: starting from the core attribute set and working bottom-up, it adds non-core attributes to the core set according to their external importance, yielding a reduction set. Experimental results on nine UCI datasets show that the matrix-based ISM and ARM algorithms effectively remove redundant samples and attributes and improve several performance indicators. Compared with the original datasets, ISM achieved an average instance reduction ratio of 44.2%, and ARM achieved an average attribute reduction ratio of 33%. Compared with other attribute reduction algorithms, ARM improves classification accuracy on KNN and SVM classifiers by 1.15% overall.
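The pipeline summarized in the abstract (matrix-form approximations, instance selection from the lower approximations, and dependency-driven attribute reduction) can be illustrated with a minimal NumPy sketch. It rests on several assumptions that the abstract does not confirm: the classical tolerance relation for incomplete data (two objects are tolerant when every attribute value is equal or at least one is missing) stands in for the paper's nearest tolerance relation, missing values are encoded as NaN, and instance selection simply keeps the objects that fall in some lower approximation. All function names and the toy table are hypothetical.

```python
import numpy as np

def tolerance_matrix(X, attrs):
    """Boolean n x n matrix: T[i, j] is True when objects i and j agree on
    every attribute in `attrs`, treating NaN (assumed missing) as a wildcard."""
    sub = X[:, np.asarray(attrs, dtype=int)]
    a, b = sub[:, None, :], sub[None, :, :]
    return ((a == b) | np.isnan(a) | np.isnan(b)).all(axis=2)

def lower_approximations(T, y):
    """Column k marks the objects whose whole tolerance class lies in class k."""
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(int)   # one-hot labels (n, c)
    outside = T.astype(int) @ (1 - Y)                  # tolerant objects outside class k
    return classes, outside == 0

def dependency(X, y, attrs):
    """Fraction of objects in the positive region (union of lower approximations)."""
    _, lower = lower_approximations(tolerance_matrix(X, attrs), y)
    return float(lower.any(axis=1).mean())

def select_instances(X, y, attrs):
    """Assumed selection rule: keep objects that belong to some lower approximation."""
    _, lower = lower_approximations(tolerance_matrix(X, attrs), y)
    return np.flatnonzero(lower.any(axis=1))

def reduce_attributes(X, y):
    """Start from the core, then greedily add the attribute that raises
    dependency the most until it matches that of the full attribute set."""
    all_attrs = list(range(X.shape[1]))
    full = dependency(X, y, all_attrs)
    # Core: attributes whose removal lowers the dependency.
    red = [a for a in all_attrs
           if dependency(X, y, [b for b in all_attrs if b != a]) < full]
    while dependency(X, y, red) < full:
        rest = [a for a in all_attrs if a not in red]
        red.append(max(rest, key=lambda a: dependency(X, y, red + [a])))
    return sorted(red)

# Toy incomplete decision table: the label is XOR of attributes 0 and 1,
# attribute 2 duplicates attribute 0, and object 4 has a missing value.
X = np.array([[0, 0, 0],
              [0, 1, 0],
              [1, 0, 1],
              [1, 1, 1],
              [1, np.nan, 1]], dtype=float)
y = np.array([0, 1, 1, 0, 0])

print(dependency(X, y, [0, 1, 2]))        # 0.6
print(select_instances(X, y, [0, 1, 2]))  # [0 1 3]
print(reduce_attributes(X, y))            # [0, 1]
```

In this toy table the missing value makes object 4 tolerant with objects 2 and 3, which carry different labels, so objects 2 and 4 drop out of the positive region. Attribute 2 is discarded because it adds nothing beyond attribute 0.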
tolerance relation; instance selection; attribute reduction; attribute dependency; matrix
Mengsong Wang, Xingchen Wu. Instance selection and attribute reduction based on the nearest tolerance relation. Academic Journal of Computing & Information Science (2025), Vol. 8, Issue 3: 18-27. https://doi.org/10.25236/AJCIS.2025.080303.