چکیده
|
Abstract Purpose – The purpose of this study is to enhance data quality and overall accuracy and improve certainty by reducing the negative impacts of the FCM algorithm while clustering real-world data and also decreasing the inherent noise in data sets. Design/methodology/approach – The present study proposed a new effective model based on fuzzy C-means (FCM), ensemble filtering (ENS) and machine learning algorithms, called an FCM-ENS model. This model is mainly composed of three parts: noise detection, noise filtering and noise classification. Findings – The performance of the proposed model was tested by conducting experiments on six data sets from the UCI repository. As shown by the obtained results, the proposed noise detection model very effectively detected the class noise and enhanced performance in case the identified class noisy instances were removed. Originality/value – To the best of the authors’ knowledge, no effort has been made to improve the FCM algorithm in relation to class noise detection issues. Thus, the novelty of existing research is combining the FCM algorithm as a noise detection technique with ENS to reduce the negative effect of inherent noise and increase data quality and accuracy.
|