Model
Digital Document
Publisher
Florida Atlantic University
Description
This thesis presents a noise-handling technique that aims to improve the quality of training data for classification by eliminating instances that are likely to be noise. Our approach uses twenty-five different classification techniques to build an ensemble of classifiers that acts as a noise filter on real-world software measurement datasets. Using a relatively large number of base-level classifiers in the ensemble-classifier filter makes it possible to choose among several filtering levels and so achieve the desired degree of conservativeness in noise removal. It also provides greater confidence in the noise elimination procedure, because with twenty-five base-level classifiers the results are less likely to be influenced by the (possibly) inappropriate learning bias of a few algorithms than they would be with a smaller number of base-level classifiers. Empirical case studies of two high-assurance software projects demonstrate the effectiveness of our noise elimination approach through significant improvements in classification accuracy at various levels of filtering.
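The ensemble-filter idea in the description can be illustrated with a small sketch. This is not the thesis's implementation. It assumes that each base classifier predicts every instance through cross-validation and that an instance is flagged as noise when at least `level` classifiers misclassify it. Three toy learners stand in for the twenty-five techniques, and all function names and the data are hypothetical.

```python
from collections import Counter

def make_knn(k):
    """Return a trainer for a k-nearest-neighbour classifier (toy stand-in)."""
    def train(X, y):
        def predict(x):
            order = sorted(range(len(X)),
                           key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)))
            votes = Counter(y[i] for i in order[:k])
            return votes.most_common(1)[0][0]
        return predict
    return train

def train_centroid(X, y):
    """Nearest-centroid classifier (toy stand-in)."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    def predict(x):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)))
    return predict

def misclassification_counts(X, y, trainers, n_folds):
    """For each instance, count how many base classifiers misclassify it
    when it is held out (assumed protocol: simple interleaved folds)."""
    counts = [0] * len(X)
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        for train in trainers:
            predict = train(X_tr, y_tr)
            for i in test_idx:
                if predict(X[i]) != y[i]:
                    counts[i] += 1
    return counts

def noisy_indices(counts, level):
    """Flag instances misclassified by at least `level` base classifiers.
    A higher level means a more conservative filter."""
    return [i for i, c in enumerate(counts) if c >= level]

# Hypothetical data: two well-separated clusters, with instance 10 mislabeled.
X = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0),
     (1.0, 0.0), (11.0, 10.0), (1.0, 1.0), (11.0, 11.0),
     (0.5, 0.5), (10.5, 10.5), (0.2, 0.8), (10.2, 10.8)]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1]

trainers = [make_knn(1), make_knn(3), train_centroid]
counts = misclassification_counts(X, y, trainers, n_folds=4)

conservative = noisy_indices(counts, level=len(trainers))  # all classifiers agree
aggressive = noisy_indices(counts, level=1)                # any classifier objects
```

In this toy run, the most conservative level flags only the mislabeled instance, while the most aggressive level also discards clean instances that a single learner got wrong. That trade-off is why the number of agreeing classifiers works as a tunable filtering level.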
Member of