Computer software--Quality control

Predicting decay in program modules of legacy software systems

Model

Digital Document

Publisher

Florida Atlantic University

Description

Legacy software systems may go through many releases. It is important to ensure that the reliability of a system improves with subsequent releases. Methods are needed to identify decaying software modules, i.e., modules whose quality decreases with each system release. Early identification of such modules during the software life cycle allows us to focus quality improvement efforts more productively, by reducing the resources wasted on testing and improving the entire system. We present a scheme to classify modules into three groups (Decayed, Improved, and Unchanged) based on a three-group software quality classification method. This scheme is applied to three different case studies, using a case-based reasoning three-group classification model. The model identifies decayed modules and is validated over different releases. The main goal of this work is to focus on the evolution of program modules of a legacy software system to identify modules that are difficult to maintain and may need to be reengineered.

Member of

FAU Theses and Dissertations

Combining decision trees for software quality classification: An empirical study

Model

Digital Document

Geleyn, Erik.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

The increased reliance on computer systems in the modern world has created a need to engineer the reliability of computer systems to the highest standards. Software quality classification models are among the important tools for achieving high reliability. They can be used to calibrate software metrics-based models to predict whether software modules are fault-prone or not. Timely use of such models can aid in detecting faults early in the life cycle. Individual classifiers may be improved by using the combined decision from multiple classifiers. Several algorithms implement this concept and are investigated in this thesis. These combined learners provide the software quality modeling community with accurate, robust, and goal-oriented models. This study presents a comprehensive comparative evaluation of meta learners using a strong and a weak learner, C4.5 and Decision Stump, respectively. Two case studies of industrial software systems are used in our empirical investigations.

Member of

FAU Theses and Dissertations

Software quality classification using rule-based modeling

Model

Digital Document

Mao, Meihui.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Software-based products are part of our daily life. They can be encountered in most of the systems we interact with. This reliance on software products generates a strong need for better software reliability, which reduces the cost associated with potential failures. Reliability in software systems may be achieved through additional testing. However, extensive software testing is expensive and time-consuming. Software quality classification models provide an early prediction of a module's quality. Boolean Discriminant Function (BDF), Generalized Boolean Discriminant Function (GBDF), and Rule-Based Modeling (RBM) can be used as classification models. This thesis demonstrates the ability of GBDF and RBM to correctly classify modules. The introduction of the AND operator in the GBDF model and the customizable outcomes for the rules in RBM enhanced the discriminating quality of GBDF and RBM as compared to BDF. Furthermore, they also yielded better balances for the misclassification rates.

Member of

FAU Theses and Dissertations

Classification of software quality with tree modeling using C4.5 algorithm

Model

Digital Document

Ponnuswamy, Viswanathan Kolathupalayam.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Developing highly reliable software is a must in today's competitive environment. However, quality control is a costly and time-consuming process. If the quality of software modules being developed can be predicted early in their life cycle, resources can be allocated effectively, improving quality and reducing cost and development time. This study examines the C4.5 algorithm as a tool for building classification trees that classify software modules as either fault-prone or not fault-prone. The classification tree models were developed based on four consecutive releases of a very large legacy telecommunication system. The first two releases were used as training data sets and the subsequent two releases were used as test data sets to evaluate the model. We found that C4.5 was able to build compact classification tree models with balanced misclassification rates.

Member of

FAU Theses and Dissertations

Empirical study of analogy-based software quality classification models

Model

Digital Document

Ross, Fletcher Douglas.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Time and cost are among the most important elements in a software project. By using time and resources efficiently, we can reduce costs. Any program can potentially contain faults. If we can identify those program modules that have better quality and are less likely to be fault-prone, then we can reduce the effort and cost required in testing these modules. This thesis presents a series of studies evaluating the use of Case-Based Reasoning (CBR) as an effective method for classifying program modules based upon their quality. We believe that this is the first time that two modeling techniques have been used in conjunction with CBR: the Mahalanobis distance, a distance measure that uses the covariance matrix of the independent variables and thereby accounts for the multicollinearity of the data without the need for preprocessing; and data clustering, wherein the data are separated into groups based on a dependent variable.
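
As an illustration of the distance measure named in this description, the following is a minimal sketch of the Mahalanobis distance for modules described by two software metrics. It is not taken from the thesis: the function name `mahalanobis_2d`, the restriction to two metrics, and the example covariance values are assumptions made to keep the example short.

```python
import math


def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between two 2-metric modules x and y.

    cov is the 2x2 covariance matrix of the two metrics, given as
    ((var_a, cov_ab), (cov_ab, var_b)). This helper is hypothetical
    and only illustrates the general formula
    sqrt((x - y)^T * inverse(cov) * (x - y)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - y[0], x[1] - y[1])
    # Quadratic form dx^T * inv * dx.
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)


# With an identity covariance matrix the measure reduces to Euclidean distance.
euclidean_case = mahalanobis_2d((0, 0), (3, 4), ((1, 0), (0, 1)))
# A metric with larger variance contributes less to the distance.
scaled_case = mahalanobis_2d((2, 0), (0, 0), ((4, 0), (0, 1)))
```

Because the covariance matrix enters through its inverse, correlated or widely spread metrics are down-weighted automatically, which is why no separate preprocessing step is required.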

Member of

FAU Theses and Dissertations

Implementation of a three-group classification model using case-based reasoning

Model

Digital Document

Song, Huiming.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Reliability is becoming a very important and competitive factor for software products. Software quality models based on software metrics provide a systematic and scientific way to detect software faults early and to improve software reliability. Classification models for software quality usually classify observations using two groups. This thesis presents a new algorithm for classification using three groups, i.e., Three-Group Classification Model using Case Based Reasoning. The basic idea behind the algorithm is that it uses the commonly used two-group classification method three times. This algorithm can be implemented with other techniques such as logistic regression, classification tree models, etc. This work compares its quality with the Discriminant Analysis method. We find that our new method performs much better than Discriminant Analysis. We also show that the addition of object-oriented software measures yielded a model that a practitioner may actually prefer over the simpler procedural measures model.

Member of

FAU Theses and Dissertations

Empirical study of module order models

Model

Digital Document

Adipat, Boonlit.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Most software reliability approaches classify modules as fault-prone or not fault-prone by way of a predetermined threshold. However, it may not be practical to predefine a threshold because the amount of resources available for reliability enhancement may be unknown. A module-order model (MOM), which predicts the rank order of modules, can be used to solve this problem. The objective of this research is to conduct an empirical study of MOMs based on five different underlying quantitative software quality models. We examine the benefits of principal components analysis with MOM and demonstrate that better accuracy of underlying techniques does not always yield better performance with MOM. Three case studies of large industrial software systems were conducted. The results confirm that MOM can create efficient models using different underlying techniques that provide varying accuracy when predicting a quantitative software quality factor over the data sets.
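
The ranking idea in this description can be sketched as follows. This is a simplified illustration rather than the procedure used in the thesis: the function names, the `cutoff` parameter, and the sample fault counts are all hypothetical.

```python
def module_order(predicted):
    """Return module ids ranked from most to least predicted faults."""
    return sorted(predicted, key=lambda m: predicted[m], reverse=True)


def faults_captured(predicted, actual, cutoff):
    """Fraction of actual faults found in the top `cutoff` share of the ranking.

    predicted and actual map module id -> fault count; cutoff is a
    fraction in (0, 1] chosen once the available resources are known.
    """
    ranking = module_order(predicted)
    k = max(1, round(cutoff * len(ranking)))
    total = sum(actual.values())
    if total == 0:
        return 0.0
    return sum(actual[m] for m in ranking[:k]) / total


# Hypothetical predicted and actual fault counts for four modules.
predicted = {"a": 5.0, "b": 1.0, "c": 3.0, "d": 0.5}
actual = {"a": 4, "b": 0, "c": 1, "d": 5}

ranking = module_order(predicted)
half = faults_captured(predicted, actual, 0.5)
```

Since the model produces an ordering rather than a fixed class label, the cutoff can be chosen late, and the same ranking can be evaluated at several cutoffs.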

Member of

FAU Theses and Dissertations

Software fault prediction using tree-based models

Model

Digital Document

Seliya, Naeem A.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Maintaining superior quality and reliability in software systems is of utmost importance in today's world. Early fault prediction is a proven method for achieving this. Tree-based modelling is a simple and effective method that can be used to predict the number of faults in a software system. In this thesis, we use regression tree-based modelling to predict the number of faults in a software module. The goal of this study is four-fold. First, we compare the tree-based modelling tools CART and S-PLUS; CART yielded simpler regression trees than those built by S-PLUS. Second, we compare the least squares and the least absolute deviation methods of CART; the latter yielded better results than the former. Third, we study the possible benefits of using principal components analysis when performing regression tree modelling. Fourth, we compare tree-based modelling with other prediction techniques, namely Case-Based Reasoning, Artificial Neural Networks, and Multiple Linear Regression.

Member of

FAU Theses and Dissertations

Empirical study of analogy-based software fault prediction

Model

Digital Document

Sundaresh, Nandini.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Ensuring quality and reliability in software is important given its growing use in day-to-day life. Having an estimate of the number of faults in software modules early in their life cycles enables software project managers to direct testing efforts toward the modules considered risky and reduces the waste of resources in testing the entire software system. Case-based reasoning (CBR) is one method for predicting the number of faults in software. The scope of this thesis is two-fold. First, it empirically investigates the effects of different factors on the predictive accuracy of CBR. Experiments were done to compare different similarity functions, solution processes, and maximum numbers of nearest neighbors. Second, it compares the predictive accuracy of CBR models with multiple linear regression and artificial neural network models. The average absolute error and average relative error are used to assess the prediction accuracy of each model.
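
The two accuracy measures named in this description can be sketched as below. The description does not give their exact definitions, so the formulas here are assumptions: in particular, adding 1 to the denominator of the relative error is one common way to avoid division by zero for fault-free modules, and the thesis may define the measure differently.

```python
def average_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over all modules."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def average_relative_error(actual, predicted):
    """Mean of |actual - predicted| / (actual + 1) over all modules.

    The "+ 1" is an assumed convention that keeps the ratio defined
    when a module has zero actual faults.
    """
    return sum(abs(y - p) / (y + 1) for y, p in zip(actual, predicted)) / len(actual)


# Hypothetical fault counts for three modules.
actual_faults = [0, 1, 3]
predicted_faults = [1, 1, 1]

aae = average_absolute_error(actual_faults, predicted_faults)
are = average_relative_error(actual_faults, predicted_faults)
```

Lower values of either measure indicate predictions closer to the observed fault counts.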

Member of

FAU Theses and Dissertations

Modeling software quality at system and subsystem level with TREEDISC classification algorithm

Model

Digital Document

Liu, Jinxia.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Software quality models are tools for detecting faults early in the software development process. In this research, the TREEDISC algorithm and a general classification rule were used to create classification tree models and predict software quality by classifying software modules as fault-prone or not fault-prone. Software metrics were collected from four consecutive releases of a very large legacy telecommunications system with six subsystems. Using release 1, four classification tree models were built using raw metrics, and another four tree models were built using PCA metrics. Models were then selected based on release 2. Releases 3 and 4 were used to validate the selected model. Models that used PCA metrics were as good as or better than models that used raw metrics. This study also investigated the performance of classification tree models when the subsystem identifier was included as a predictor.

Member of

FAU Theses and Dissertations
