Computer software--Quality control

Rough Set-Based Software Quality Models and Quality of Data

Model

Digital Document

Publisher

Florida Atlantic University

Description

In this dissertation we address two significant issues: software quality modeling and
data quality assessment. Software quality can be measured by software
reliability. Reliability is often measured in terms of the time between system failures. A
failure is caused by a fault, which is a defect in the executable software product. The time
between system failures depends both on the presence of faults and on the usage pattern of the software.
Finding faulty components in the development cycle of a software system can lead to a
more reliable final system and will reduce development and maintenance costs. The issue of
software quality is investigated by proposing a new approach, the rule-based classification model
(RBCM), which uses rough set theory to generate decision rules to predict software quality.
The new model minimizes over-fitting by balancing the Type I and Type II misclassification
error rates. We also propose a model selection technique for rule-based models called rule-based
model selection (RBMS). The proposed rule-based model selection technique utilizes
the complete and partial matching rule sets of candidate RBCMs to determine the model
with the least amount of over-fitting. In the experiments that were performed, the RBCMs
were effective at identifying faulty software modules, and the RBMS technique was able to
identify RBCMs that minimized over-fitting.
Good data quality is a critical component for building effective software quality models.
We address the significance of the quality of data on the classification performance of learners
by conducting a comprehensive comparative study. Several trends were observed in the
experiments. Class and attribute noise had the greatest impact on the performance of learners
when they occurred simultaneously in the data. Class noise had a significant impact on the
performance of learners, while attribute noise had no impact when it occurred in less than
40% of the most significant independent attributes. Random Forest (RF100), a group of 100
decision trees, was the most accurate and robust learner in all the experiments with noisy
data.
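The Type I and Type II misclassification rates mentioned above can be illustrated with a minimal sketch. It assumes the convention that a Type I error is a not fault-prone module predicted as fault-prone, and a Type II error is a fault-prone module predicted as not fault-prone. The labels "fp" and "nfp", the function name, and the sample data are hypothetical and are not taken from the dissertation.

```python
def misclassification_rates(actual, predicted):
    """Return (type_i_rate, type_ii_rate) for fault-proneness labels.

    Assumed convention (not confirmed by the abstract):
      Type I  = actual "nfp" module predicted as "fp"
      Type II = actual "fp" module predicted as "nfp"
    """
    pairs = list(zip(actual, predicted))
    n_nfp = sum(1 for a, _ in pairs if a == "nfp")
    n_fp = sum(1 for a, _ in pairs if a == "fp")
    type_i = sum(1 for a, p in pairs if a == "nfp" and p == "fp")
    type_ii = sum(1 for a, p in pairs if a == "fp" and p == "nfp")
    # Each rate is taken over the modules of the corresponding actual class.
    type_i_rate = type_i / n_nfp if n_nfp else 0.0
    type_ii_rate = type_ii / n_fp if n_fp else 0.0
    return type_i_rate, type_ii_rate


# Hypothetical labels for six modules.
actual = ["nfp", "nfp", "nfp", "nfp", "fp", "fp"]
predicted = ["nfp", "fp", "nfp", "nfp", "fp", "nfp"]
type_i_rate, type_ii_rate = misclassification_rates(actual, predicted)
print(type_i_rate, type_ii_rate)
```

A model whose two rates are close to each other is "balanced" in the sense used in the abstract. How the RBCM itself achieves that balance is not specified here.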

Member of

FAU Theses and Dissertations

Prediction of software quality using classification tree modeling

Model

Digital Document

Naik, Archana B.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Reliability of software systems is a major concern today, as computers have become an integral part of our lives. Society has become so dependent on reliable software systems that failures can be dangerous: they can damage a company's business, harm human relationships, or endanger human lives. Software quality models are tools for focusing efforts to find faults early in development. In this experiment, we used classification tree modeling techniques to predict software quality by classifying program modules as either fault-prone or not fault-prone. We introduced the Classification And Regression Trees (CART) algorithm as a tool to generate classification trees. We focused our experiments on a very large telecommunications system, building quality models that use a set of product and process metrics as independent variables.

Member of

FAU Theses and Dissertations

Improved models of software quality

Model

Digital Document

Szabo, Robert Michael.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Though software development has been evolving for over 50 years, the development of computer software systems has largely remained an art. Through the application of measurable and repeatable processes, efforts have been made to slowly transform the software development art into a rigorous engineering discipline. The potential gains are tremendous. Computer software pervades modern society in many forms. For example, the automobile, radio, television, telephone, refrigerator, and still camera have all been transformed by the introduction of computer-based controls. The quality of these everyday products is in part determined by the quality of the computer software running inside them. Therefore, the timely delivery of low-cost, high-quality software to enable these mass-market products becomes very important to the long-term success of the companies building them. It is not surprising that managing the number of faults in computer software to competitive levels is a prime focus of the software engineering activity. In support of this activity, many models of software quality have been developed to help control the software development process and ensure that our goals of cost and quality are met on time. In this study, we focus on the software quality modeling activity. We improve existing static and dynamic methodologies and demonstrate new ones in a coordinated attempt to provide engineering methods applicable to the development of computer software. We will show how the power of separate predictive and classification models of software quality may be combined into one model; introduce a three-group fault classification model in the object-oriented paradigm; demonstrate a dynamic modeling methodology of the testing process and show how software product measures and software process measures may be incorporated as input to such a model; and demonstrate a relationship between software product measures and the testability of software.
The following methodologies were considered: principal components analysis, multiple regression analysis, Poisson regression analysis, discriminant analysis, time series analysis, and neural networks. Commercial-grade software systems are used throughout this dissertation to demonstrate concepts and validate new ideas. As a result, we hope to incrementally advance the state of the software engineering "art".

Member of

FAU Theses and Dissertations

Classification of software quality using tree modeling with the SPRINT/SLIQ algorithm

Model

Digital Document

Mao, Wenlei.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Providing high-quality software products is the common goal of all software engineers. Finding faults early can produce large savings over the software life cycle. Therefore, software quality has become the main subject in our research field. This thesis presents a series of studies on a very large legacy telecommunication system. The system has significantly more than ten million lines of code written in a high-level language similar to Pascal. Software quality models were developed to predict the class of each module as either fault-prone or not fault-prone. We used the SPRINT/SLIQ algorithm to build the classification tree models. We found that SPRINT/SLIQ, as an improved CART algorithm, can give tree models with more accuracy, more balance, and less overfitting. We also found that software process metrics can significantly improve the predictive accuracy of software quality models.

Member of

FAU Theses and Dissertations

Cost of misclassification in software quality models

Model

Digital Document

Guan, Xin.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Reliability has become a very important and competitive factor for software products. Using software quality models based on software measurements provides a systematic and scientific way to detect software faults early and to improve software reliability. This thesis considers several classification techniques, including the Generalized Classification Rule and the MetaCost, Cost-Boosting, and AdaCost algorithms. We also introduce the weighted logistic regression algorithm and a new method for evaluating the performance of classification models: ROC analysis. We focus our experiments on a very large legacy telecommunications system (LLTS) to build software quality models with principal components analysis. Two other data sets, CCCS and LTS, are also used in our experiments.

Member of

FAU Theses and Dissertations

Modeling software quality with TREEDISC algorithm

Model

Digital Document

Yuan, Xiaojing

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Software quality is crucial both to software makers and customers. However, in reality, improvement of quality and reduction of costs are often at odds. Software modeling can help us to detect fault-prone software modules based on software metrics, so that we can focus our limited resources on fewer modules and lower the cost but still achieve high quality. In the present study, a tree classification modeling technique, TREEDISC, was applied to three case studies. Several major contributions have been made. First, preprocessing of raw data was adopted to solve the computer memory problem and improve the models. Secondly, TREEDISC was thoroughly explored by examining the roles of important parameters in modeling. Thirdly, a generalized classification rule was introduced to balance misclassification rates and decrease type II error, which is considered more costly than type I error. Fourthly, certainty of classification was addressed. Fifthly, TREEDISC modeling was validated over multiple releases of a software product.
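The idea of a classification rule with a tunable parameter that trades type I error against type II error can be sketched as follows. This is an illustration under stated assumptions, not the rule defined in the thesis: it assumes each module has an estimated probability of being fault-prone (for example, from a tree leaf) and that a module is labeled not fault-prone only when the odds in its favor are at least a threshold c. The names classify, error_rates, pick_threshold, and all data values are hypothetical.

```python
def classify(p_fp, c):
    """Label a module from its estimated fault-prone probability p_fp.

    Assumed rule: predict "nfp" when p_nfp / p_fp >= c, otherwise "fp".
    A larger c flags more modules as fault-prone, which lowers type II
    errors at the price of more type I errors.
    """
    p_nfp = 1.0 - p_fp
    # Written as a product to avoid dividing by zero when p_fp == 0.
    return "nfp" if p_nfp >= c * p_fp else "fp"


def error_rates(actual, predicted):
    """Return (type_i_rate, type_ii_rate); assumes both classes occur."""
    pairs = list(zip(actual, predicted))
    n_nfp = sum(1 for a, _ in pairs if a == "nfp")
    n_fp = sum(1 for a, _ in pairs if a == "fp")
    type_i = sum(1 for a, p in pairs if a == "nfp" and p == "fp")
    type_ii = sum(1 for a, p in pairs if a == "fp" and p == "nfp")
    return type_i / n_nfp, type_ii / n_fp


def pick_threshold(probs, actual, candidates):
    """Pick the candidate c whose two error rates are closest together."""
    best_c, best_gap = None, float("inf")
    for c in candidates:
        predicted = [classify(p, c) for p in probs]
        type_i_rate, type_ii_rate = error_rates(actual, predicted)
        gap = abs(type_i_rate - type_ii_rate)
        if gap < best_gap:
            best_c, best_gap = c, gap
    return best_c


# Hypothetical training data: estimated probabilities and true classes.
probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.15, 0.35, 0.60]
actual = ["nfp", "nfp", "nfp", "nfp", "nfp", "fp", "fp", "fp"]
best_c = pick_threshold(probs, actual, [1.0, 2.0, 3.0, 5.0])
print(best_c)
```

Selecting c by the smallest gap between the two rates is only one possible balancing criterion. A project that weights type II errors more heavily could instead choose the largest c whose type I rate stays acceptable.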

Member of

FAU Theses and Dissertations

Modeling software quality with classification trees using principal components analysis

Model

Digital Document

Shan, Ruqun.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Software quality models often have raw software metrics as the input data for predicting quality. Raw metrics are usually highly correlated with one another and thus may result in unstable models. Principal components analysis is a statistical method to improve model stability. This thesis presents a series of studies on a very large legacy telecommunication system. The system has significantly more than ten million lines of code written in a high-level language similar to Pascal. Software quality models were developed to predict the class of each module as either fault-prone or not fault-prone. We found that the models based on principal components analysis were more robust than those based on raw metrics. We also found that software process metrics can significantly improve the predictive accuracy of software quality models.
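Why principal components analysis helps with correlated metrics can be seen in a small two-metric sketch. It assumes two hypothetical raw metrics (lines of code and a complexity count), standardizes them, and uses the closed-form eigen-decomposition of the 2x2 correlation matrix. The thesis works with many more metrics; the function name and the data here are illustrative assumptions only.

```python
import math


def principal_components_2d(xs, ys):
    """PCA for two metrics via their 2x2 correlation matrix.

    Returns (explained_ratio, pc1_scores, pc2_scores), where
    explained_ratio is the share of total variance carried by the
    first principal component.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Standardize so both metrics have mean 0 and unit variance.
    zx = [(x - mean_x) / sd_x for x in xs]
    zy = [(y - mean_y) / sd_y for y in ys]
    r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)
    # [[1, r], [r, 1]] has eigenvalues 1 + |r| and 1 - |r|, with
    # eigenvectors (1, s) / sqrt(2) and (1, -s) / sqrt(2), s = sign(r).
    s = 1.0 if r >= 0 else -1.0
    root2 = math.sqrt(2.0)
    pc1 = [(a + s * b) / root2 for a, b in zip(zx, zy)]
    pc2 = [(a - s * b) / root2 for a, b in zip(zx, zy)]
    explained_ratio = (1.0 + abs(r)) / 2.0
    return explained_ratio, pc1, pc2


# Hypothetical raw metrics for six modules.
lines_of_code = [120, 340, 560, 800, 1500, 2300]
complexity = [4, 9, 15, 22, 41, 60]
ratio, pc1, pc2 = principal_components_2d(lines_of_code, complexity)
print(round(ratio, 3))
```

Because the two component scores are uncorrelated by construction, a model fitted to them avoids the instability that highly correlated raw inputs can cause.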

Member of

FAU Theses and Dissertations

Classification of software quality using tree modeling with the S-Plus algorithm

Model

Digital Document

Deng, Jianyu.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

In today's competitive environment for software products, quality has become an increasingly important asset to software development organizations. Software quality models are tools for focusing efforts to find faults early in development. Delaying corrections can lead to higher costs. In this research, the classification tree modeling technique was used to predict software quality by classifying program modules as either fault-prone or not fault-prone. The S-Plus regression tree algorithm and a general classification rule were applied to yield classification tree models. Two classification tree models were developed based on four consecutive releases of a very large legacy telecommunications system. The first release was used as the training data set and the subsequent three releases were used as evaluation data sets. The first model used twenty-four product metrics and four execution metrics as candidate predictors. The second model added fourteen process metrics as candidate predictors.

Member of

FAU Theses and Dissertations

Tree-based classification models for analyzing a very large software system

Model

Digital Document

Bullard, Lofton A.

Khoshgoftaar, Taghi M.

Publisher

Florida Atlantic University

Description

Software systems that control military radar systems must be highly reliable. A fault can compromise safety and security, and even cause the death of military personnel. In this experiment we identify fault-prone software modules in a subsystem of a military radar system called the Joint Surveillance Target Attack Radar System (JSTARS). An earlier version was used in Operation Desert Storm to monitor ground movement. Product metrics were collected for different iterations of an operational prototype of the subsystem over a period of approximately three years. We used these metrics to train a decision tree model and to fit a discriminant model to classify each module as fault-prone or not fault-prone. The algorithm used to generate the decision tree model was TREEDISC, developed by the SAS Institute. The decision tree model is compared to the discriminant model.

Member of

FAU Theses and Dissertations

Improved neural net-based approach for predicting software quality

Model

Digital Document

Guasti, Peter John.

Khoshgoftaar, Taghi M.

Pandya, Abhijit S.

Publisher

Florida Atlantic University

Description

Accurately predicting the quality of software is a major problem in any software development project. Software engineers develop models that provide early estimates of quality metrics, which allow them to take action against emerging quality problems. Most often the predictive models are based upon multiple regression analysis, which becomes unstable when certain data assumptions are not met. Since neural networks require no data assumptions, they are more appropriate for predicting software quality. This study proposes an improved neural network architecture that significantly outperforms multiple regression and other neural network attempts at modeling software quality. This is demonstrated by applying this approach to several large commercial software systems. After developing neural network models, we develop regression models on the same data. We find that the neural network models surpass the regression models in terms of predictive quality on the data sets considered.

Member of

FAU Theses and Dissertations
