Analysis of machine learning algorithms on bioinformatics data of varying quality

Publisher
Florida Atlantic University
Date Issued
2015
EDTF Date Created
2015
Description
One of the main applications of machine learning in bioinformatics is the construction of classification models that can accurately classify new instances using information gained from previous instances. With the help of machine learning algorithms (such as supervised classification and gene selection), new meaningful knowledge can be extracted from bioinformatics datasets, which can help in disease diagnosis and prognosis as well as in prescribing the right treatment for a disease. One particular challenge encountered when analyzing bioinformatics datasets is data noise, which refers to incorrect or missing values in datasets. Noise can be introduced as a result of experimental errors (e.g., faulty microarray chips, insufficient resolution, image corruption, and incorrect laboratory procedures), as well as other errors (errors during data processing, transfer, and/or mining). A special type of data noise is class noise, which occurs when an instance/example is mislabeled. Previous research showed that class noise has a detrimental impact on machine learning algorithms (e.g., worsened classification performance and unstable feature selection). In addition to data noise, gene expression datasets can suffer from the problems of high dimensionality (a very large feature space) and class imbalance (unequal distribution of instances between classes). As a result of these inherent problems, constructing accurate classification models becomes more challenging.
Note

Includes bibliography.

Language
Type
Extent
154 p.
Identifier
FA00004425
Additional Information
Dissertation (Ph.D.)--Florida Atlantic University, 2015.
FAU Electronic Theses and Dissertations Collection
Extension


FAU

Person Preferred Name

Shanab, Ahmad Abu

author

Graduate College
Physical Description

application/pdf
154 p.
Use and Reproduction
Copyright © is held by the author, with permission granted to Florida Atlantic University to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
http://rightsstatements.org/vocab/InC/1.0/
Origin Information

2015
2015
Florida Atlantic University

Boca Raton, Fla.

Physical Location
Florida Atlantic University Libraries
Sub Location
Digital Library