DNA microarrays

Model
Digital Document
Publisher
Florida Atlantic University
Description
This research analyzes a set of viral genomes to elucidate their underlying characteristics and to determine the information-theoretic aspects of their genomic signatures. The study addresses the following: (i) reviewing the methods available for deducing the features and characteristics of genomic sequences of organisms in general, with particular focus on viral genomes; (ii) applying information-theoretic concepts (entropy principles) to analyze genomic sequences; (iii) considering aspects of biothermodynamic energetics to determine the framework and architecture that decide the stability and patterns of the subsequences in a genome; (iv) evaluating genomic details using spectral-domain techniques; (v) studying fuzzy considerations to ascertain overlapping details in genomic sequences; (vi) determining the common subsequences among various strains of a virus by applying logistic regression to the data obtained from the entropic, energetics and spectral-domain exercises; (vii) differentiating the informational profiles of coding and non-coding regions in a DNA sequence to locate aberrant (cryptic) attributes that evolved as a result of mutational changes; and (viii) finding the signatures of the coding sequences (CDS) of viral-strain genomes toward rationally conceiving plausible vaccine designs. Commensurate with these topics, the necessary simulations are proposed and computational exercises are performed (with MATLAB R2009b and other software as needed). Extensive data gathered from the open literature are used, and the simulation results are verified. Lastly, the results are discussed, inferences are made and open questions are identified for future research.
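As an illustration of item (ii), the entropy of a nucleotide sequence can be sketched as follows. This is a minimal example of Shannon entropy over symbol frequencies, not the thesis's own procedure; the function names `shannon_entropy` and `windowed_entropy`, the window size and the toy sequence are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy, in bits per symbol, of the symbol frequencies in seq."""
    if not seq:
        return 0.0
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def windowed_entropy(seq: str, window: int, step: int = 1) -> list:
    """Entropy of each length-`window` slice, moving `step` symbols at a time."""
    return [
        shannon_entropy(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "AAAACGT"  # hypothetical sequence, not real genomic data
    print(windowed_entropy(toy, window=4, step=3))
```

A window containing a single repeated base scores 0 bits, while a window in which all four bases are equally frequent scores the maximum of 2 bits, so a sliding-window profile gives a rough view of how compositional complexity varies along a sequence.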
Model
Digital Document
Publisher
Florida Atlantic University
Description
This thesis addresses assaying the extent of local features in 2D images for the purpose of recognition and classification. The method compares a test image against a template in binary format. This bioinformatics-inspired approach is pursued and presented through the deliverables summarized below:
1. By applying the Smith-Waterman (SW) local alignment and Needleman-Wunsch (NW) global alignment approaches of bioinformatics, a test 2D image in binary format is compared against a reference image so as to recognize the differential features that reside locally in the images being compared.
2. The SW- and NW-based binary comparison involves extending the one-dimensional sequence alignment procedure (traditionally used for molecular sequence comparison in bioinformatics) to a 2D image matrix.
3. The relevant algorithms are implemented as MATLAB code.
4. The test images considered are real-world biological and medical images, synthetic images, microarrays, biometric fingerprints (thumb impressions) and handwritten signatures.
Based on the results, conclusions are enumerated and inferences are made, with directions for future studies.
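The Smith-Waterman scoring named above can be sketched for binary data as follows. The abstract does not say how the thesis extends the one-dimensional procedure to a 2D matrix, so this example simply aligns corresponding rows of two binary images, which is an assumption; the scoring values (match 2, mismatch -1, linear gap -1) and the function names are likewise hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b (linear gap cost)."""
    cols = len(b) + 1
    prev = [0] * cols  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rowwise_local_scores(test_img, ref_img):
    """Assumed 2D extension: score each test row against the matching reference row."""
    return [smith_waterman_score(t, r) for t, r in zip(test_img, ref_img)]


if __name__ == "__main__":
    reference = [[1, 0, 1, 1], [0, 1, 1, 0]]  # toy binary images
    test = [[1, 0, 1, 1], [1, 1, 0, 0]]
    print(rowwise_local_scores(test, reference))
```

Rows that score well below the maximum for their length point to where the test image departs locally from the reference. A Needleman-Wunsch variant would drop the zero floor and initialise the first row and column with cumulative gap penalties.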
Model
Digital Document
Publisher
Florida Atlantic University
Description
Transcription factors are macromolecules that take part in transcriptional regulation by interacting with specific DNA regions, and they can activate or silence their target genes. Gene regulation by transcriptional control explains different biological processes such as development, function, and disease. Even though transcriptional control has been of great interest to molecular biology, much remains unknown. This study was designed to generate the most current list of human transcription factor genes. Unique entries of transcription factor genes were collected and entered into a Microsoft Office Access 2007 database along with information about each gene. The Access tools were used to analyze and group the collected entries according to properties such as an activator or repressor record, or the presence of certain protein domains. Furthermore, protein sequence alignments of members of the different groups were performed, and phylogenetic trees were used to analyze the relationships among the members of each group. This work contributes to the existing knowledge of transcriptional regulation in humans.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Microarray expression data, which contain the expression levels of a large number of simultaneously observed genes, have been used in many scientific research and clinical studies. Because of the high dimensionality of such data, selecting a small number of genes has been shown to be beneficial for many tasks, such as building prediction models from the microarray expression data or discovering gene regulatory networks. Traditional gene selection methods, however, fail to take the class distribution into account in the selection process. In biomedical science, it is very common to have microarray expression data that are severely biased, with one class of examples (e.g., diseased samples) far less numerous than the other classes (e.g., normal samples). Sample sets with such biased distributions require special attention from researchers when identifying the genes responsible for a particular disease. In this thesis, we propose three filtering techniques, Higher Weight ReliefF, ReliefF with Differential Minority Repeat and ReliefF with Balanced Minority Repeat, to identify genes responsible for fatal diseases from biased microarray expression data. Our solutions are evaluated on five well-known microarray datasets: Colon, Central Nervous System, DLBCL Tumor, Lymphoma and ECML Pancreas. Experimental comparisons with the traditional ReliefF filtering method demonstrate the effectiveness of the proposed methods in selecting informative genes from microarray expression data with biased sample distributions.
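For orientation, the kind of filter being modified can be sketched as follows. This is the basic two-class Relief update with one nearest hit and one nearest miss, not ReliefF and not any of the three proposed methods, whose details the abstract does not give. The optional `class_weight` argument is a hypothetical hook showing one way a minority class could be given more influence, and the toy data are invented.

```python
def relief_weights(X, y, class_weight=None):
    """Basic Relief feature weights; every sample is visited once (deterministic)."""
    n, d = len(X), len(X[0])
    class_weight = class_weight or {}  # hypothetical per-class multipliers
    # Feature ranges, used to normalise differences to [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def dist(a, b):
        return sum(abs(a[f] - b[f]) / spans[f] for f in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        scale = class_weight.get(y[i], 1.0) / n
        for f in range(d):
            # Reward features that differ on the miss and agree on the hit.
            diff_miss = abs(X[i][f] - X[miss][f]) / spans[f]
            diff_hit = abs(X[i][f] - X[hit][f]) / spans[f]
            w[f] += scale * (diff_miss - diff_hit)
    return w


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is noise.
    X = [[0.0, 0.0], [0.1, 1.0], [0.9, 0.0], [1.0, 1.0], [1.0, 0.5]]
    y = ["disease", "disease", "normal", "normal", "normal"]
    print(relief_weights(X, y))
    print(relief_weights(X, y, class_weight={"disease": 3.0}))
```

Genes would then be ranked by weight and the top few kept. In an unweighted pass the majority class contributes most of the updates, which is the imbalance that the abstract's proposed variants are designed to address.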