Model
Digital Document
Publisher
Florida Atlantic University
Description
Genomics has been revolutionized by improved sequencing technologies, allowing for the detailed exploration of microbial dark matter and complex microscopic ecosystems. The bottleneck in genomic workflows has shifted from high-throughput sequencing to data analysis. This dissertation developed the Florida Center for Coastal and Human Health Shotgun Metagenomics Workflow (FCHsm), which is easy to use and to tailor to unique datasets. This work serves as a beta test of the workflow, analyzing disparate biomes (environmental and host microbiomes) at varying sequencing depths (shallow and deep). FCHsm was used to resolve molecular dynamics and mine trans-kingdom metagenomes for secondary metabolic biosynthetic gene clusters (BGCs) in two marine environments: Indian River Lagoon toxic harmful algal blooms (IRL HABs) and the medicinal Leiodermatium sponge holobiont.
First, an in silico mock dataset was analyzed to benchmark the FCHsm workflow. Sourmash, coupled with the Genome Taxonomy Database, outperformed the other taxonomic profilers by accurately predicting the size of the mock metagenome (450 genomes) and recalling the highest proportion of species (82%) and strains (44%). Nonpareil calculated the sequencing effort needed for 100% coverage for all the datasets and correctly estimated the 75 Gbp of sequencing needed for nearly complete coverage (99.5%) of the mock metagenomes. Next, the trans-kingdom metagenomes of the IRL were explored, and potential HAB biomarkers were identified.
Member of