Heredia, Brian

Relationships
Member of: Graduate College
Person Preferred Name
Heredia, Brian
Model
Digital Document
Publisher
Florida Atlantic University
Description
One of the defining characteristics of the modern Internet is its massive connectedness,
with information and human connection simply a few clicks away. Social
media and online retailers have revolutionized how we communicate and purchase
goods or services. User generated content on the web, through social media, plays
a large role in modern society; Twitter has been in the forefront of political discourse,
with politicians choosing it as their platform for disseminating information,
while websites like Amazon and Yelp allow users to share their opinions on products
via online reviews. The information available through these platforms can provide
insight into a host of relevant topics through the process of machine learning. Specifically,
this process involves text mining for sentiment analysis, which is an application
domain of machine learning involving the extraction of emotion from text.
Unfortunately, there are still those with malicious intent, and with the changes
to how we communicate and conduct business come changes to their malicious practices.
Social bots and fake reviews plague the web, providing incorrect information
and swaying the opinion of unaware readers. The detection of these false users or
posts from reading the text is difficult, if not impossible, for humans. Fortunately,
text mining provides us with methods for the detection of harmful user generated
content.
This dissertation expands the current research in sentiment analysis, fake online
review detection and election prediction. We examine cross-domain sentiment
analysis using tweets and reviews. Novel techniques combining ensemble and feature
selection methods are proposed for the domain of online spam review detection. We
investigate the ability of the Twitter platform to predict the 2016 United States presidential
election. In addition, we determine how social bots influence this prediction.
Model
Digital Document
Publisher
Florida Atlantic University
Description
In recent years more and more researchers have begun to use data mining and
machine learning tools to analyze gene microarray data. In this thesis we have collected a
selection of datasets revolving around prediction of patient response in the specific area
of breast cancer treatment. The datasets collected in this thesis are all obtained from gene
chips, which have become the industry standard in measurement of gene expression. In
this thesis we will discuss the methods and procedures used in the studies to analyze the
datasets and their effects on treatment prediction with a particular interest in the selection
of genes for predicting patient response. We will also analyze the datasets on our own in
a uniform manner to determine the validity of these datasets in terms of learning potential
and provide strategies for future work which explore how to best identify gene signatures.