Algorithms.

Model: Digital Document
Publisher: Florida Atlantic University
Description:
The Louisiana coastal ecosystem is experiencing increasing threats from human flood-control construction, sea-level rise (SLR), and subsidence. Louisiana lost about 4,833 km² of coastal wetlands from 1932 to 2016, and it is uncertain whether the remaining wetlands will persist while facing the highest rate of relative sea-level rise (RSLR) in the world. Restoration aimed at mitigating ongoing and future disturbances is currently underway through the implementation of the Coastal Wetlands Planning, Protection and Restoration Act of 1990 (CWPPRA). To effectively monitor the progress of CWPPRA projects, the Coastwide Reference Monitoring System (CRMS) was established in 2006. To date, more than a decade of valuable coastal, environmental, and ground-elevation data have been collected and archived. This dataset offers a unique opportunity to evaluate wetland ground-elevation dynamics by linking Rod Surface Elevation Table (RSET) measurements with environmental variables such as water salinity and biophysical variables such as canopy coverage. This dissertation research examined the effects of environmental and biophysical variables on wetland terrain elevation by developing innovative machine-learning-based models to quantify the contribution of each factor using the CRMS dataset. Three modern machine learning algorithms, Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN), were assessed and cross-compared with the commonly used Multiple Linear Regression (MLR). The results showed that RF had the best performance in modeling ground elevation, with a Root Mean Square Error (RMSE) of 10.8 cm and a correlation coefficient (r) of 0.74. The top four factors contributing to ground elevation are the distance from the monitoring station to the closest water source, water salinity, water elevation, and dominant vegetation height.
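
As a rough illustration of the model comparison described above, the sketch below fits a Random Forest and an MLR baseline with scikit-learn and reports RMSE and Pearson's r. This is a minimal sketch, not the dissertation's actual pipeline; the synthetic data and variable roles are illustrative assumptions standing in for the CRMS predictors.

    # A minimal sketch, not the dissertation's pipeline: compare Random Forest
    # against a multiple linear regression baseline on stand-ins for the CRMS
    # predictors (all names and data below are illustrative assumptions).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 500
    X = np.column_stack([
        rng.uniform(0, 5000, n),   # stand-in: distance to closest water source (m)
        rng.uniform(0, 30, n),     # stand-in: water salinity (ppt)
        rng.uniform(-50, 50, n),   # stand-in: water elevation (cm)
        rng.uniform(0, 300, n),    # stand-in: dominant vegetation height (cm)
    ])
    y = 0.01 * X[:, 0] - 0.5 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(0, 10, n)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    for name, model in [("MLR", LinearRegression()),
                        ("RF", RandomForestRegressor(n_estimators=300, random_state=0))]:
        pred = model.fit(X_tr, y_tr).predict(X_te)
        rmse = mean_squared_error(y_te, pred) ** 0.5
        r = np.corrcoef(y_te, pred)[0, 1]  # Pearson correlation coefficient
        print(f"{name}: RMSE = {rmse:.1f} cm, r = {r:.2f}")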
Model: Digital Document
Publisher: Florida Atlantic University
Description:
Data comes in many different shapes and sizes. In real-life applications it is common that the data we are studying has features of varied data types, including numerical, categorical, and text. To model such data with machine learning algorithms, the data typically must be in numeric form. Data that is not originally numerical must therefore be transformed before it can be used as input to these algorithms.
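
A minimal sketch of this transformation, under the assumption of a small mixed-type table (the toy columns below are illustrative, not data from the dissertation):

    # Turn mixed-type data into a purely numeric matrix via one-hot encoding.
    import pandas as pd

    df = pd.DataFrame({
        "age": [23, 41, 35],              # numerical: already usable
        "color": ["red", "blue", "red"],  # categorical: must be encoded
        "note": ["good", "bad", "good"],  # text-like: encoded here as categories
    })
    X = pd.get_dummies(df, columns=["color", "note"])  # one-hot encode non-numeric
    print(X)  # fully numeric matrix, ready for machine learning algorithms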
Along with this transformation, it is common that the data we study has many features relative to the number of samples. It is often desirable to reduce the number of features used to train a model, to eliminate noise and reduce training time. This problem of high dimensionality can be approached through feature selection, feature extraction, or feature embedding. Feature selection seeks to identify the most essential variables in a dataset, leading to a parsimonious model and high-performing results, while feature extraction and embedding are techniques that apply a mathematical transformation of the data into a representation space. As a byproduct of using a new representation, we are able to reduce the dimension greatly without sacrificing performance. Oftentimes, by using embedded features we observe a gain in performance.
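
The sketch below contrasts the two approaches on a standard dataset; the digits dataset and the target dimension of 16 are illustrative assumptions, not one of the 28 benchmarks studied here.

    # Feature selection keeps a subset of the original columns; feature
    # extraction (PCA) derives new columns from all of them.
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest, chi2

    X, y = load_digits(return_X_y=True)                    # 64 original features
    X_sel = SelectKBest(chi2, k=16).fit_transform(X, y)    # keep 16 originals
    X_pca = PCA(n_components=16).fit_transform(X)          # 16 derived components
    print(X.shape, X_sel.shape, X_pca.shape)  # (1797, 64) (1797, 16) (1797, 16)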
Though extraction and embedding methods may be powerful for isolated machine learning problems, they do not always generalize well. We are therefore motivated to develop a methodology that can be applied to any data type with little pre-processing. The methods we develop can be applied in unsupervised, supervised, incremental, and deep learning contexts. Using 28 benchmark datasets spanning different data types as examples, we construct a framework that can be applied to general machine learning tasks.
The techniques we develop contribute to the field of dimension reduction and feature embedding. Using this framework, we make additional contributions to eigendecomposition by creating an objective matrix built from three vital components. The first is a class-partitioned row and feature product representation of one-hot encoded data. The second is the derivation of a weighted adjacency matrix based on class-label relationships. Finally, by taking the inner product of these values, we condition the one-hot encoded data generated from the original data prior to eigenvector decomposition. The use of class partitioning and adjacency enables subsequent projections of the data to be trained more effectively when compared side by side with baseline algorithm performance. Along with this improved performance, we can adjust the dimension of the resulting data arbitrarily. In addition, we show how these dense vectors may be used in applications to order the features of generic data for deep learning.
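
The abstract does not give the exact matrix constructions, so the sketch below is only a hypothetical reading of the three components: a per-class feature product, a crude adjacency weighting, and an eigendecomposition of the conditioned matrix. Every formula in it is an assumption, not the dissertation's algorithm.

    # Hypothetical sketch: condition a one-hot Gram matrix with class-derived
    # weights, then project onto its top eigenvectors.
    import numpy as np

    def embed(onehot, labels, dim):
        """Project one-hot data into `dim` dimensions via eigendecomposition."""
        n_feat = onehot.shape[1]
        # class-partitioned row/feature product: per-class feature co-occurrence,
        # normalized by class size so small classes are not drowned out
        P = np.zeros((n_feat, n_feat))
        for c in np.unique(labels):
            rows = onehot[labels == c]
            P += rows.T @ rows / len(rows)
        W = P / P.max()                      # stand-in adjacency weighting in [0, 1]
        M = (onehot.T @ onehot) * W          # elementwise product keeps symmetry
        vals, vecs = np.linalg.eigh(M)
        basis = vecs[:, np.argsort(vals)[::-1][:dim]]  # top-`dim` eigenvectors
        return onehot @ basis                # dense low-dimensional embedding

    rng = np.random.default_rng(0)
    onehot = (rng.random((100, 30)) < 0.3).astype(float)  # toy one-hot-like data
    labels = rng.integers(0, 3, 100)
    print(embed(onehot, labels, dim=8).shape)  # (100, 8)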
In this dissertation, we examine a general approach to dimension reduction and feature embedding that utilizes a class-partitioned row and feature representation, a weighted approach to instance similarity, and an adjacency representation. This general approach has applications to unsupervised, supervised, online, and deep learning. In our experiments on 28 benchmark datasets, we show significant performance gains in clustering, classification, and training time.
Model: Digital Document
Publisher: Florida Atlantic University
Description:
Today, drones are receiving considerable attention from commercial businesses. Businesses (mainly companies with delivery services) are trying to expand their productivity in order to bring more satisfaction to their loyal customers. One way companies can expand their delivery services is through the use of delivery drones. Drones are very powerful devices that have gone through many evolutionary changes over the years. For many years, researchers in academia have examined how drones can plan their paths while avoiding collisions with other drones and with obstacles in the civil airspace. However, researchers have not considered how motion path planning can affect the overall scheduling of civilian drones. In this thesis, we propose an algorithm for collision-free scheduling and motion path planning of a set of drones such that they avoid certain obstacles while maintaining a safety distance from each other.
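
The thesis algorithm itself is not detailed in this abstract; the sketch below only illustrates the safety-distance constraint it enforces, under the assumption that drone paths are time-synchronized waypoint lists. All names and values are illustrative.

    # Check that every pair of drones stays at least d_safe apart at each
    # shared time step (an illustrative constraint check, not the thesis algorithm).
    import itertools
    import math

    def paths_conflict(paths, d_safe=5.0):
        """paths: one waypoint list [(x, y, z), ...] per drone, indexed by time step."""
        for a, b in itertools.combinations(paths, 2):
            for pa, pb in zip(a, b):  # compare positions at the same time step
                if math.dist(pa, pb) < d_safe:
                    return True
        return False

    p1 = [(0, 0, 10), (5, 0, 10), (10, 0, 10)]
    p2 = [(10, 0, 10), (5, 0, 12), (0, 0, 10)]
    print(paths_conflict([p1, p2]))  # True: the drones pass within 5 m at step 1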
Model: Digital Document
Publisher: Florida Atlantic University
Description:
A traditional machine learning environment is characterized by the training and testing data being drawn from the same domain and therefore having similar distribution characteristics. In contrast, a transfer learning environment is characterized by the training data having different distribution characteristics from the testing data. Previous research on transfer learning has focused on the development and evaluation of transfer learning algorithms using real-world datasets. Testing with real-world datasets exposes an algorithm to a limited number of data distribution differences and does not exercise an algorithm's full capability and boundary limitations. In this research, we define, implement, and deploy a transfer learning test framework to test machine learning algorithms. The transfer learning test framework is designed to create a wide range of distribution differences that are typically encountered in a transfer learning environment. By testing with many different distribution differences, an algorithm's strong and weak points can be discovered and evaluated against other algorithms.
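
One distortion of the kind the framework creates is a class-imbalance shift between the source (training) and target (testing) domains. The resampling scheme below is an illustrative assumption, not the framework's actual implementation.

    # Induce a class-imbalance distribution difference by down-sampling one
    # class in the source data while the target keeps the original ratio.
    import numpy as np

    def imbalance_source(X, y, minority_class, keep_fraction, seed=0):
        """Keep only a fraction of one class to distort the source distribution."""
        rng = np.random.default_rng(seed)
        idx = np.arange(len(y))
        minority = idx[y == minority_class]
        kept = rng.choice(minority, int(len(minority) * keep_fraction), replace=False)
        selected = np.sort(np.concatenate([idx[y != minority_class], kept]))
        return X[selected], y[selected]

    X = np.random.rand(1000, 5)
    y = np.random.randint(0, 2, 1000)
    X_src, y_src = imbalance_source(X, y, minority_class=1, keep_fraction=0.1)
    print(np.bincount(y), np.bincount(y_src))  # target vs. distorted source ratios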
This research additionally performs case studies that use the transfer learning test framework. The first case study focuses on measuring the impact of exposing algorithms to the Domain Class Imbalance distortion profile. The next case study uses the entire transfer learning test framework to evaluate both transfer learning and traditional machine learning algorithms. The final case study uses the transfer learning test framework in conjunction with real-world datasets to measure the impact of the base traditional learner on the performance of transfer learning algorithms. Two additional experiments are performed that focus on unique real-world datasets. The first experiment uses transfer learning techniques to predict fraudulent Medicare claims. The second experiment uses a heterogeneous transfer learning method to predict phishing webpages. These case studies will be of interest to researchers who develop and improve transfer learning algorithms. This research will also benefit machine learning practitioners in the selection of high-performing transfer learning algorithms.
Model: Digital Document
Publisher: Florida Atlantic University
Description:
Ban and Kalies [3] proposed an algorithmic approach to compute attractor-repeller pairs and weak Lyapunov functions based on a combinatorial multivalued mapping derived from an underlying dynamical system generated by a continuous map. We propose a more efficient way of computing a Lyapunov function for a Morse decomposition. This work, combined with that of other authors, including Shaun Harker, Arnoud Goulet, and Konstantin Mischaikow, implements several techniques that make the process of finding a global Lyapunov function for a Morse decomposition very efficient. One of them is to utilize highly memory-efficient data structures: the succinct grid and pointer grid data structures. Another technique is to utilize Dijkstra's algorithm with Manhattan distance to calculate a distance potential, an essential step in computing a Lyapunov function. Finally, another major technique for achieving a significant improvement in efficiency is the utilization of the lattice structures of the attractors and attracting neighborhoods, as explained in [32]. The lattice structures make it possible to incorporate only the join-irreducible attractor-repeller pairs in computing a Lyapunov function, rather than having to use all possible attractor-repeller pairs as was originally done in [3].

The distributive lattice structures of attractors and repellers in a dynamical system allow for a general algebraic treatment of global gradient-like dynamics. The separation of these algebraic structures from the underlying topological structure is the basis for the development of algorithms to manipulate those structures [32, 31]. There has been much recent work on developing and implementing general computational algorithms for global dynamics which are capable of computing attracting neighborhoods efficiently. We describe the lifting of sublattices of attractors, which are computationally less accessible, to lattices of forward-invariant sets and attracting neighborhoods, which are computationally accessible. We provide necessary and sufficient conditions for such a lift to exist in a general setting. We also provide algorithms to check whether such conditions are met and to construct the lift when they are. We illustrate the algorithms with examples, checking and verifying them by implementation on several non-invertible dynamical systems, including a nonlinear Leslie model.
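
The distance-potential step can be sketched as Dijkstra's algorithm over a grid of boxes with Manhattan-distance edge weights. The plain-dictionary grid below is an illustrative assumption; the actual implementation relies on the succinct and pointer grid data structures for memory efficiency.

    # Distance from every grid cell to the nearest source cell (e.g., cells
    # covering an attractor), computed with Dijkstra's algorithm.
    import heapq

    def distance_potential(cells, sources, neighbors):
        dist = {c: float("inf") for c in cells}
        heap = [(0, s) for s in sources]
        for s in sources:
            dist[s] = 0
        heapq.heapify(heap)
        while heap:
            d, c = heapq.heappop(heap)
            if d > dist[c]:
                continue  # stale queue entry
            for n in neighbors(c):
                w = abs(n[0] - c[0]) + abs(n[1] - c[1])  # Manhattan edge weight
                if d + w < dist[n]:
                    dist[n] = d + w
                    heapq.heappush(heap, (d + w, n))
        return dist

    # 4x4 grid of unit cells, 4-connected; suppose the attractor occupies (0, 0)
    cells = {(i, j) for i in range(4) for j in range(4)}
    def nbrs(c):
        i, j = c
        return [n for n in ((i+1, j), (i-1, j), (i, j+1), (i, j-1)) if n in cells]
    print(distance_potential(cells, [(0, 0)], nbrs)[(3, 3)])  # 6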
Model: Digital Document
Publisher: Florida Atlantic University
Description:
Sentiment analysis of tweets is an application of Twitter mining that is growing in popularity as a means of determining public opinion. Machine learning algorithms are used to perform sentiment analysis; however, data quality issues such as high dimensionality, class imbalance, or noise may negatively impact classifier performance. Machine learning techniques exist for targeting these problems, but they have not been applied to this domain or have not been studied in detail. In this thesis we discuss research that has been conducted on tweet sentiment classification, its accompanying data quality concerns, and methods of addressing those concerns. We test the impact of feature selection, data sampling, and ensemble techniques in an effort to improve classifier performance. We also evaluate the combination of feature selection and ensemble techniques and examine the effects of high dimensionality when combining multiple types of features. Additionally, we provide strategies and insights for potential avenues of future work.
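
One combination of the kind evaluated here can be sketched as feature selection (chi-squared ranking) feeding an ensemble classifier over bag-of-words tweet features. The four toy tweets and parameter choices are illustrative assumptions, not the thesis's experimental setup.

    # Pipeline: bag-of-words features -> chi-squared feature selection ->
    # Random Forest ensemble classifier.
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_selection import SelectKBest, chi2
    from sklearn.pipeline import make_pipeline

    tweets = ["love this phone", "worst service ever", "great day", "so bad"]
    labels = [1, 0, 1, 0]  # 1 = positive sentiment, 0 = negative

    clf = make_pipeline(
        CountVectorizer(),       # high-dimensional bag-of-words features
        SelectKBest(chi2, k=4),  # keep only the highest-ranked terms
        RandomForestClassifier(n_estimators=100, random_state=0),  # ensemble
    )
    clf.fit(tweets, labels)
    print(clf.predict(["what a great phone"]))  # predicted sentiment label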