Algorithms

Model
Digital Document
Publisher
Florida Atlantic University
Description
Recent technological developments have enabled the rapid production of big data and allowed machine learning algorithms to build high-performance models from such data. Nonetheless, class imbalance (in binary classification) between the majority and minority classes in big data can skew the predictive performance of classification algorithms toward the majority (negative) class, whereas the minority (positive) class usually holds greater value for decision makers. Such bias may lead to adverse consequences, some of them even life-threatening, because false negatives are generally costlier than false positives. The size of the minority class can vary from fair to extraordinarily small, which can lead to different performance scores for machine learning algorithms. Class imbalance is a well-studied area for traditional data, i.e., not big data. However, there is limited research focusing on both rarity and severe class imbalance in big data.
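To make the imbalance problem concrete, the following is a minimal sketch of random undersampling, one common mitigation for a severely skewed binary class distribution. The class labels, ratio, and use of plain NumPy are illustrative assumptions, not the specific methods evaluated in the dissertation.

    import numpy as np

    def random_undersample(X, y, rng=None):
        """Downsample the majority (negative) class to the minority size."""
        rng = np.random.default_rng(rng)
        pos = np.flatnonzero(y == 1)          # minority (positive) class
        neg = np.flatnonzero(y == 0)          # majority (negative) class
        keep_neg = rng.choice(neg, size=len(pos), replace=False)
        idx = np.concatenate([pos, keep_neg])
        rng.shuffle(idx)
        return X[idx], y[idx]

    # Example: 100,000 negatives vs. 100 positives (~0.1% positive class)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100_100, 5))
    y = np.concatenate([np.zeros(100_000, int), np.ones(100, int)])
    X_bal, y_bal = random_undersample(X, y, rng=0)
    print(y_bal.mean())  # 0.5 after balancing

Undersampling discards majority-class information in exchange for a balanced training set; at the extreme rarity levels discussed above, that trade-off is exactly what makes the choice of sampling method consequential.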
Model
Digital Document
Publisher
Florida Atlantic University
Description
Quantum cryptography offers a rich source of current and future research. The idea originated in the early 1970s and continues to inspire work toward a widely pursued goal: large-scale communication networks whose strong security guarantees rest on quantum-mechanical properties. Quantum cryptography builds on the idea of exploiting physical properties to establish secure cryptographic operations. One quantum-based protocol has gathered particular interest in recent years for its use of mesoscopic coherent states.
The AlphaEta protocol was designed to exploit properties of coherent states of light to transmit data securely over an optical channel. AlphaEta draws its security from the uncertainty, due to intrinsic quantum noise, of any measurement of the transmitted coherent states. We propose a framework that combines this protocol with classical preprocessing, taking into account error correction for the optical channel and establishing a strong provable security guarantee. Integrating a state-of-the-art solution for fast authenticated encryption is straightforward, but in that case the security analysis requires heuristic reasoning.
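The following is a toy, purely classical simulation of the masking intuition: a bit is encoded on one of many closely spaced phases selected by a shared key, and measurement noise (standing in for intrinsic quantum noise) spans several neighboring phase slots for an observer who lacks the key. The number of bases, the noise level, and the encoding rule here are illustrative assumptions, not the actual AlphaEta specification.

    import numpy as np

    M = 128                     # number of bases (assumed for illustration)
    sigma = 0.15                # phase noise std. dev. in radians (assumed)
    rng = np.random.default_rng(1)

    key_basis = rng.integers(M)                  # shared secret basis index
    bit = 1
    phase = (2 * np.pi / (2 * M)) * (2 * key_basis + bit)   # encoded phase
    measured = phase + rng.normal(0, sigma)                 # noisy measurement

    # Without the key, resolving a slot of width pi/M is required,
    # but the noise covers many adjacent slots.
    slot = np.pi / M
    print(f"slot width = {slot:.4f} rad, noise sigma = {sigma} rad")
    print(f"noise spans ~{sigma / slot:.1f} slots")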
Model
Digital Document
Publisher
Florida Atlantic University
Description
It was not until the 20th century that combinatorial design theory was studied as a formal subject. The field has many applications, for example in statistical experimental design, coding theory, authentication codes, and cryptography. Major approaches to the problem of discovering new t-designs rely on (i) the construction of large sets of t-designs, (ii) prescribed automorphism groups, and (iii) recursive construction methods. In 2017 and 2018, Tran Van Trung introduced new recursive techniques to construct t-(v, k, λ) designs. These methods are purely combinatorial in nature and require "ingredient" t-designs or resolutions whose parameters satisfy a system of non-linear equations. Even after restricting the range of parameters in this new method, the task is computationally intractable. In this work, we enhance Tran Van Trung's "Basic Construction" with a robust and efficient hybrid computational apparatus that enables us to construct hundreds of thousands of new t-(v, k, λ) designs from previously known ingredient designs. Toward the end of the dissertation we also create a new family of 2-resolutions, which will be infinite if there are infinitely many Sophie Germain primes.
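As a small illustration of the parameter constraints such searches must respect, the sketch below checks the standard necessary divisibility conditions for t-(v, k, λ) parameters: λ·C(v−i, t−i)/C(k−i, t−i) must be an integer for 0 ≤ i ≤ t. This only filters candidate parameter sets; it is not Tran Van Trung's recursive construction.

    from math import comb

    def admissible(t, v, k, lam):
        """Check the necessary divisibility conditions for t-(v, k, lam)."""
        return all(
            (lam * comb(v - i, t - i)) % comb(k - i, t - i) == 0
            for i in range(t + 1)
        )

    print(admissible(2, 7, 3, 1))   # True: the Fano plane, a 2-(7, 3, 1) design
    print(admissible(2, 8, 3, 1))   # False: 28/3 blocks would be required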
Model
Digital Document
Publisher
Florida Atlantic University
Description
Healthcare is an integral component of people's lives, especially for the rising elderly population, and it must be affordable. The United States Medicare program is vital in serving the needs of the elderly. The growing number of people enrolled in the Medicare program, along with the enormous volume of money involved, increases the appeal for, and risk of, fraudulent activities. For many real-world applications, including Medicare fraud, the interesting observations tend to be less frequent than the normative observations. This difference between the normal observations and those of interest can create highly imbalanced datasets. The problem of class imbalance, including the classification of rare cases that indicate extreme class imbalance, is an important and well-studied area in machine learning. Research on the effects of class imbalance with big data in the real-world Medicare fraud application domain, however, is limited. In particular, detecting fraud in Medicare claims is critical to lessening the financial and personal impacts of these transgressions. Fortunately, the healthcare domain is one area where the successful detection of fraud can garner meaningful positive results. The application of machine learning techniques, together with methods that mitigate the adverse effects of class imbalance and rarity, can be used to detect fraud and lessen the impacts for all Medicare beneficiaries. This dissertation presents the application of machine learning approaches to detect Medicare provider claims fraud in the United States. We discuss novel techniques to process three big Medicare datasets and create a new, combined dataset, which includes mapping fraud labels associated with known excluded providers. We investigate the ability of machine learning techniques, unsupervised and supervised, to detect Medicare claims fraud, and we leverage data sampling methods to lessen the impact of class imbalance and increase fraud detection performance. Additionally, we extend the study of class imbalance to assess the impacts of rare cases in big data for Medicare fraud detection.
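The label-mapping step described above might look roughly like the hedged sketch below: join provider utilization records to the exclusion list by provider identifier (NPI) and flag matched providers as fraudulent. The column names and toy data are assumptions for illustration; the dissertation's actual mapping and its handling of exclusion dates may differ.

    import pandas as pd

    # Toy stand-ins for the real utilization and LEIE files (columns assumed)
    utilization = pd.DataFrame({
        "npi": [1003000126, 1003000134, 1003000142],
        "total_payment": [52000.0, 1800.0, 970000.0],
    })
    leie = pd.DataFrame({"NPI": [1003000142, 0]})   # 0 = NPI unknown in LEIE

    # Flag providers whose NPI appears among known exclusions
    excluded = set(leie.loc[leie["NPI"] != 0, "NPI"])
    utilization["fraud_label"] = utilization["npi"].isin(excluded).astype(int)
    print(utilization)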
Model
Digital Document
Publisher
Florida Atlantic University
Description
Since the release of the CyberKnife Multileaf Collimator (CK-MLC), there has been constant concern about the realistic dose differences between plans computed with its early available Finite Size Pencil Beam (FSPB) algorithm and those computed with well-accepted industry algorithms such as the Monte Carlo (MC) dose algorithm. In this study, dose disparities between the FSPB and MC dose calculation algorithms were quantified for selected CK-MLC treatment plans. The dosimetry for the planning target volume (PTV) and major organs at risk (OAR) was compared by calculating normalized percentage deviations (Ndev) between the two algorithms. The FSPB algorithm is found to overestimate the D95 of the PTV relative to the MC algorithm by an average of 24.0% in detached lung cases and 15.0% in non-detached lung cases, which is attributed to the absence of heterogeneity correction in the FSPB algorithm. Average dose differences are 0.3% in intracranial cases and 0.9% in pancreas cases. Ndev for the D95 of the PTV ranges from 8.8% to 14.1% for CK-MLC lung treatment plans with small fields (SF ≤ 2 × 2 cm²), and Ndev ranges from 0.5% to 7.0% for OARs.
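The abstract does not define Ndev explicitly, so the sketch below assumes the natural reading: the percentage deviation of the FSPB dose metric relative to the MC value. Treat the definition and the numbers as illustrative, not as the study's exact formula.

    def n_dev(d_fspb: float, d_mc: float) -> float:
        """Percent deviation of an FSPB dose metric from the MC value (assumed)."""
        return 100.0 * (d_fspb - d_mc) / d_mc

    # Example: FSPB reporting a PTV D95 24% above the MC value
    print(n_dev(d_fspb=62.0, d_mc=50.0))   # 24.0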
Model
Digital Document
Publisher
Florida Atlantic University
Description
Elliptic curves have played a large role in modern cryptography. Most notably, the Elliptic Curve Digital Signature Algorithm (ECDSA) and the Elliptic Curve Diffie-Hellman (ECDH) key exchange algorithm are widely used in practice today for their efficiency and small key sizes. More recently, the Supersingular Isogeny-based Diffie-Hellman (SIDH) algorithm provides a method of exchanging keys which is conjectured to be secure in the post-quantum setting. For ECDSA and ECDH, efficient and secure algorithms for scalar multiplication of points are necessary for modern use of these protocols. Likewise, in SIDH it is necessary to be able to compute an isogeny from a given finite subgroup of an elliptic curve in a fast and secure fashion.
We therefore find strong motivation to study and improve the algorithms used in elliptic curve cryptography, and to develop new algorithms to be deployed within these protocols. In this thesis we design and develop d-MUL, a multidimensional scalar multiplication algorithm which is uniform in its operations and generalizes the well-known 1-dimensional Montgomery ladder addition chain and the 2-dimensional addition chain due to Daniel J. Bernstein. We analyze the construction and derive many optimizations, implement the algorithm in software, and prove many theoretical and practical results. In the final chapter of the thesis we analyze the operations carried out in the construction of an isogeny from a given subgroup, as performed in SIDH. We detail how to efficiently make use of parallel processing when constructing this isogeny.
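For context on the uniformity property mentioned above, here is a minimal sketch of the classical 1-dimensional Montgomery ladder that d-MUL generalizes. Every scalar bit triggers the same pair of group operations (one addition, one doubling), which is the source of the ladder's resistance to simple side-channel analysis. The group here is the integers under addition, standing in for an elliptic-curve group; this illustrates the ladder pattern, not the d-MUL algorithm itself.

    def montgomery_ladder(k: int, P: int) -> int:
        """Compute k*P with a fixed add/double pattern per scalar bit."""
        R0, R1 = 0, P                  # 0 plays the role of the group identity
        for bit in bin(k)[2:]:         # scan scalar bits left to right
            if bit == "0":
                R1 = R0 + R1           # add
                R0 = R0 + R0           # double
            else:
                R0 = R0 + R1           # add
                R1 = R1 + R1           # double
        return R0

    assert montgomery_ladder(1234567, 1) == 1234567
    print(montgomery_ladder(6, 5))     # 30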
Model
Digital Document
Publisher
Florida Atlantic University
Description
The population of people ages 65 and older has increased since the 1960s, and current estimates indicate it will double by 2060. Medicare is a federal health insurance program for people 65 or older in the United States. Medicare claims fraud and abuse is an ongoing issue that wastes a large amount of money every year, resulting in higher health care costs and taxes for everyone. In this study, an empirical evaluation of several unsupervised machine learning approaches is performed, indicating reasonable fraud detection results. We employ two unsupervised machine learning algorithms, Isolation Forest and Unsupervised Random Forest, which have not previously been used for the detection of fraud and abuse on Medicare data. Additionally, we implement three other machine learning methods previously applied to Medicare data: Local Outlier Factor, Autoencoder, and k-Nearest Neighbor. For our dataset, we combine the 2012 to 2015 Medicare provider utilization and payment data and add fraud labels from the List of Excluded Individuals/Entities (LEIE) database. Results show that Local Outlier Factor is the best model to use for Medicare fraud detection.
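A hedged sketch of this unsupervised setup appears below, using scikit-learn's IsolationForest and LocalOutlierFactor (two of the models named above) on synthetic data. The features, hyperparameters, and 1% flagging threshold are illustrative assumptions, not the study's configuration.

    import numpy as np
    from sklearn.ensemble import IsolationForest
    from sklearn.neighbors import LocalOutlierFactor

    rng = np.random.default_rng(42)
    X_normal = rng.normal(0, 1, size=(1000, 4))   # typical provider behavior
    X_fraud = rng.normal(4, 1, size=(10, 4))      # anomalous provider behavior
    X = np.vstack([X_normal, X_fraud])

    iso = IsolationForest(random_state=42).fit(X)
    iso_scores = -iso.score_samples(X)            # higher = more anomalous

    lof = LocalOutlierFactor(n_neighbors=20)
    lof.fit(X)
    lof_scores = -lof.negative_outlier_factor_    # higher = more anomalous

    # Flag the top 1% most anomalous observations from each model
    k = int(0.01 * len(X))
    print("IsolationForest flags:", np.argsort(iso_scores)[-k:])
    print("LOF flags:", np.argsort(lof_scores)[-k:])

Because no labels are used during fitting, the flagged indices can then be compared against the LEIE-derived fraud labels to evaluate each model.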
Model
Digital Document
Publisher
Florida Atlantic University
Description
The main goal of video coding algorithms is to achieve high compression efficiency while maintaining the quality of the compressed signal at the highest level. The human visual system is the ultimate receiver of the compressed signal and the final judge of its quality. This dissertation presents work toward an optimal video compression algorithm based on the characteristics of our visual system. By modeling phenomena such as backward temporal masking and motion masking, we developed algorithms that are implemented in state-of-the-art video encoders. The result of using our algorithms is visually lossless compression with improved efficiency, as verified by standard subjective quality and psychophysical tests. Savings in bitrate compared to the High Efficiency Video Coding (HEVC/H.265) reference implementation are up to 45%.
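As a toy illustration of the masking idea: regions with strong motion can tolerate coarser quantization because motion masking reduces the visibility of distortion there. The mapping below from per-block motion magnitude to a quantization parameter (QP) offset is an illustrative assumption, not the dissertation's perceptual model.

    import numpy as np

    def qp_offset_from_motion(motion_mag, base_qp=32, max_offset=6):
        """Raise QP (coarser quantization) in proportion to local motion."""
        norm = np.clip(motion_mag / motion_mag.max(), 0.0, 1.0)
        return base_qp + np.rint(max_offset * norm).astype(int)

    motion = np.array([0.5, 2.0, 8.0, 16.0])   # per-block motion magnitudes
    print(qp_offset_from_motion(motion))        # [32 33 35 38]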
Model
Digital Document
Publisher
Florida Atlantic University
Description
Fault-tolerant programming methods improve software reliability using the principles of design diversity and redundancy. Design diversity and redundancy, however, escalate the cost of software design and development. In this thesis, we study the reliability of hybrid fault-tolerant systems. Probability models based on fault trees are developed for the recovery block (RB), N-version programming (NVP), and hybrid schemes that combine RB and NVP. Two heuristic methods are developed to construct hybrid fault-tolerant systems under total cost constraints. The algorithms provide a systematic approach to the design of hybrid fault-tolerant systems.
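To ground one of the probability models mentioned above, the sketch below computes the reliability of an NVP system with n independent, identically reliable versions and a perfect majority voter. The independence assumption is a deliberate simplification; the thesis's fault-tree models are more detailed.

    from math import comb

    def nvp_reliability(n: int, r: float) -> float:
        """P(majority of n independent versions succeed); r = per-version reliability."""
        need = n // 2 + 1
        return sum(comb(n, k) * r**k * (1 - r) ** (n - k)
                   for k in range(need, n + 1))

    print(nvp_reliability(3, 0.9))   # 0.972: 3-version NVP beats a single version
    print(nvp_reliability(5, 0.9))   # 0.99144: more versions, higher reliability

The same combinatorial pattern extends to RB and hybrid schemes once the acceptance-test and voter failure probabilities are folded into the fault tree.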
Model
Digital Document
Publisher
Florida Atlantic University
Description
RSA cryptosystems with decryption exponent d less than N^0.292, for a given RSA modulus N, are vulnerable to an attack that utilizes modular polynomials and the LLL Basis Reduction Algorithm. This result, presented by Dan Boneh and Glenn Durfee in 1999, improves on the bound of N^0.25 established by Wiener in 1990. This thesis examines in detail the LLL Basis Reduction Algorithm and the attack on RSA as presented by Boneh and Durfee.
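For contrast with the lattice-based Boneh-Durfee attack, here is a minimal sketch of Wiener's N^0.25 continued-fraction attack that it improves upon: when d is small enough, some convergent k/d of e/N reveals the private exponent. The toy key below is an illustrative assumption chosen so that d falls below Wiener's bound.

    def convergents(a, b):
        """Yield the convergents h/k of the continued fraction of a/b."""
        h0, h1, k0, k1 = 0, 1, 1, 0
        while b:
            q, (a, b) = a // b, (b, a % b)
            h0, h1 = h1, q * h1 + h0
            k0, k1 = k1, q * k1 + k0
            yield h1, k1

    def wiener_attack(e, N):
        """Try each convergent k/d of e/N as a candidate private exponent d."""
        for k, d in convergents(e, N):
            if k and d and pow(pow(2, e, N), d, N) == 2:
                return d
        return None

    # Toy key: p = 101, q = 113, N = 11413, phi = 11200,
    # d = 3 (below N^0.25), e = d^{-1} mod phi = 7467.
    print(wiener_attack(7467, 11413))   # 3

Boneh and Durfee recast the same small-d structure as a small-root problem for a modular polynomial and apply LLL to the associated lattice, which is what pushes the bound from N^0.25 up to N^0.292.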