Neural networks (Computer science)

Model
Digital Document
Publisher
Florida Atlantic University
Description
Short-circuit faults can cause significant damage to power grid infrastructure, resulting in costly maintenance for utility providers. Rapid identification of fault locations can help mitigate these damages and associated expenses. Recent studies have demonstrated that graph neural network (GNN) models, using phasor data from various points in a power grid, can accurately locate fault events by accounting for the grid’s topology—a feature not typically leveraged by other machine learning methods. However, despite their high performance, GNN models are often viewed as “black-box” systems, making their decision logic difficult to interpret. This thesis demonstrates that explanation methods can be applied to GNN models to enhance their transparency by clarifying the reasoning behind fault location predictions. By systematically benchmarking several explanation techniques for a GNN model trained for fault location detection, we assess and recommend the most effective methods for elucidating fault detection predictions in power grid systems.
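To make the setup concrete, the following is a minimal sketch of the kind of node-level GNN fault locator the abstract describes, assuming PyTorch Geometric, bus-level phasor features, and per-bus fault/no-fault classification; the layer choices and dimensions are illustrative, not the thesis's actual model.

```python
# Minimal sketch of a node-level GNN fault locator (assumes PyTorch Geometric).
# Feature sizes, layer widths, and the per-bus classification framing are
# illustrative assumptions, not the thesis's exact model.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class FaultLocatorGNN(torch.nn.Module):
    def __init__(self, num_phasor_features: int, hidden: int = 64, num_classes: int = 2):
        super().__init__()
        self.conv1 = GCNConv(num_phasor_features, hidden)   # aggregate neighboring buses
        self.conv2 = GCNConv(hidden, num_classes)            # per-bus fault / no-fault logits

    def forward(self, x, edge_index):
        # x: [num_buses, num_phasor_features]; edge_index: [2, num_lines] grid topology
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)

model = FaultLocatorGNN(num_phasor_features=6)
# logits = model(data.x, data.edge_index)  # data: a torch_geometric.data.Data graph of the grid
```

Explanation methods such as GNNExplainer can then attribute a given prediction to the buses, lines (edges), and phasor features that most influenced it, which is the kind of transparency the benchmarking in this thesis evaluates.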
Model
Digital Document
Publisher
Florida Atlantic University
Description
The aim of this dissertation is to achieve a thorough understanding and develop an algorithmic framework for a crucial aspect of autonomous and artificial intelligence (AI) systems: Data Analysis. In the current era of AI and machine learning (ML), “data” holds paramount importance. For effective learning tasks, it is essential to ensure that the training dataset is accurate and comprehensive. Additionally, during system operation, it is vital to identify and address faulty data to prevent potentially catastrophic system failures. Our research in data analysis focuses on creating new mathematical theories and algorithms for outlier-resistant matrix decomposition using L1-norm principal component analysis (PCA). L1-norm PCA has demonstrated robustness against irregular data points and will be pivotal for future AI learning and autonomous system operations.
This dissertation presents a comprehensive exploration of L1-norm techniques and their diverse applications. A summary of our contributions in this manuscript follows: Chapter 1 establishes the foundational mathematical notation and linear algebra concepts critical for the subsequent discussions, along with a review of the complexities of current state-of-the-art L1-norm matrix decomposition algorithms. In Chapter 2, we address the L1-norm error decomposition problem by introducing a novel method called “Individual L1-norm-error Principal Component Computation by 3-layer Perceptron” (Perceptron L1 error). Extensive studies demonstrate the efficiency of this greedy L1-norm PC calculator.
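For background on why L1-norm PCA is combinatorially hard (and why greedy calculators such as the perceptron-based method above are attractive), here is a brute-force sketch of exact rank-1 L1-norm PCA in the projection-maximization formulation; it is not the chapter's L1-norm-error method, and the exhaustive search over sign vectors is only feasible for a small number of samples.

```python
# Background sketch: exact rank-1 L1-norm PCA in the projection-maximization
# formulation, max_{||q||_2 = 1} ||X^T q||_1, solved via exhaustive search over
# antipodal sign vectors; feasible only for small N. This illustrates the
# combinatorial nature of L1-norm PCA, NOT the chapter's perceptron-based method.
import itertools
import numpy as np

def l1_pca_rank1(X: np.ndarray) -> np.ndarray:
    """X is D x N (N samples); returns a unit-norm L1-norm principal component."""
    D, N = X.shape
    best_val, best_b = -np.inf, None
    for signs in itertools.product([-1.0, 1.0], repeat=N):
        b = np.array(signs)
        val = np.linalg.norm(X @ b)           # ||X b||_2
        if val > best_val:
            best_val, best_b = val, b
    q = X @ best_b
    return q / np.linalg.norm(q)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
X[:, 0] += 20.0                                # inject an outlier sample
print(l1_pca_rank1(X))                         # component resists the outlier
```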
Model
Digital Document
Publisher
Florida Atlantic University
Description
In the current world of fast-paced data production, statistics and machine learning tools are essential for interpreting data and utilizing its full potential. This dissertation comprises three studies employing statistical analysis and Convolutional Neural Network models. First, the research investigates the genetic evolution of the SARS-CoV-2 RNA molecule, emphasizing the role of epistasis in the RNA virus’s ability to adapt and survive. Through statistical tests, this study validates the significant impacts of genetic interactions and mutations on the virus’s structural changes over time, offering insights into its evolutionary dynamics. Second, the dissertation explores medical diagnosis by implementing Convolutional Neural Networks to differentiate between lung CT scans of COVID-19 and non-COVID patients. This portion of the research demonstrates the capability of deep learning to enhance diagnostic processes, thereby reducing time and increasing accuracy in clinical settings. Lastly, we delve into gravitational wave detection, an area of astrophysics requiring precise data analysis to identify signals from cosmic events such as black hole mergers. Our goal is to use Convolutional Neural Network models to improve the sensitivity and accuracy of detecting these elusive signals, pushing the boundaries of what we can observe in the universe. The findings of this dissertation underscore the utility of combining statistical methods and machine learning models to solve problems that are not only varied but also highly impactful in their respective fields.
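As an illustration of the classification setup described for the CT-scan study, below is a minimal binary CNN sketch in PyTorch; the input size, channel counts, and layer choices are assumptions for demonstration rather than the dissertation's architecture.

```python
# Minimal sketch of a binary CNN classifier of the kind described for
# COVID vs. non-COVID lung CT scans (assumes single-channel 128x128 slices;
# all sizes and layer choices are illustrative).
import torch
import torch.nn as nn

class CTClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, 2)     # COVID vs. non-COVID logits

    def forward(self, x):                      # x: [batch, 1, 128, 128]
        return self.classifier(self.features(x).flatten(1))

model = CTClassifier()
print(model(torch.randn(4, 1, 128, 128)).shape)   # torch.Size([4, 2])
```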
Model
Digital Document
Publisher
Florida Atlantic University
Description
The relentless expansion of space exploration necessitates the development of robust and dependable anomaly detection systems (ADS) to ensure the safety and efficacy of space missions. Conventional anomaly detection methods often falter in the face of the intricate and nuanced dynamics of space systems, resulting in a proliferation of false positives and/or false negatives. In this study, we explore cutting-edge deep learning (DL) techniques to tackle the challenges inherent in ADS. This research offers an in-depth examination of recent breakthroughs and hurdles in deep learning-driven anomaly detection tailored specifically for space systems and operations. A key advantage of deep learning-based anomaly detection lies in its adaptability to the diverse data encountered in space missions. For instance, Convolutional Neural Networks (CNNs) excel at capturing spatial dependencies in high-dimensional data, rendering them well suited for tasks such as satellite imagery analysis. Conversely, Recurrent Neural Networks (RNNs), with their temporal modeling prowess, excel at identifying anomalies in time-series data generated by spacecraft sensors. Despite the potential of deep learning, several challenges persist in its application to anomaly detection in space systems. The scarcity of labeled data presents a formidable hurdle, as acquiring labeled anomalies during space operations is often prohibitively expensive and impractical. Additionally, the interpretability of deep learning models remains a concern, particularly in mission-critical scenarios where human operators need to comprehend the rationale behind anomaly predictions.
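One widely used unsupervised pattern that addresses the labeled-data scarcity discussed above is reconstruction-based anomaly detection; the sketch below shows an LSTM autoencoder scoring telemetry windows by reconstruction error. The window length, model sizes, and thresholding strategy are illustrative assumptions.

```python
# Hedged sketch of a common DL anomaly-detection pattern for spacecraft
# telemetry: an LSTM autoencoder flags windows whose reconstruction error
# is large. Window length, layer sizes, and any threshold are assumptions.
import torch
import torch.nn as nn

class TelemetryAE(nn.Module):
    def __init__(self, n_sensors: int, latent: int = 16):
        super().__init__()
        self.enc = nn.LSTM(n_sensors, latent, batch_first=True)
        self.dec = nn.LSTM(latent, n_sensors, batch_first=True)

    def forward(self, x):                       # x: [batch, time, n_sensors]
        z, _ = self.enc(x)
        recon, _ = self.dec(z)
        return recon

def anomaly_scores(model, x):
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=(1, 2))   # per-window reconstruction MSE

model = TelemetryAE(n_sensors=8)
windows = torch.randn(32, 100, 8)               # 32 windows of 100 timesteps
print(anomaly_scores(model, windows).shape)     # torch.Size([32])
```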
Model
Digital Document
Publisher
Florida Atlantic University
Description
Automating the design of optimal neural architectures is an under-explored domain; most deep learning applications base their architectures on multiplexing different well-known architectures together, guided by past studies. Even after extensive research, the deployed algorithms may only work for specific domains, provide a minor boost, or even underperform compared to previous state-of-the-art implementations. One approach, neural architecture search, requires generating a pool of network topologies based on well-known kernel and activation functions. However, iteratively training the generated topologies and creating newer topologies based on the best-performing ones is computationally expensive and out of scope for most academic labs. In addition, the search space is constrained to the predetermined dictionary of kernel functions used to generate the topologies. This thesis treats a neural network as a weighted directed graph, incorporating the idea of message passing from graph neural networks to propagate information from the input to the output nodes. We show that such a method removes the dependence on a search space constrained to well-known kernel functions and applies to arbitrary graph structures. We test our algorithms in a reinforcement learning (RL) environment and explore several optimization strategies, such as graph attention and Proximal Policy Optimization (PPO), to solve the problem. We improve upon the slow convergence of PPO using a neural cellular automata (CA) approach as a self-organizing overhead for generating adjacency matrices of network topologies. This exploration of indirect encoding (an abstraction of DNA in neuro-developmental biology) yielded a much faster-converging algorithm. In addition, we introduce 1D-involution as a way to implement message passing across nodes in a graph, which further reduces the parameter space to a significant degree without hindering performance.
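The core idea of propagating activations over an arbitrary weighted directed graph can be illustrated in a few lines of NumPy; the update rule and iteration count below are simplifications for exposition, not the thesis's algorithm.

```python
# Illustrative sketch: treat a network as a weighted directed graph and
# propagate activations from input to output nodes by repeated message
# passing over the adjacency matrix. The update rule and step count are
# assumptions for illustration only.
import numpy as np

def propagate(adj: np.ndarray, node_input: np.ndarray, steps: int) -> np.ndarray:
    """adj[i, j] is the weight of the directed edge i -> j."""
    h = node_input.copy()
    for _ in range(steps):
        msg = adj.T @ h                  # each node sums its weighted incoming messages
        h = np.tanh(msg + node_input)    # nonlinearity; keep re-injecting the inputs
    return h

adj = np.array([[0.0, 0.8, 0.0],         # node 0 feeds node 1
                [0.0, 0.0, 1.2],         # node 1 feeds node 2 (the output)
                [0.0, 0.0, 0.0]])
x = np.array([0.5, 0.0, 0.0])            # input injected at node 0
print(propagate(adj, x, steps=3)[2])     # read the output node
```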
Model
Digital Document
Publisher
Florida Atlantic University
Description
The ability to recognize human actions is essential for individuals to navigate their daily lives. Biological motion is the primary mechanism people use to recognize actions quickly and efficiently, but recognition precision can vary. The development of Artificial Neural Networks (ANNs) has the potential to enhance the efficiency and effectiveness of accomplishing common human tasks, including action recognition. However, the performance of ANNs in action recognition depends on the type of model used. This study aimed to improve the accuracy of ANNs in action classification by incorporating biological motion information into the input conditions. The study used the UCF Crime dataset, a dataset containing surveillance videos of normal and criminal activity, and extracted biological motion information with OpenPose, a pose estimation ANN. The OpenPose output was used to create four input condition types (image only, image with biological motion, biological motion only, and coordinates only), and either a 3-Dimensional Convolutional Neural Network (3D CNN) or a Gated Recurrent Unit (GRU) classified the actions. Overall, the study found that including biological motion information in the input conditions led to higher accuracy regardless of the number of action categories in the dataset. Moreover, the GRU model using the 'coordinates only' condition had the best accuracy of all the action classification models. These findings suggest that incorporating biological motion into input conditions and using numerical input data can benefit the development of accurate action classification models using ANNs.
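A minimal sketch of the 'coordinates only' pipeline follows: a GRU classifies sequences of pose keypoint coordinates. The keypoint count, sequence length, and number of action classes are assumed values for illustration, not the study's exact configuration.

```python
# Sketch of the 'coordinates only' condition: a GRU classifying sequences of
# OpenPose keypoint coordinates. 25 keypoints x 2 coordinates per frame and
# 14 action classes are assumptions; the thesis's setup may differ.
import torch
import torch.nn as nn

class PoseGRUClassifier(nn.Module):
    def __init__(self, n_keypoints: int = 25, hidden: int = 128, n_classes: int = 14):
        super().__init__()
        self.gru = nn.GRU(n_keypoints * 2, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, coords):                  # coords: [batch, frames, n_keypoints * 2]
        _, h_last = self.gru(coords)            # h_last: [1, batch, hidden]
        return self.head(h_last.squeeze(0))     # action-class logits

model = PoseGRUClassifier()
print(model(torch.randn(8, 64, 50)).shape)      # torch.Size([8, 14])
```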
Model
Digital Document
Publisher
Florida Atlantic University
Description
Wall-bounded turbulent flows are pervasive in numerous physics and engineering applications. Such flows have a strong impact on the design of ships, airplanes and rockets, industrial chemical mixing, wind and hydrokinetic energy, utility infrastructure, and innumerable other fields. Understanding and controlling wall-bounded turbulence has been a long-pursued endeavor yielding plentiful scientific and engineering discoveries, but much remains unexplained from a fundamental viewpoint. One unexplained phenomenon is the formation and impact of coherent structures, such as ejections of slow near-wall fluid into faster-moving flow, which have been shown to correlate with increases in friction drag. This thesis focuses on recognizing and regulating organized structures within wall-bounded turbulent flows using a variety of machine learning techniques to overcome the nonlinear nature of this phenomenon.
Deep Learning has provided new avenues for analyzing large amounts of data by applying techniques modeled after biological neurons. These techniques allow for the discovery of nonlinear relationships in massive, complex systems like the data frequently found in fluid dynamics simulations. Using Convolutional Neural Networks, an architecture that specializes in uncovering spatial relationships, a network was trained to estimate the relative intensity of ejection structures within a turbulent flow simulation without any a priori knowledge of the underlying flow dynamics. To explore the underlying physics that the trained network might reveal, an interpretation technique called Gradient-based Class Activation Mapping was modified to identify salient regions in the flow field that most influenced the trained network to make an accurate estimation of these organized structures. Using various statistical techniques, these salient regions were found to correlate strongly with ejection structures and with regions of high positive kinetic energy production, low negative production, and low energy dissipation within the flow. Additionally, these techniques present a general framework for identifying nonlinear causal structures in three-dimensional data in any scientific domain where the underlying physics may be unknown.
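For readers unfamiliar with the interpretation technique, the following is a minimal 2D sketch of gradient-based class activation mapping (commonly known as Grad-CAM) using PyTorch hooks; the thesis adapts the idea to three-dimensional flow fields and an intensity estimate, so the toy model and layer below are placeholders.

```python
# Minimal 2D Grad-CAM-style sketch: capture activations and gradients at a
# chosen conv layer, weight each channel by its average gradient, and form a
# ReLU-ed saliency map. The model, layer, and scalar output are toy stand-ins.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
target_layer = model[0]
acts, grads = {}, {}
target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 1, 32, 32)                         # stand-in for a flow-field slice
model(x).sum().backward()                             # scalar "intensity" output

weights = grads["g"].mean(dim=(2, 3), keepdim=True)   # per-channel importance
cam = torch.relu((weights * acts["a"]).sum(dim=1))    # [1, 32, 32] saliency map
print(cam.shape)
```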
Model
Digital Document
Publisher
Florida Atlantic University
Description
Neural network models with many tunable parameters can be trained to approximate functions that transform a source distribution, or dataset, into a target distribution of interest. In contrast to low-parameter models with simple governing equations, the dynamics of transformations learned in deep neural network models are abstract and the correspondence of dynamical structure to predictive function is opaque. Despite their “black box” nature, neural networks converge to functions that implement complex tasks in computer vision, Natural Language Processing (NLP), and the sciences when trained on large quantities of data. Where traditional machine learning approaches rely on clean datasets with appropriate features, sample densities, and label distributions to mitigate unwanted bias, modern Transformer neural networks with self-attention mechanisms use Self-Supervised Learning (SSL) to pretrain on large, unlabeled datasets scraped from the internet without concern for data quality. SSL tasks have been shown to learn functions that match or outperform their supervised learning counterparts in many fields, even without task-specific finetuning. The recent paradigm shift to pretraining large models with massive amounts of unlabeled data has given credibility to the hypothesis that SSL pretraining can produce functions that implement generally intelligent computations.
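As background on the self-attention mechanism mentioned above, here is a bare-bones NumPy sketch of scaled dot-product attention with a single head and no learned projections; real Transformers add query/key/value projections, multiple heads, positional information, and masking.

```python
# Background sketch of scaled dot-product self-attention: softmax(Q K^T / sqrt(d)) V,
# shown here with Q = K = V = X (single head, no learned projections) for clarity.
import numpy as np

def self_attention(X: np.ndarray) -> np.ndarray:
    """X: [tokens, d]; returns attention-mixed token representations."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                       # pairwise similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)       # row-wise softmax
    return weights @ X                                  # each token mixes all tokens

tokens = np.random.default_rng(0).normal(size=(5, 8))
print(self_attention(tokens).shape)                     # (5, 8)
```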
Model
Digital Document
Publisher
Florida Atlantic University
Description
In radiotherapy, the radiobiological indices tumor control probability (TCP), normal tissue complication probability (NTCP), and equivalent uniform dose (EUD) are computed by analytical models. These models are rarely employed to rank and optimize treatment plans, even though radiobiological indices carry more weight than dosimetric indices in reflecting treatment goals. The objective of this study is to predict TCP, NTCP, and EUDs for lung cancer radiotherapy treatment plans using an artificial neural network (ANN). A total of 100 lung cancer patients’ treatment plans were selected for this study. The normal tissue complication probability (NTCP) of organs at risk (OARs), i.e., esophagus, spinal cord, heart, and contralateral lung, and the tumor control probability (TCP) of the treatment target volume (i.e., tumor) were calculated with the equivalent uniform dose (EUD) model. TCP and NTCP, each paired with the corresponding EUD, were used individually as outputs for the neural network. The inputs to the ANN are planning target volume (PTV), treatment modality, tumor location, prescribed dose, number of fractions, mean dose to PTV, gender, age, and mean doses to the OARs. The ANN is trained with the Levenberg-Marquardt algorithm and has one hidden layer, 13 inputs, and 2 outputs. 70% of the data was used for training, 15% for validation, and 15% for testing the ANN. Our ANN model predicted TCP and EUD with correlation coefficients of 0.99 for training, 0.96 for validation, and 0.94 for testing. For NTCP and EUD prediction, the average correlation coefficients are 0.94 for training, 0.89 for validation, and 0.84 for testing. The maximum mean squared error (MSE) for the ANN is 0.025, in predicting the NTCP and EUD of the heart. Our results show that an ANN model can be used with high discriminatory power to predict radiobiological indices for lung cancer treatment plans.
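For reference, the gEUD-based formulas commonly associated with the EUD model cited above (Niemierko's formulation) can be written in a few lines; the tissue parameters below are illustrative placeholders rather than the clinical values used in the study.

```python
# Sketch of the gEUD-based (Niemierko) formulas referenced above, computed from
# a differential DVH: gEUD = (sum_i v_i * D_i^a)^(1/a), and the sigmoid
# TCP/NTCP = 1 / (1 + (D50 / gEUD)^(4*gamma50)). The parameter values below
# (a, D50, gamma50) are illustrative placeholders, not clinical values.
import numpy as np

def geud(dose_bins: np.ndarray, frac_volume: np.ndarray, a: float) -> float:
    frac_volume = frac_volume / frac_volume.sum()         # normalize fractional volumes
    return float((frac_volume * dose_bins ** a).sum() ** (1.0 / a))

def sigmoid_probability(eud: float, d50: float, gamma50: float) -> float:
    return 1.0 / (1.0 + (d50 / eud) ** (4.0 * gamma50))   # TCP (or NTCP) vs. gEUD

dose = np.array([55.0, 60.0, 62.0, 64.0])                 # Gy, per DVH bin
vol = np.array([0.1, 0.3, 0.4, 0.2])                      # fractional volumes per bin
eud_tumor = geud(dose, vol, a=-10.0)                      # negative 'a' typical for targets
print(eud_tumor, sigmoid_probability(eud_tumor, d50=51.2, gamma50=2.0))
```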
Model
Digital Document
Publisher
Florida Atlantic University
Description
Although state-of-the-art Convolutional Neural Networks (CNNs) are often viewed as a model of biological object recognition, they lack many computational and architectural motifs that are postulated to contribute to robust perception in biological neural systems. For example, modern CNNs lack lateral connections, which greatly outnumber feed-forward excitatory connections in primary sensory cortical areas and mediate feature-specific competition between neighboring neurons to form robust, sparse representations of sensory stimuli for downstream tasks. In this thesis, I hypothesize that CNN layers equipped with lateral competition better approximate the response characteristics and dynamics of neurons in the mammalian primary visual cortex, leading to increased robustness under noise and/or adversarial attacks relative to current robust CNN layers. To test this hypothesis, I develop a new class of CNNs called LCANets, which simulate recurrent, feature-specific lateral competition between neighboring neurons via a sparse coding model termed the Locally Competitive Algorithm (LCA). I first perform an analysis of the response properties of LCA and show that sparse representations formed by lateral competition more accurately mirror response characteristics of primary visual cortical populations and are more useful for downstream tasks like object recognition than previous sparse CNNs, which approximate competition with winner-take-all mechanisms implemented via thresholding.
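A compact sketch of the Locally Competitive Algorithm dynamics underlying LCANets is shown below, following Rozell et al.'s formulation; the dictionary size, threshold, and step count are illustrative choices.

```python
# Compact sketch of the Locally Competitive Algorithm (LCA): membrane
# potentials u evolve under leaky dynamics with lateral inhibition
# (Phi^T Phi - I), and soft-thresholding of u gives the sparse code a.
# Dictionary size, lambda, and step count are illustrative.
import numpy as np

def lca(x, Phi, lam=0.1, steps=200, dt=0.05):
    """x: input (D,); Phi: dictionary (D, K) with unit-norm columns."""
    b = Phi.T @ x                                  # feed-forward drive
    G = Phi.T @ Phi - np.eye(Phi.shape[1])         # lateral competition weights
    u = np.zeros(Phi.shape[1])
    for _ in range(steps):
        a = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)   # soft threshold
        u = u + dt * (b - u - G @ a)               # leaky integration with inhibition
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

rng = np.random.default_rng(0)
Phi = rng.normal(size=(16, 32))
Phi /= np.linalg.norm(Phi, axis=0)                 # unit-norm dictionary atoms
x = Phi[:, 3] * 1.5 + 0.01 * rng.normal(size=16)   # signal built from atom 3
print(np.nonzero(lca(x, Phi))[0])                  # only a few atoms remain active
```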