Model
Digital Document
Publisher
Florida Atlantic University
Description
Automating the discovery of optimal neural architectures is an under-explored domain; most deep learning work builds architectures by combining well-known components based on past studies. Even after extensive research, the resulting algorithms may work only for specific domains, provide a minor boost, or even underperform previous state-of-the-art implementations. One approach, neural architecture search (NAS), generates a pool of network topologies from well-known kernel and activation functions. However, iteratively training the generated topologies and creating new ones based on the best performers is computationally expensive and out of reach for most academic labs. In addition, the search space is constrained to a predetermined dictionary of kernel functions used to generate the topologies. This thesis treats a neural network as a weighted directed graph and incorporates the idea of message passing from graph neural networks to propagate information from the input to the output nodes. We show that such a method removes the dependency on a search space constrained to well-known kernel functions and applies to arbitrary graph structures. We test our algorithms in a reinforcement learning (RL) environment and explore several optimization strategies, such as graph attention and proximal policy optimization (PPO), to solve the problem. We improve upon the slow convergence of PPO by using a neural cellular automata (CA) approach as a self-organizing mechanism for generating the adjacency matrices of network topologies. This exploration of indirect encoding (an abstraction of DNA in neurodevelopmental biology) yields a much faster-converging algorithm. In addition, we introduce 1D involution as a way to implement message passing across the nodes of a graph, which significantly reduces the parameter space without hindering performance.
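To make the core idea concrete, the following is a minimal sketch, not the thesis implementation, of message passing over an arbitrary weighted directed graph whose topology is encoded as an adjacency matrix rather than a fixed dictionary of kernel functions. The function name, the number of propagation steps, and the choice of ReLU are illustrative assumptions.

```python
import numpy as np

def message_pass(adj: np.ndarray, x: np.ndarray, n_steps: int) -> np.ndarray:
    """Run n_steps of synchronous message passing over a weighted digraph.

    adj[i, j] is the weight of the edge from node i to node j; a zero entry
    means no edge. x holds the current activation of every node.
    """
    h = x.copy()
    for _ in range(n_steps):
        # Each node aggregates weighted messages from its in-neighbours,
        # then applies a simple nonlinearity (ReLU here, as an assumption).
        h = np.maximum(adj.T @ h, 0.0)
    return h

# Toy usage: a 4-node graph where node 0 is the input and node 3 the output.
adj = np.zeros((4, 4))
adj[0, 1] = 0.5   # edge 0 -> 1
adj[0, 2] = -0.3  # edge 0 -> 2
adj[1, 3] = 1.2   # edge 1 -> 3
adj[2, 3] = 0.8   # edge 2 -> 3
x = np.array([1.0, 0.0, 0.0, 0.0])  # activate the input node only
print(message_pass(adj, x, n_steps=2)[3])  # activation reaching the output node
```

In the work described above, such adjacency matrices are not hand-written but are generated by the neural CA and optimized with PPO; this sketch only illustrates how information flows once a topology is given.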
Member of