Computer vision

Model
Digital Document
Publisher
Florida Atlantic University
Description
Although state-of-the-art Convolutional Neural Networks (CNNs) are often viewed as a model of biological object recognition, they lack many computational and architectural motifs that are postulated to contribute to robust perception in biological neural systems. For example, modern CNNs lack lateral connections, which greatly outnumber feed-forward excitatory connections in primary sensory cortical areas and mediate feature-specific competition between neighboring neurons to form robust, sparse representations of sensory stimuli for downstream tasks. In this thesis, I hypothesize that CNN layers equipped with lateral competition better approximate the response characteristics and dynamics of neurons in the mammalian primary visual cortex, leading to increased robustness under noise and/or adversarial attacks relative to current robust CNN layers. To test this hypothesis, I develop a new class of CNNs called LCANets, which simulate recurrent, feature-specific lateral competition between neighboring neurons via a sparse coding model termed the Locally Competitive Algorithm (LCA). I first perform an analysis of the response properties of LCA and show that sparse representations formed by lateral competition more accurately mirror response characteristics of primary visual cortical populations and are more useful for downstream tasks like object recognition than previous sparse CNNs, which approximate competition with winner-take-all mechanisms implemented via thresholding.
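For readers unfamiliar with the Locally Competitive Algorithm, its dynamics are usually written as follows. This is the standard textbook form of LCA rather than anything taken from the thesis itself; the symbols (input $x$, dictionary $\Phi$, membrane potentials $u$, activations $a$, threshold $\lambda$, time constant $\tau$) are notation assumed here for illustration.

```latex
% Standard LCA dynamics (assumed notation, not taken from the thesis).
% x: input patch, \Phi: dictionary of features, u: membrane potentials,
% a: sparse activations, T_\lambda: threshold function, \tau: time constant.
\begin{align}
  \tau \,\dot{u}(t) &= \Phi^{\top} x \;-\; u(t) \;-\; \left(\Phi^{\top}\Phi - I\right) a(t), \\
  a(t) &= T_{\lambda}\!\left(u(t)\right).
\end{align}
% With a soft threshold T_\lambda, the fixed point minimizes the sparse coding energy
\begin{equation}
  E(a) = \tfrac{1}{2}\,\lVert x - \Phi a \rVert_2^2 + \lambda \,\lVert a \rVert_1 .
\end{equation}
```

The term $(\Phi^{\top}\Phi - I)\,a(t)$ is the lateral competition: each active neuron inhibits others in proportion to how similar their features are, which is the feature-specific competition the abstract refers to.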
Model
Digital Document
Publisher
Florida Atlantic University
Description
The field of computer vision has grown by leaps and bounds in the past decade. This rapid progress can be largely attributed to advances in Artificial Neural Networks, and more specifically to Convolutional Neural Networks (CNNs) and Deep Learning. One area of great interest to the research community is assessing the quality of images, both in terms of technical parameters such as blurriness, encoding artifacts, saturation, and lighting, and in terms of their aesthetic appeal. Such a mechanism could be used to detect and discard noisy, blurry, dark, or overexposed images, as well as to detect images that a majority of viewers would consider beautiful. In this dissertation, the detection of various quality and aesthetic aspects of an image using CNNs is explored. This research produced two datasets that are manually labeled for quality issues such as blur, poor lighting, and digital noise, and for their aesthetic qualities; Convolutional Neural Networks were designed and trained using these datasets. Lastly, two case studies were performed to show the real-world impact of this research on traffic sign detection and medical image diagnosis.
Model
Digital Document
Publisher
Florida Atlantic University
Description
This experiment used different methodologies and comparisons to help determine the direction of future research on water-based perception systems for unmanned surface vehicle (USV) platforms, using a stereo-vision-based system. Presented in this work is object color and shape classification in a real-time maritime environment. Classification was coupled with the HSV color space, which allowed different color thresholds to be identified and detected. The algorithm was then calibrated and executed to determine the depth, color, and shape accuracies. The approach entails the characterization of a stereo-vision camera and a mount that was designed with 8.5° horizontal viewing increments and mounted on the WAMV.
This characterization covers depth, color, and shape detection and classification of objects. Shapes and buoys of assorted colors were used for testing. The main library used was OpenCV, including its Gaussian blurring, morphological operators, and Canny edge detection functions, with a ROS integration. The code uses the contour area and the number of contours detected on a shape to decide whether a detection succeeds. In summary, this thesis covers the installation and characterization of the stereo-vision system on the WAMV USV in order to obtain specific inputs for the high-level controller.
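The HSV thresholding step described above can be illustrated with a minimal sketch. It uses only Python's standard `colorsys` module rather than the OpenCV and ROS stack named in the abstract, and the hue bands, saturation floor, and value floor are hypothetical placeholders, not the calibrated thresholds from the thesis.

```python
import colorsys

# Hypothetical hue bands in degrees; real thresholds would come from calibration.
HUE_BANDS = {
    "red": [(0, 15), (345, 360)],   # red wraps around the hue circle
    "green": [(90, 150)],
    "blue": [(200, 260)],
}


def classify_color(r, g, b, min_sat=0.4, min_val=0.3):
    """Return a color label for an 8-bit RGB pixel, or None if no band matches."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < min_sat or v < min_val:
        return None  # too washed out or too dark to trust the hue
    hue_deg = h * 360.0
    for label, bands in HUE_BANDS.items():
        if any(lo <= hue_deg < hi for lo, hi in bands):
            return label
    return None
```

Working in HSV rather than RGB separates hue from brightness, so a single hue band can cover the same buoy under different lighting; the saturation and value floors reject glare and shadow pixels whose hue is unreliable.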
Model
Digital Document
Publisher
Florida Atlantic University
Description
This Thesis surveys the landscape of Data Augmentation for image datasets. Completing this survey inspired further study into a method of generative modeling known as Generative Adversarial Networks (GANs). A survey on GANs was conducted to understand recent developments and the problems related to training them. Following this survey, four experiments were proposed to test the application of GANs for data augmentation and to contribute to the quality improvement in GAN-generated data. Experimental results demonstrate the effectiveness of GAN-generated data as a pre-training metric. The other experiments discuss important characteristics of GAN models such as the refining of prior information, transferring generative models from large datasets to small data, and automating the design of Deep Neural Networks within the context of the GAN framework. This Thesis will provide readers with a complete introduction to Data Augmentation and Generative Adversarial Networks, as well as insights into the future of these techniques.
Model
Digital Document
Publisher
Florida Atlantic University
Description
In this research, real-time image segmentation and visual odometry estimation are addressed, and two main contributions are made. First, a new image segmentation and classification algorithm named DilatedU-NET is introduced. This deep-learning-based algorithm processes seven frames per second and achieves over 84% accuracy on the Cityscapes dataset. Second, a new method to estimate visual odometry is introduced. Using the KITTI benchmark dataset as a baseline, the visual odometry error was more significant than could be accurately measured. However, the method's frame rate made up for this: it is able to process 15 frames per second.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Autonomous video surveillance systems are usually built with several functional blocks such as motion detection, foreground and background separation, object tracking, depth estimation, feature extraction and behavioral analysis of tracked objects. Each of those blocks is usually designed with different techniques and algorithms, which may need significant computational and hardware resources. In this thesis we present a surveillance system based on an optical flow concept, as a main unit on which other functional blocks depend. Optical flow limitations, capabilities and possible problem solutions are discussed in this thesis. Moreover, performance evaluation of various methods in handling occlusions, rigid and non-rigid object classification, segmentation and tracking is provided for a variety of video sequences under different ambient conditions. Finally, processing time is measured with software that shows an optical flow hardware block can improve system performance and increase scalability while reducing the processing time by more than fifty percent.
Model
Digital Document
Publisher
Florida Atlantic University
Description
This work presents a methodology to estimate the state of a moving marine vehicle, defined by its position, velocity, and heading, from an unmanned surface vehicle (USV) that is also in motion, using a stereo-vision-based system. The methodology supports following a target vehicle with a USV.
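As a rough illustration of the quantities involved, the sketch below computes range from stereo disparity with the standard pinhole relation and then derives speed and direction of travel from two successive position fixes. It is not the thesis's estimator: the function names are hypothetical, the cameras are assumed to be rectified, and the two fixes are assumed to be already expressed in a fixed east/north frame (that is, the USV's own motion has been compensated). The angle returned is the direction of travel, which only approximates heading when the target is not drifting sideways.

```python
import math


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo range for a rectified pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def target_state(p_prev, p_curr, dt):
    """Finite-difference state from two (east, north) fixes taken dt seconds apart.

    Returns (position, speed in m/s, direction of travel in degrees),
    with 0 degrees = north and 90 degrees = east.
    """
    d_east = p_curr[0] - p_prev[0]
    d_north = p_curr[1] - p_prev[1]
    speed = math.hypot(d_east, d_north) / dt
    course_deg = math.degrees(math.atan2(d_east, d_north)) % 360.0
    return p_curr, speed, course_deg
```

Because range is inversely proportional to disparity, a fixed disparity error produces a range error that grows roughly with the square of the distance, so in practice raw fixes like these would normally be smoothed by a filter before being used for target following.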
Model
Digital Document
Publisher
Florida Atlantic University
Description
There is a substantial amount of evidence that suggests that driver drowsiness plays a significant role in road accidents. Alarming recent statistics are raising the interest in equipping vehicles with driver drowsiness detection systems. This dissertation describes the design and implementation of a driver drowsiness detection system that is based on the analysis of visual input consisting of the driver's face and eyes. The resulting system combines off-the-shelf software components for face detection, human skin color detection and eye state classification in a novel way. It follows a behavioral methodology by performing non-invasive monitoring of external cues describing a driver's level of drowsiness. We look at this complex problem from a systems engineering point of view in order to go from a proof-of-concept prototype to a stable software framework. Our system utilizes two detection and analysis methods: (i) face detection with eye region extrapolation and (ii) eye state classification. Additionally, we use two confirmation processes, one based on custom skin color detection and the other based on nod detection, to make the system more robust and resilient while not sacrificing speed significantly. The system was designed to be dynamic and adaptable to conform to the current conditions and hardware capabilities.
Model
Digital Document
Publisher
Florida Atlantic University
Description
In this dissertation we apply sparse constraints to improve optical flow and trajectories. We apply sparsity in two ways. First, with 2-frame optical flow, we enforce a sparse representation of flow patches using a learned overcomplete dictionary. Second, we apply a low rank constraint to trajectories via robust coupling. We begin with a review of optical flow fundamentals. We discuss the commonly used flow estimation strategies and the advantages and shortcomings of each. We introduce the concepts associated with sparsity including dictionaries and low rank matrices.
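The two kinds of constraint mentioned above are commonly written in the following generic forms. These are textbook formulations given for orientation only; the exact objectives, weights, and coupling terms used in the dissertation are not stated in the abstract, and every symbol below is assumed notation.

```latex
% Generic formulations (assumed notation, not the dissertation's exact objectives).
% p: vectorized flow patch, D: overcomplete dictionary (more atoms K than patch
% dimensions n), \alpha: sparse code, \lambda: sparsity weight.
\begin{equation}
  \hat{\alpha} = \arg\min_{\alpha}\; \tfrac{1}{2}\,\lVert p - D\alpha \rVert_2^2
               + \lambda\,\lVert \alpha \rVert_1,
  \qquad D \in \mathbb{R}^{n \times K},\; K > n .
\end{equation}
% W: matrix whose columns stack point trajectories, L: low rank approximation,
% \lVert\cdot\rVert_* : nuclear norm (a convex surrogate for rank), \mu: weight.
\begin{equation}
  \hat{L} = \arg\min_{L}\; \tfrac{1}{2}\,\lVert W - L \rVert_F^2 + \mu\,\lVert L \rVert_* .
\end{equation}
```

In both cases the penalty encodes a prior: a flow patch should be explainable by a few dictionary atoms, and a set of trajectories from a scene with limited motion complexity should span a low-dimensional subspace.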
Model
Digital Document
Publisher
Florida Atlantic University
Description
Vision systems have been widely used for parts inspection in electronics assembly lines. In order to improve the overall performance of a visual inspection system, it is important to employ an efficient object recognition algorithm. In this thesis, a genetic-algorithm-based correlation algorithm is designed for the task of visual inspection of electronic parts. The proposed procedure is composed of two stages. In the first stage, a genetic algorithm is devised to find a sufficient number of candidate image windows. For each candidate window, the correlation is computed between the sampled template and the image pattern inside the window. In the second stage, local searches are conducted in the neighborhood of these candidate windows. Among all the searched locations, the one that has the highest correlation value with the given template is selected as the best-matched location. To apply the genetic algorithm technique, a number of important issues, such as selection of a fitness function, design of a coding scheme, and tuning of genetic parameters, are addressed in the thesis. Experimental studies have confirmed that the proposed GA-based correlation method is much more effective in terms of accuracy and speed in locating the desired object, compared with the existing Monte-Carlo random search method.
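The two-stage procedure can be sketched in a few dozen lines of plain Python. This is an illustrative toy, not the thesis's implementation: the fitness function (normalized cross-correlation), the coding scheme (a window is encoded directly as its top-left `(x, y)` position), truncation selection, and all parameter defaults are assumptions chosen for brevity.

```python
import math
import random


def ncc(image, template, x, y):
    """Normalized cross-correlation between the template and the window at (x, y)."""
    th, tw = len(template), len(template[0])
    win = [image[y + j][x + i] for j in range(th) for i in range(tw)]
    tpl = [v for row in template for v in row]
    mw, mt = sum(win) / len(win), sum(tpl) / len(tpl)
    num = sum((a - mw) * (b - mt) for a, b in zip(win, tpl))
    den = math.sqrt(sum((a - mw) ** 2 for a in win) * sum((b - mt) ** 2 for b in tpl))
    return num / den if den > 0 else 0.0  # flat windows carry no correlation signal


def ga_candidates(image, template, rng, pop_size=20, generations=15, keep=5, mut=2):
    """Stage 1: evolve window positions and return the `keep` fittest distinct ones."""
    max_x = len(image[0]) - len(template[0])
    max_y = len(image) - len(template)

    def fitness(p):
        return ncc(image, template, p[0], p[1])

    def clamp(v, hi):
        return max(0, min(hi, v))

    pop = [(rng.randint(0, max_x), rng.randint(0, max_y)) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cx, cy = a[0], b[1]  # crossover: x from one parent, y from the other
            cx = clamp(cx + rng.randint(-mut, mut), max_x)  # mutation: small shift
            cy = clamp(cy + rng.randint(-mut, mut), max_y)
            children.append((cx, cy))
        pop = parents + children
    return sorted(set(pop), key=fitness, reverse=True)[:keep]


def local_search(image, template, candidates, radius=2):
    """Stage 2: score every position near each candidate; return (score, (x, y))."""
    max_x = len(image[0]) - len(template[0])
    max_y = len(image) - len(template)
    best = (-2.0, None)  # NCC is never below -1, so any real score replaces this
    for cx, cy in candidates:
        for y in range(max(0, cy - radius), min(max_y, cy + radius) + 1):
            for x in range(max(0, cx - radius), min(max_x, cx + radius) + 1):
                best = max(best, (ncc(image, template, x, y), (x, y)))
    return best
```

A typical call would be `local_search(image, template, ga_candidates(image, template, random.Random(0)))`, with `image` and `template` given as lists of rows. The division of labor mirrors the abstract: the genetic stage only has to land near the target, because the exhaustive local stage then picks the exact best-matching position within `radius` of each candidate.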