Image processing

Model
Digital Document
Publisher
Florida Atlantic University
Description
In the field of deep CNNs there is great excitement over breakthroughs in network performance on benchmark datasets such as ImageNet. Around the world, competitive teams work on new ways to innovate on and modify existing networks, or to create new ones that reach ever-higher accuracy levels. We believe that this important research must be supplemented with research into the computational dynamics of the networks themselves. We present research into network behavior as it is affected by variations in the number of filters per layer, pruning of filters during and after training, collapsing the weight space of the trained network using basic quantization, and the effect of image size and input-layer stride on training time and test accuracy. We provide insights into how the total number of updatable parameters affects training time and accuracy, and how “time per epoch” and “number of epochs” affect total network training time. We conclude with statistically significant models that allow us to predict training time as a function of the total number of updatable parameters in the network.
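The concluding idea — predicting training time from the total number of updatable parameters — can be sketched as a simple least-squares fit. The parameter counts and epoch times below are hypothetical placeholders for illustration, not the thesis' measurements or coefficients:

```python
import numpy as np

# Hypothetical (illustrative) measurements: total updatable parameters in
# each network variant, and the observed training time per epoch in seconds.
params = np.array([1.2e5, 4.8e5, 1.9e6, 7.6e6, 3.0e7])
seconds_per_epoch = np.array([3.1, 9.8, 35.0, 140.0, 560.0])

# Fit a simple linear model: time = a * params + b (ordinary least squares).
a, b = np.polyfit(params, seconds_per_epoch, 1)

def predict_epoch_time(n_params: float) -> float:
    """Predict training time per epoch from the parameter count."""
    return a * n_params + b

# Total training time = time per epoch * number of epochs.
total_time = predict_epoch_time(2.0e6) * 50  # e.g., a 50-epoch run
```

Separating "time per epoch" from "number of epochs", as the abstract does, matters because the two can move independently: a smaller network may need more epochs even though each epoch is cheaper.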
Model
Digital Document
Publisher
Florida Atlantic University
Description
High-resolution imagery is becoming readily available to the public. Private firms and government organizations are using high-resolution images but are running into problems with storage space and processing time. High-resolution images are extremely large files and have proven cumbersome to work with and manage. By resampling fine-resolution imagery to a lower resolution, storage and processing requirements can be dramatically reduced. Fine-resolution imagery is not needed to map most features, and resampled high-resolution imagery can in some cases replace low-resolution satellite imagery. The effects of resampling on the spectral quality of a high-resolution image can be demonstrated by answering the following questions: (1) Is the quality of spectral information in a color-infrared DOQQ comparable to that of SPOT and Landsat TM satellite imagery for the purpose of digital image classification? (2) What is the appropriate resolution for mapping surface features using high-resolution imagery for spectral categories of information? (3) What is the appropriate resolution for mapping surface features using high-resolution imagery for land-use/land-cover information?
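The resampling step described above can be sketched as block-mean (box-filter) downsampling of a single band. The tile size and resampling factor below are illustrative assumptions, not values from the study:

```python
import numpy as np

def resample_block_mean(img: np.ndarray, factor: int) -> np.ndarray:
    """Downsample a single-band image by averaging factor x factor blocks.
    Edge rows/columns that do not fill a whole block are cropped."""
    h, w = img.shape
    h -= h % factor
    w -= w % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Illustrative: a 1024 x 1024 stand-in for a fine-resolution tile,
# resampled by a factor of 10 (~100x fewer pixels to store and process).
fine = np.random.rand(1024, 1024)
coarse = resample_block_mean(fine, 10)
```

Block averaging preserves the mean radiometric value of each neighborhood, which is why the spectral comparison in question (1) remains meaningful after resampling.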
Model
Digital Document
Publisher
Florida Atlantic University
Description
Much of the recent research concerning the use of GIS has revolved around data quality. Types of errors inherent in GIS data layers, as well as errors that may be introduced through the creation and manipulation of data layers, have been identified. Definitions of these errors, and observations of how they occur, have been offered. However, the majority of this research is qualitative. It is known that positional variation is produced through differing interpretations and generalizations of points, lines, and polygons, but not to what extent. This information would be extremely helpful in allowing the user to fine-tune an application based on the accuracy of the data. Providing this type of information is the goal of this research. Quantitative analysis of the results of a series of experiments will give a numerical range of the possible positional errors produced through database creation via aerial photo interpretation.
Model
Digital Document
Publisher
Florida Atlantic University
Description
The focus of this thesis is the kinematic calibration of a SCARA arm with a hand-mounted camera. Kinematic calibration can greatly improve the accuracy of SCARA arms, which are widely used in electronic assembly lines. Vision-based robot calibration has the potential to be a fast, nonintrusive, low-cost, and autonomous approach. In this thesis, we apply a vision-based technique to calibrate SCARA arms. The robot under investigation is modeled by the modified complete and parametrically continuous model. By repeatedly calibrating the camera, the poses of the robot end-effector are collected at various robot measurement configurations. A least-squares technique is then applied to estimate the geometric error parameters of the SCARA arm from the measured robot poses. To improve the robustness of the method, a new approach is proposed to calibrate the hand-mounted camera. The calibration algorithm is designed to handle the case in which the camera sensor plane is nearly parallel to the camera calibration board. Practical issues regarding robot calibration in general, and SCARA arm calibration in particular, are also addressed. Experimental studies reveal that the proposed camera-aided approach is a viable means of accuracy enhancement for SCARA arms.
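The least-squares estimation step can be sketched generically: given a Jacobian that linearly maps small kinematic parameter errors to end-effector pose errors, the parameter errors are recovered from measured-minus-nominal poses. This is a minimal sketch with synthetic data, not the thesis' modified complete and parametrically continuous model:

```python
import numpy as np

rng = np.random.default_rng(0)
n_cfg, n_par = 30, 4                        # measurement configurations, error parameters
J = rng.normal(size=(n_cfg * 3, n_par))     # stacked pose Jacobians (3 pose components each)

# Synthetic "true" geometric error parameters and the pose errors they induce,
# plus a small measurement noise term from the camera-based pose measurement.
true_err = np.array([0.01, -0.02, 0.005, 0.0])
pose_err = J @ true_err + rng.normal(scale=1e-4, size=n_cfg * 3)

# Ordinary least squares: minimize ||J x - pose_err||^2 over the error parameters.
est_err, *_ = np.linalg.lstsq(J, pose_err, rcond=None)
```

Collecting many measurement configurations overdetermines the system, which is what lets least squares average out the per-measurement camera noise.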
Model
Digital Document
Publisher
Florida Atlantic University
Description
In this thesis we describe a local-neighborhood, pixel-based adaptive algorithm to track image features, both spatially and temporally, over a sequence of monocular images. The algorithm assumes no a priori knowledge about the image features to be tracked or about the relative motion between the camera and the 3-D objects. The features to be tracked are selected by the algorithm itself; they correspond to the peaks of a '2-D intensity correlation surface' constructed from a local neighborhood in the first image of the sequence. Any kind of motion, i.e., full 6-DOF translation and rotation, can be tolerated within the algorithm's pixels-per-frame motion limits. No subpixel computations are necessary. Exploiting the constraint of temporal continuity, the algorithm uses simple and efficient predictive tracking over multiple frames. Trajectories of features on multiple objects can also be computed. The algorithm tolerates a slow, continuous change in the D.C. brightness level of the feature's pixels. Another important aspect of the algorithm is an adaptive feature-matching threshold that accounts for changes in the relative brightness of neighboring pixels. As applications of the feature-tracking algorithm, and to test its accuracy, we show how the algorithm has been used to extract the Focus of Expansion (FOE) and to compute the Time-to-contact using real image sequences of unstructured, unknown environments. Both applications use information from multiple frames.
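The core matching step — locating a feature by the peak of an intensity correlation surface, at integer-pixel precision, while tolerating slow D.C. brightness drift — can be sketched with mean-subtracted normalized cross-correlation. The patch and search-window sizes are illustrative assumptions, and this omits the thesis' predictive tracking and adaptive threshold:

```python
import numpy as np

def track_feature(prev, curr, y, x, patch=7, search=10):
    """Find in `curr` the patch centered at (y, x) in `prev` by maximizing
    normalized cross-correlation over a small search window. Integer pixels
    only (no subpixel computation); subtracting each window's mean makes the
    score insensitive to a slow D.C. brightness change."""
    r = patch // 2
    tmpl = prev[y - r:y + r + 1, x - r:x + r + 1].astype(float)
    tmpl -= tmpl.mean()
    best, best_pos = -np.inf, (y, x)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = curr[y + dy - r:y + dy + r + 1,
                        x + dx - r:x + dx + r + 1].astype(float)
            if cand.shape != tmpl.shape:
                continue  # window fell off the image border
            cand = cand - cand.mean()
            denom = np.linalg.norm(tmpl) * np.linalg.norm(cand)
            score = (tmpl * cand).sum() / denom if denom else -np.inf
            if score > best:
                best, best_pos = score, (y + dy, x + dx)
    return best_pos
```

The bounded search window is the concrete form of the "pixels-per-frame motion limitations" mentioned above.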
Model
Digital Document
Publisher
Florida Atlantic University
Description
A novel neural network, trained with the Alopex algorithm to recognize handprinted characters, was developed in this research. It is constructed as an encoded fully connected multilayer perceptron (EFCMP), consisting of one input layer, one intermediate layer, and one encoded output layer. The Alopex algorithm, a stochastic algorithm for solving optimization problems, is used to supervise the training of the EFCMP and has been shown to accelerate convergence of the training procedure. Software simulation programs were developed for training, testing, and analyzing the performance of the EFCMP architecture. Several neural networks with different structures were developed and compared. Optimization of the Alopex algorithm itself was explored by simulating the EFCMP training procedure with different parameter values for Alopex.
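A basic form of the Alopex update can be sketched as follows: every weight moves by a fixed step ±delta, and each sign is drawn stochastically, biased toward the direction whose recent history correlates with a decrease in the cost. The step size, temperature, and toy cost function below are illustrative assumptions; the thesis explores tuning these parameters:

```python
import numpy as np

def alopex_step(w, prev_dw, prev_dE, delta=0.01, T=1e-4, rng=None):
    """One Alopex iteration. prev_dw is the previous weight change and
    prev_dE the previous change in cost; their product biases each sign
    toward moves that recently lowered the cost."""
    if rng is None:
        rng = np.random.default_rng()
    corr = prev_dw * prev_dE                                   # per-weight correlation
    p_plus = 1.0 / (1.0 + np.exp(np.clip(corr / T, -50, 50)))  # Boltzmann-like bias
    signs = np.where(rng.random(w.shape) < p_plus, 1.0, -1.0)
    dw = delta * signs
    return w + dw, dw

# Usage: minimize a toy quadratic cost (standing in for the training error).
rng = np.random.default_rng(1)
cost = lambda w: float((w ** 2).sum())
w = np.array([2.0, -3.0])
dw = np.full(2, 0.01)
E_prev = best = cost(w)
w = w + dw
for _ in range(3000):
    E = cost(w)
    w, dw = alopex_step(w, dw, E - E_prev, rng=rng)
    E_prev, best = E, min(best, E)
```

Because the update needs only cost differences, not gradients, the same rule applies to non-differentiable cost functions, which is part of Alopex's appeal for network training.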
Model
Digital Document
Publisher
Florida Atlantic University
Description
This thesis is concerned with the estimation of the motion parameters of planar object surfaces viewed with a binocular camera configuration. Possible applications of this method include autonomous guidance of a moving platform (AGVS) via imaging, and segmentation of moving objects using information about motion and structure. The brightness constraint equation is obtained by assuming that the brightness of a moving patch is almost invariant. This equation is solved for the single-camera case as well as the binocular case, either with known values of the surface normal or by determining the normal iteratively from the estimates of the motion parameters. For this value of the surface normal, the rotational and translational motion components are determined over the entire image using a least-squares algorithm. The algorithm is tested on simulated as well as real images for both the single-camera and binocular configurations. (Abstract shortened with permission of author.)
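The brightness constraint equation, Ex·u + Ey·v + Et = 0 at each pixel, is underdetermined pointwise but can be solved in the least-squares sense over a patch. The sketch below assumes a single translational image motion (u, v) for the whole patch; the thesis' full formulation additionally involves the surface normal and the binocular geometry:

```python
import numpy as np

def solve_flow(Ex, Ey, Et):
    """Least-squares solution of Ex*u + Ey*v + Et = 0 over a patch:
    one constraint row per pixel, two unknowns (u, v)."""
    A = np.stack([Ex.ravel(), Ey.ravel()], axis=1)
    b = -Et.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(u), float(v)
```

Stacking many pixels is what makes the system overdetermined, so the least-squares solution averages out noise in the brightness derivatives.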
Model
Digital Document
Publisher
Florida Atlantic University
Description
This research aims at proposing a model for visual pattern recognition inspired by the neural circuitry of the brain. We propose a few modifications to the Alopex algorithm and use it to calculate the receptive fields of neurons in the trained network. We have developed a small-scale, four-layered neural network model for recognizing simple characters as well as complex image patterns, including patterns transformed by affine conversion. Here the Alopex algorithm is presented as an iterative, stochastic processing method, originally proposed for the optimization of a given cost function over hundreds or thousands of iterations. In this case the receptive fields of the neurons in the output layers are obtained using the Alopex algorithm.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Third-order synthetic neural networks are applied to the recognition of isodensity facial images extracted from digitized grayscale facial images. A key property of neural networks is their ability to recognize invariances and extract essential parameters from complex high-dimensional data. In pattern recognition an input image must be recognized regardless of its position, size, and angular orientation. In order to achieve this, the neural network needs to learn the relationships between the input pixels. Pattern recognition requires the nonlinear subdivision of the pattern space into subsets representing the objects to be identified. Single-layer neural networks can only perform linear discrimination. However, multilayer first-order networks and high-order neural networks can both achieve this. The most significant advantage of a higher-order net over a traditional multilayer perceptron is that invariances to 2-dimensional geometric transformations can be incorporated into the network and need not be learned through prolonged training with an extensive family of exemplars. It is shown that a third-order network can be used to achieve translation-, scale-, and rotation-invariant recognition with a significant reduction in training time over other neural net paradigms such as the multilayer perceptron. A model based on an enhanced version of the Widrow-Hoff training algorithm and a new momentum paradigm are introduced and applied to the complex problem of human face recognition under varying facial expressions. Arguments for the use of isodensity information in the recognition algorithm are put forth and it is shown how the technique of coarse-coding is applied to reduce the memory required for computer simulations. The combination of isodensity information and neural networks for image recognition is described and its merits over other image recognition methods are explained. 
It is shown that isodensity information coupled with the use of an "adaptive threshold strategy" (ATS) yields a system that is relatively impervious to image contrast noise. The new momentum paradigm produces much faster convergence rates than ordinary momentum and renders the network behaviour independent of its training parameters over a broad range of parameter values.
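The training rule underlying the model — the Widrow-Hoff (LMS) delta rule combined with momentum — can be sketched in its standard form on a single linear unit. The thesis' enhanced rule and new momentum paradigm are not reproduced here; this is only the conventional baseline it improves upon, with illustrative learning parameters:

```python
import numpy as np

def lms_momentum(X, t, lr=0.02, mu=0.8, epochs=300):
    """Widrow-Hoff (LMS) delta rule with ordinary momentum on a linear unit.
    lr is the learning rate, mu the momentum coefficient; the velocity term v
    accumulates past updates so consistent error directions build up speed."""
    w = np.zeros(X.shape[1])
    v = np.zeros_like(w)
    for _ in range(epochs):
        for x, target in zip(X, t):
            err = target - w @ x        # per-sample error (delta rule)
            v = mu * v + lr * err * x   # momentum update
            w = w + v
    return w
```

The claim above — that the new momentum paradigm makes behaviour insensitive to the training parameters over a broad range — is precisely a claim about the interaction of lr and mu in updates like this one.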
Model
Digital Document
Publisher
Florida Atlantic University
Description
The dual issues of extracting and tracking eye features from video images are addressed in this dissertation. The proposed scheme differs from conventional intrusive eye-movement measurement systems and can be implemented on an inexpensive personal computer. The desirable features of such a measurement system are low cost, accuracy, automated operation, and non-intrusiveness. An overall scheme is presented in which a new algorithm is put forward for each of the function blocks in the processing system. A new corner detection algorithm is presented in which the problem of detecting corners is solved by minimizing a cost function. Each cost factor captures a desirable characteristic of the corner using both the gray-level information and the geometrical structure of a corner. This approach additionally provides corner orientations and angles along with corner locations. The advantage of the new approach over existing corner detectors is that it improves the reliability of detection and localization by imposing criteria related to both the gray-level data and the corner structure. The extraction of eye features is performed using an improved method of deformable templates, which are geometrically arranged to resemble the expected shape of the eye. The overall energy function is redefined to simplify the minimization process, and the weights for the energy terms are selected based on the normalized value of each term; thus the weighting schedule of the modified method demands no expert knowledge from the user. Rather than using a sequential procedure, all parameters of the template are changed simultaneously during minimization, which reduces not only the processing time but also the probability of the template being trapped in local minima. An efficient algorithm for real-time eye-feature tracking from a sequence of eye images is developed in the dissertation.
Based on a geometrical model that describes the characteristics of the eye, measurement equations are formulated to relate suitably selected measurements to the tracking parameters. A discrete Kalman filter is then constructed for the recursive estimation of the eye features, taking the measurement noise into account. The small processing time allows this tracking algorithm to be used in real-time applications, and it is suitable for an automated, non-intrusive, and inexpensive system capable of measuring the time profiles of eye movements. The issue of compensating for head movements during eye-movement tracking is also addressed: an appropriate measurement model is established to describe the effects of head movements, and based on this model a Kalman filter structure is formulated to carry out the compensation. The complete tracking scheme cascades two Kalman filters to track the iris movement while compensating for head movement. The presence of eye blinks is also taken into account, and their detection is incorporated into the cascaded tracking scheme. These algorithms have been integrated into an automated, non-intrusive, and inexpensive system that provides an accurate time profile of eye movements tracked from video image frames.
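The recursive estimation step can be sketched as a minimal discrete Kalman filter with a constant-velocity model for one coordinate of the iris center. The dissertation's measurement equations and noise covariances are richer; the matrices below are illustrative assumptions:

```python
import numpy as np

F = np.array([[1.0, 1.0], [0.0, 1.0]])  # state transition: [position, velocity]
H = np.array([[1.0, 0.0]])              # we measure position only
Q = np.eye(2) * 1e-3                     # process noise covariance (assumed)
R = np.array([[0.5]])                    # measurement noise covariance (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle of the discrete Kalman filter.
    x: state estimate (2,), P: covariance (2, 2), z: measurement (1,)."""
    x = F @ x                            # predict state forward one frame
    P = F @ P @ F.T + Q
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (z - H @ x)              # correct with the measurement
    P = (np.eye(2) - K @ H) @ P
    return x, P
```

Cascading two such filters, as the dissertation does, amounts to feeding the head-movement filter's output into the measurement model of the iris filter so that only the residual motion is attributed to the eye.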