Computer vision

Model
Digital Document
Publisher
Florida Atlantic University
Description
The fundamental goal of a machine vision system in the inspection of an assembled printed circuit board is to locate the integrated circuit (IC) components. These components are then checked against the model's given position and orientation to detect deviations. To this end, a method based on a modified two-level correlation scheme is presented in this thesis. In the first level, Low-Level correlation, a modified two-stage template matching method is proposed. It uses random search techniques, better known as the Monte Carlo method, to speed up the matching process on binarized versions of the images. Because of the random search, there is uncertainty about the locations where matches are found. In the second level, High-Level correlation, an evidence scheme based on the Dempster-Shafer formalism is presented to resolve this uncertainty. Experimental results on a printed circuit board containing mounted integrated components are also presented to demonstrate the validity of the techniques.
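To make the two levels concrete, here is a minimal sketch, assuming binarized images held as NumPy arrays: a Monte Carlo sampler that scores random template offsets by a Hamming-style similarity, and a toy Dempster's-rule combination over a two-hypothesis frame. The function names, scoring rule, and mass-function layout are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def monte_carlo_match(image, template, n_samples=2000, rng=None):
    """Sample random offsets and score each by the fraction of pixels
    where the binary template agrees with the image window."""
    rng = np.random.default_rng(rng)
    H, W = image.shape
    h, w = template.shape
    rows = rng.integers(0, H - h + 1, size=n_samples)
    cols = rng.integers(0, W - w + 1, size=n_samples)
    best_score, best_pos = -1.0, None
    for r, c in zip(rows, cols):
        window = image[r:r + h, c:c + w]
        score = np.mean(window == template)  # Hamming-style similarity
        if score > best_score:
            best_score, best_pos = score, (r, c)
    return best_pos, best_score

def dempster_combine(m1, m2):
    """Dempster's rule over the frame {'present', 'absent'}, with
    'either' carrying the uncommitted (uncertain) mass."""
    keys = ('present', 'absent', 'either')
    combined = {k: 0.0 for k in keys}
    conflict = 0.0
    for a in keys:
        for b in keys:
            prod = m1[a] * m2[b]
            if a == b or b == 'either':
                combined[a] += prod   # intersection is a
            elif a == 'either':
                combined[b] += prod   # intersection is b
            else:
                conflict += prod      # present vs. absent: empty set
    return {k: v / (1.0 - conflict) for k in keys}
```

A second refinement pass, re-sampling only in a small neighborhood of the best coarse offset, would correspond to the two-stage idea described in the abstract.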
Model
Digital Document
Publisher
Florida Atlantic University
Description
In this dissertation, visual cues using an active monocular camera for autonomous vehicle navigation are investigated. A number of visual cues suitable to such an objective are proposed and effective methods to extract them are developed. Unique features of these visual cues include: (1) there is no need to reconstruct the 3D scene; (2) they utilize short image sequences taken by a monocular camera; and (3) they operate on local image brightness information. Taking these features into account, the algorithms developed are computationally efficient. Simulation and experimental studies confirm the efficacy of the algorithms developed. The major contribution of the research work in this dissertation is the extraction of visual information suitable for autonomous navigation from an active monocular camera, without 3D reconstruction, using local image information. In the studies addressed, the first visual cue is related to camera focusing parameters. An objective function relating focusing parameters to local image brightness is proposed. A theoretical development shows that by maximizing the objective function one can successfully focus the camera through the choice of focusing parameters. As a result, the dense distance map between a camera and a front scene can be estimated without using the Gaussian spread function. The second visual cue, namely the clearance invariant (first proposed by Raviv (97)), is extended here to include arbitrary translational motion of a camera. It is shown that the angle between the optical axis and the camera's direction of motion can be estimated by minimizing the relevant estimated error residual. This method needs only one image projection from a 3D surface point at an arbitrary time instant. The third issue discussed in this dissertation is extracting the looming and the magnitude of rotation using a new visual cue, designated the rotation invariant, under camera fixation. An algorithm to extract the looming is proposed using the image information available from only one 3D surface point at an arbitrary time instant. Further, an additional algorithm is proposed to estimate the magnitude of the camera's rotational velocity using the image projections of only two 3D surface points measured over two time instants. Finally, a method is presented to extract the focus of expansion robustly without using image brightness derivatives. It decomposes an image projection trajectory into two independent linear models and applies Kalman filters to estimate the focus of expansion.
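The focus-of-expansion step lends itself to a compact sketch. Below, assuming tracked image-point trajectories under pure camera translation, each image coordinate is filtered by an independent constant-velocity Kalman filter (the "two independent linear models"), and the FOE is recovered as the least-squares intersection of the filtered trajectory lines. The formulation, function names, and noise parameters are illustrative assumptions, not the dissertation's actual derivation.

```python
import numpy as np

def kalman_1d(measurements, q=1e-3, r=1e-2):
    """Constant-velocity Kalman filter for one image coordinate.
    State: [position, velocity]; returns filtered states per step."""
    F = np.array([[1.0, 1.0], [0.0, 1.0]])  # unit time step
    H = np.array([[1.0, 0.0]])
    Q, R = q * np.eye(2), np.array([[r]])
    x = np.array([measurements[0], 0.0])
    P = np.eye(2)
    out = []
    for z in measurements:
        x = F @ x                       # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R             # update
        K = P @ H.T @ np.linalg.inv(S)
        x = x + (K @ (np.array([z]) - H @ x)).ravel()
        P = (np.eye(2) - K @ H) @ P
        out.append(x.copy())
    return np.array(out)

def foe_from_tracks(tracks):
    """Under pure translation every image trajectory passes through the
    FOE; intersect the filtered trajectory lines in least squares."""
    A, b = [], []
    for xs, ys in tracks:
        sx = kalman_1d(xs)[-1]  # [x, dx]
        sy = kalman_1d(ys)[-1]  # [y, dy]
        n = np.array([-sy[1], sx[1]])  # normal to direction (dx, dy)
        A.append(n)
        b.append(n @ np.array([sx[0], sy[0]]))
    foe, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return foe
```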
Model
Digital Document
Publisher
Florida Atlantic University
Description
This dissertation deals with novel vision-based motion cues called the Visual Threat Cues (VTCs), suitable for autonomous navigation tasks such as collision avoidance and maintenance of clearance. The VTCs are time-based and provide some measure of the relative change in range as well as clearance between a 3D surface and a moving observer. They are independent of the 3D environment around the observer and need almost no a-priori knowledge about it. For each VTC presented in this dissertation, there is a corresponding visual field associated with it. Each visual field constitutes a family of imaginary 3D surfaces attached to the moving observer. All the points that lie on a particular imaginary 3D surface produce the same value of the VTC. These visual fields can be used to demarcate the space around the moving observer into safe and danger zones of varying degree. Several approaches to extract the VTCs from a sequence of monocular images have been suggested. A practical method to extract the VTCs from a sequence of images of 3D textured surfaces, obtained by a visually fixating, fixed-focus moving camera, is also presented. This approach is based on the extraction of a global image dissimilarity measure called the Image Quality Measure (IQM), which is extracted directly from the raw data of the gray level images. Based on the relative variations of the measured IQM, the VTCs are extracted. This practical approach to extract the VTCs needs no 3D reconstruction, depth information, optical flow, or feature tracking. The algorithm to extract the VTCs was tested on several indoor as well as outdoor real image sequences. Two vision-based closed-loop control schemes for autonomous navigation tasks were implemented in a-priori unknown textured environments using one of the VTCs as the relevant sensory feedback information. They are based on a set of IF-THEN fuzzy rules and need almost no a-priori information about the vehicle dynamics, speed, direction of motion, etc. They were implemented in real-time using a camera mounted on a six-degree-of-freedom flight simulator.
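As a rough illustration of the IQM-based pipeline, the sketch below computes one plausible global image quality statistic directly from gray levels and derives a threat-cue-like quantity from its relative temporal variation. The particular statistic (mean absolute neighbor differences) and the function names are assumptions for illustration, not the dissertation's exact IQM definition.

```python
import numpy as np

def iqm(gray):
    """A plausible global Image Quality Measure: mean absolute
    gray-level difference between adjacent pixels, an aggregate
    sharpness/dissimilarity statistic over the whole frame."""
    g = gray.astype(float)
    dx = np.abs(np.diff(g, axis=1)).mean()
    dy = np.abs(np.diff(g, axis=0)).mean()
    return dx + dy

def vtc_from_iqm(iqm_prev, iqm_curr, dt=1.0):
    """Relative temporal variation of the IQM, used here as a stand-in
    for a Visual Threat Cue: it grows as a fixated textured surface
    approaches a fixed-focus camera and its image detail sharpens."""
    return (iqm_curr - iqm_prev) / (dt * iqm_prev)
```

Note that no depth, optical flow, or feature tracking enters the computation; only two frames' worth of raw gray levels are needed, consistent with the approach described above.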
Model
Digital Document
Publisher
Florida Atlantic University
Description
The objective of this dissertation is to develop effective algorithms for texture characterization, segmentation, and labeling that operate selectively to label image textures, using the Gabor representation of signals. These representations are an analog of the spatial frequency tuning characteristics of the visual cortex cells. Of all spatial/spectral signal representations, the Gabor function provides optimal joint resolution in the two domains. A discussion of spatial/spectral representations focuses on the Gabor function and the biological analog that exists between it and the simple cells of the striate cortex. A simulation generates examples of the use of the Gabor filter as a line detector with synthetic data. Simulations are then presented using Gabor filters for real texture characterization. The Gabor filter's spatial and spectral attributes are selectively chosen based on information from a scale-space image in order to maximize the resolution of the characterization process. A variation of probabilistic relaxation that exploits the Gabor filter's spatial and spectral attributes is devised and used to force a consensus of the filter responses for texture characterization. We then perform segmentation of the image using the concept of isolation of low energy states within an image. This iterative smoothing algorithm, operating as a Gabor filter post-processing stage, depends on a line-process discontinuity threshold. The discontinuity threshold is selected from the modes of the histogram of the relaxed Gabor filter responses, using probabilistic relaxation to detect the significant modes. We test our algorithm on simple synthetic and real textures, then use a more complex natural texture image to test the entire algorithm. Limitations on textural resolution are noted, as well as on the resolution of the image segmentation process.
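A minimal sketch of the filtering stage, assuming a grayscale image as a NumPy array: a real (even) Gabor kernel is built for each frequency/orientation pair, and the squared responses form a per-pixel feature stack that a relaxation or segmentation stage could consume. Kernel size, bandwidth, and the bank of frequencies are illustrative choices, not the dissertation's tuned parameters.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(freq, theta, sigma, size=31):
    """Real Gabor kernel: a Gaussian envelope modulating a cosine
    grating of spatial frequency `freq` at orientation `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * freq * xr)

def gabor_energy(image, freqs, thetas, sigma=4.0):
    """Per-pixel response energy for each (frequency, orientation)
    pair; stacking them yields a texture feature vector per pixel."""
    feats = []
    for f in freqs:
        for t in thetas:
            k = gabor_kernel(f, t, sigma)
            r = convolve2d(image, k, mode='same', boundary='symm')
            feats.append(r**2)
    return np.stack(feats, axis=-1)
```

For example, `gabor_energy(img, freqs=[0.1, 0.2], thetas=np.linspace(0, np.pi, 4, endpoint=False))` produces an eight-channel feature image whose channels respond selectively to differently oriented and scaled textures.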
Model
Digital Document
Publisher
Florida Atlantic University
Description
Nowadays it is very hard to find available spots in public parking lots, and even harder at facilities such as universities and sports venues. A system that provides drivers with parking availability and parking lot occupancy information would allow them to find a parking space much more easily and quickly. This thesis presents a system for automatic parking lot occupancy computation using motion tracking. The use of computer vision techniques and low-cost video sensors makes it possible to build an accurate system that helps drivers find a parking spot. The impact of video bitrate and quality reduction on performance was studied, and it was concluded that high-quality video is not necessary for the proposed algorithm to obtain accurate results. The results show that relatively inexpensive, low-bandwidth networks can be used to develop large-scale parking occupancy applications.
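As a toy illustration of the per-frame measurement such a system might make, the sketch below applies OpenCV background subtraction and flags a parking-spot region as occupied when most of its pixels differ from the learned background. The spot rectangles, thresholds, and function names are hypothetical; the thesis's actual system tracks vehicle motion over time rather than thresholding single frames.

```python
import cv2
import numpy as np

# Hypothetical (x, y, w, h) rectangles marking individual parking spots.
SPOTS = [(50, 100, 60, 120), (120, 100, 60, 120)]

def occupancy_per_frame(video_path, presence_thresh=0.3):
    """Return, per frame, how many spot ROIs are mostly 'non-background'
    according to a MOG2 background-subtraction model."""
    cap = cv2.VideoCapture(video_path)
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    counts = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        fg = subtractor.apply(frame)  # foreground mask, 0 = background
        occupied = 0
        for (x, y, w, h) in SPOTS:
            roi = fg[y:y + h, x:x + w]
            if np.mean(roi > 0) > presence_thresh:
                occupied += 1
        counts.append(occupied)
    cap.release()
    return counts
```

Because the foreground mask compresses well and tolerates noisy input, this style of measurement is consistent with the finding that low-bitrate video suffices.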
Model
Digital Document
Publisher
Florida Atlantic University
Description
In order to facilitate the development, discussion, and advancement of the relatively new subfield of Artificial Intelligence focused on generating narrative content, the author has developed a pattern language for generating narratives, along with a new categorization framework for narrative generation systems. Emphasis is placed on generating the Fabula of the story (the ordered sequence of events that make up the plot). Approaches to narrative generation are classified into one of three categories, and a pattern is presented for each approach. Enhancement patterns that can be used in conjunction with one of the core patterns are also identified. In total, nine patterns are identified: three core narratology patterns, four Fabula patterns, and two extension patterns. These patterns are intended to aid software architects designing a new generation of narrative generation systems.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Contemporary computer vision solutions to the problem of object detection aim at incorporating contextual information into the process. This thesis proposes a systematic evaluation of the usefulness of incorporating knowledge about the geometric context of a scene into a baseline object detection algorithm based on local features. This research extends publicly available MATLAB® implementations of leading algorithms in the field and integrates them in a coherent and extensible way. Experiments are presented to compare the performance and accuracy between baseline and context-based detectors, using images from the recently published SUN09 dataset. Experimental results demonstrate that adding contextual information about the geometry of the scene improves the detector performance over the baseline case in 50% of the tested cases.
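One simple way geometric context can modulate a local-feature detector is by rescoring its candidate boxes against a scene-geometry prior. The sketch below down-weights detections whose base lies above an estimated horizon line, where ground-plane objects cannot appear; the rescoring formula, decay constant, and data layout are assumptions for illustration, not the pipeline evaluated in the thesis.

```python
import numpy as np

def rescore_with_geometry(detections, horizon_y, weight=0.5):
    """Blend each detection's appearance score with a crude geometric
    plausibility term. `detections` is a list of (x, y, w, h, score)
    tuples in image coordinates (y grows downward)."""
    rescored = []
    for (x, y, w, h, score) in detections:
        bottom = y + h
        # plausibility: 1 below the horizon, decaying smoothly above it
        plaus = 1.0 if bottom >= horizon_y else np.exp((bottom - horizon_y) / 50.0)
        new_score = (1 - weight) * score + weight * score * plaus
        rescored.append((x, y, w, h, new_score))
    return sorted(rescored, key=lambda d: d[4], reverse=True)
```

Comparing average precision before and after such rescoring, per object category, is the kind of baseline-versus-context experiment the abstract describes.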
Model
Digital Document
Publisher
Florida Atlantic University
Description
Digital video is being used widely in a variety of applications such as entertainment, surveillance, and security. The large volume of video in surveillance and security requires systems capable of processing it to automatically detect and recognize events, alleviating the load on human operators and enabling preventive action when events are detected. The main objective of this work is the analysis of computer vision techniques and algorithms used to perform automatic detection of events in video sequences. This thesis presents a surveillance system based on optical flow and background subtraction concepts that detects events through motion analysis, using an event-probability-zone definition. Advantages, limitations, capabilities, and possible alternative solutions are also discussed. The result is a system capable of detecting objects moving in a direction opposing a predefined condition, or running in the scene, with precision greater than 50% and recall greater than 80%.
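A minimal sketch of the motion-analysis step, assuming OpenCV: dense optical flow is computed inside a predefined event zone, and an event is raised when a sizeable fraction of the moving pixels oppose the allowed direction. The zone geometry, thresholds, and function names are illustrative assumptions, not the thesis's calibrated system.

```python
import cv2
import numpy as np

def wrong_way_events(video_path, zone, allowed_dir=(1.0, 0.0),
                     mag_thresh=2.0, frac_thresh=0.2):
    """Flag frames where motion inside `zone` (x, y, w, h) largely
    opposes `allowed_dir`, using Farneback dense optical flow."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        return []
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    x, y, w, h = zone
    events, frame_idx = [], 1
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        fx = flow[y:y + h, x:x + w, 0]
        fy = flow[y:y + h, x:x + w, 1]
        moving = np.hypot(fx, fy) > mag_thresh
        # negative dot product with the allowed direction = opposing motion
        opposing = (fx * allowed_dir[0] + fy * allowed_dir[1]) < 0
        if moving.any() and np.mean(moving & opposing) > frac_thresh:
            events.append(frame_idx)
        prev_gray = gray
        frame_idx += 1
    cap.release()
    return events
```

Raising `mag_thresh` trades recall for precision, which is the same trade-off reflected in the reported precision and recall figures.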