Image processing--Digital techniques

Model
Digital Document
Publisher
Florida Atlantic University
Description
The fundamental goal of a machine vision system in the inspection of an assembled printed circuit board is to locate the integrated circuit (IC) components. These components are then checked for their position and orientation with respect to a given model position and orientation in order to detect deviations. To this end, a method based on a modified two-level correlation scheme is presented in this thesis. In the first level, Low-Level correlation, a modified two-stage template matching method is proposed. It makes use of random search techniques, better known as the Monte Carlo method, to speed up the matching process on binarized versions of the images. Because of the random search, there is uncertainty about the locations where matches are found. In the second level, High-Level correlation, an evidence scheme based on the Dempster-Shafer formalism is presented to resolve this uncertainty. Experimental results obtained on a printed circuit board containing mounted integrated components are also presented to demonstrate the validity of the techniques.
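As an illustration of the first-level step only, the following is a minimal sketch, assuming binarized images stored as NumPy arrays, of Monte Carlo template matching by random position sampling; it is not the thesis implementation, and the residual position uncertainty it leaves is exactly what the second, evidence-based level is meant to resolve.

import numpy as np

def monte_carlo_match(image, template, n_samples=2000, rng=None):
    """Randomly sample candidate offsets and keep the best binary match."""
    rng = np.random.default_rng() if rng is None else rng
    H, W = image.shape
    h, w = template.shape
    best_score, best_pos = -1.0, None
    rows = rng.integers(0, H - h + 1, size=n_samples)
    cols = rng.integers(0, W - w + 1, size=n_samples)
    for r, c in zip(rows, cols):
        window = image[r:r + h, c:c + w]
        score = np.mean(window == template)   # fraction of agreeing pixels
        if score > best_score:
            best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Example: locate a small binary template in a synthetic binary image.
# With random sampling the best match lands near (40, 70) but is not
# guaranteed to be exact -- the uncertainty the high-level stage handles.
img = np.zeros((128, 128), dtype=np.uint8)
tmpl = np.ones((8, 8), dtype=np.uint8)
img[40:48, 70:78] = 1
print(monte_carlo_match(img, tmpl))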
Model
Digital Document
Publisher
Florida Atlantic University
Description
Surveys are made of both character recognition and image processing. The need to apply image processing techniques to character recognition is pointed out. The fields are then combined and tested in sample programs. Simulations are made of recognition systems with and without image preprocessing. The processing techniques applied utilize Walsh-Hadamard transforms and local window operators. Results indicate that image preprocessing improves recognition rates when noise degrades input images. A system architecture is proposed for a hardware-based, video-speed image processor operating on local image windows. A possible implementation of this processor is outlined.
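For reference, this is a minimal sketch (assumed, not the thesis code) of one of the preprocessing operations named above: a separable 2-D Walsh-Hadamard transform of an 8x8 image block, built from the Sylvester construction of the Hadamard matrix.

import numpy as np

def hadamard(n):
    """Natural-ordered Hadamard matrix of size n (n must be a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def wht2d(block):
    """Separable 2-D Walsh-Hadamard transform (unnormalized)."""
    n = block.shape[0]
    H = hadamard(n)
    return H @ block @ H.T

block = np.random.rand(8, 8)
coeffs = wht2d(block)
# The transform is its own inverse up to a 1/n^2 scale factor:
recovered = wht2d(coeffs) / 64.0
assert np.allclose(recovered, block)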
Model
Digital Document
Publisher
Florida Atlantic University
Description
The objective of this research is a method to create a 3D-face image from 2D-face images. The 3D-face image is constructed using a set of 3D-face images of other persons available in a face database, and it is represented in parameterized form in terms of depth and texture. This concept can be used to facilitate creating a 3D-face image from a 2D database. For this purpose, a 3D-face database is first developed. When a 2D-face image is presented to the system, a 3D-face image that starts as an average 3D-face image (derived from the 3D-face database) is projected onto the 2D-image plane, with the necessary rotation, translation, scaling, and interpolation. The projected image is then compared with the input image, and an optimization algorithm is applied to minimize an error index by selecting the 3D-depth and texture parameters, from which the projected image is re-derived. Once the algorithm converges, the resulting 3D-depth and texture parameters can be employed to construct a 3D-face image of the subject photographed in the 2D-images. A merit of this method is that only the depth and texture parameters of the compared images need to be stored in the database. Such data can be used either for the recreation of a 3D-image of the test subject or for biometric authentication based on 3D face recognition. Results from an experimental study presented in the thesis illustrate the effectiveness of the proposed approach, which has applications in biometric authentication and 3D computer graphics.
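The following is a hedged sketch of the fitting loop the abstract outlines; all names, shapes, and the stand-in projection function are assumptions rather than the thesis code. A face is modeled as an average plus a weighted sum of basis vectors drawn from the database, and the weights (standing in for the depth and texture parameters) are optimized to minimize an error index against the input 2-D image.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_pixels, n_basis = 64 * 64, 10
mean_face = rng.random(n_pixels)            # average face (stand-in data)
basis = rng.standard_normal((n_basis, n_pixels)) * 0.05
target_2d = mean_face + 0.3 * basis[2]      # synthetic "input photograph"

def project(params):
    """Stand-in for the rotate/translate/scale/interpolate projection step."""
    return mean_face + params @ basis

def error_index(params):
    return np.sum((project(params) - target_2d) ** 2)

result = minimize(error_index, x0=np.zeros(n_basis), method="L-BFGS-B")
print("recovered parameters:", np.round(result.x, 2))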
Model
Digital Document
Publisher
Florida Atlantic University
Description
A Content-Based Image Retrieval (CBIR) system is a mechanism intended to retrieve a particular image from a large image repository without resorting to any additional information about the image. Query-by-example (QBE) is a technique used by CBIR systems in which an image is retrieved from the database based on an example given by the user. The effectiveness of a CBIR system can be measured by two main indicators: how close the retrieved results are to the desired image and how quickly those results are obtained. In this thesis, we implement some classical image processing operations in order to improve the average rank of the desired image, and we also implement two object recognition techniques to improve the subjective quality of the best-ranked images. Results of experiments show that the proposed system outperforms an equivalent CBIR system in QBE mode in terms of both precision and recall.
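A minimal sketch of the QBE idea follows (assumed, not the thesis code): images are reduced to color-histogram feature vectors and ranked by distance to the query, and the rank of the desired image is the kind of effectiveness measure referred to above.

import numpy as np

def color_histogram(image, bins=8):
    """Per-channel histogram feature vector for an RGB image array."""
    feats = [np.histogram(image[..., c], bins=bins, range=(0, 256),
                          density=True)[0] for c in range(3)]
    return np.concatenate(feats)

def rank_database(query, database):
    """Return database indices sorted from most to least similar."""
    q = color_histogram(query)
    dists = [np.linalg.norm(q - color_histogram(img)) for img in database]
    return np.argsort(dists)

rng = np.random.default_rng(1)
db = [rng.integers(0, 256, (32, 32, 3)) for _ in range(20)]
query = db[7] + rng.integers(-5, 6, (32, 32, 3))   # noisy copy of image 7
print(rank_database(query, db)[:5])                # image 7 is expected to rank first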
Model
Digital Document
Publisher
Florida Atlantic University
Description
The project manager has much to deliberate when choosing a software package for image rectification/registration. He or she must be able to perform a cost analysis of the packages in question and determine which package will provide the highest level of positional accuracy. Objective and subjective analysis of six software packages, ArcView Image Analysis, GeoMedia Pro, Arc/Info 8.1, ERMAPPER, ENVI and Idrisi 3.2, and their multiple products (polynomials and triangulations) provides the basis with which the project manager may attain this goal. He or she is familiarized with the user interface of each package through a detailed step-by-step methodology. The positional accuracy of each product is compared to Ground Control Points (GCPs) derived from a Differential Global Positioning System (DGPS). The accuracy of each product is also compared to the industry-standard USGS DOQQ, and it is found that while simple rectification procedures may produce mean errors acceptable to the specifications of NMAS, the strictest application of these standards reveals that these products are not accurate enough to satisfy the USGS standards.
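A hedged sketch of the accuracy check described above (not from the thesis): horizontal root-mean-square error of rectified coordinates against DGPS ground control points, compared with an accuracy threshold. The coordinates and the threshold value here are illustrative only and do not reproduce the NMAS or USGS figures.

import numpy as np

def rmse(predicted, ground_truth):
    """Horizontal RMSE between (N, 2) arrays of easting/northing pairs."""
    diff = np.asarray(predicted) - np.asarray(ground_truth)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

gcps = np.array([[500100.0, 2800200.0], [500350.0, 2800410.0],
                 [500620.0, 2800105.0]])
rectified = gcps + np.array([[1.2, -0.8], [0.5, 1.1], [-0.9, 0.4]])

error = rmse(rectified, gcps)
threshold_m = 10.0     # illustrative accuracy limit, not a published standard
print(f"RMSE = {error:.2f} m -> {'meets' if error <= threshold_m else 'fails'} threshold")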
Model
Digital Document
Publisher
Florida Atlantic University
Description
XYZ Video Compression denotes a video compression algorithm that operates in three dimensions, without the overhead of motion estimation. The smaller overhead of this algorithm compared to MPEG and other "standards-based" compression algorithms that use motion estimation suggests its suitability for real-time applications. The demonstrated results of compressing standard motion video benchmarks suggest that XYZ Video Compression is not only a faster algorithm but also achieves superior compression ratios. The algorithm is based upon the three-dimensional Discrete Cosine Transform (DCT). Pixels are organized as 8 x 8 x 8 cubes by taking 8 x 8 squares out of 8 consecutive frames. A fast three-dimensional transform is applied to each cube, generating 512 DCT coefficients. The energy-packing property of the DCT concentrates the energy in the cube into few coefficients. The DCT coefficients are quantized to maximize the energy concentration at the expense of introducing a user-determined level of error. A method of adaptive quantization that generates optimal quantizers based upon statistics gathered for the 8 consecutive frames is described. The sensitivity of the human eye to various DCT coefficients is used to modify the quantizers, creating a "visually equivalent" cube with still greater energy concentration. Experiments are described that justify the choice of Human Visual System factors to be folded into the quantization step. The quantized coefficients are then encoded into a data stream using a method of entropy coding based upon the statistics of the quantized coefficients. The bitstream generated by entropy coding represents the compressed data of the 8 motion video frames and typically achieves 50:1 compression at 5% error. The decoding process is the reverse of the encoding process: the bitstream is decoded to generate blocks of quantized DCT coefficients, the DCT coefficients are dequantized, and the Inverse Discrete Cosine Transform is performed on the cube to recover pixel data suitable for display. The elegance of this technique lies in its simplicity, which lends itself to inexpensive implementation of both encoder and decoder. Finally, real-time implementation of the XYZ Compressor/Decompressor is discussed. Experiments are run to determine the effectiveness of the implementation.
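A minimal sketch of the core transform step follows (assumed, not the dissertation code): a separable 3-D DCT applied to an 8x8x8 cube of pixels taken from 8 consecutive frames, followed by a uniform quantization step. The single step size used here is illustrative and stands in for the adaptive, visually weighted quantizers described above.

import numpy as np
from scipy.fft import dctn, idctn

frames = np.random.rand(8, 64, 64) * 255            # 8 synthetic frames
cube = frames[:, :8, :8]                            # one 8x8x8 cube

coeffs = dctn(cube, norm="ortho")                   # 512 DCT coefficients
step = 16.0
quantized = np.round(coeffs / step)                 # lossy; keeps energy packing
reconstructed = idctn(quantized * step, norm="ortho")

print("max reconstruction error:", np.abs(reconstructed - cube).max())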
Model
Digital Document
Publisher
Florida Atlantic University
Description
The main objective of the research is to develop computationally efficient hybrid coding schemes for low bit-rate coding of image frames and image sequences. Basic fractal block coding can compress a relatively low-resolution image efficiently without blocky artifacts, but it does not converge well at high-frequency edges. This research proposes a hybrid multi-resolution scheme which combines the advantages of fractal and DCT coding. Fractal coding is applied to obtain a lower-resolution, quarter-size output image, and DCT is then used to encode the error residual between the original full-bandwidth image signal and the fractal-decoded image signal. At the decoder side, the full-resolution, full-size reproduced image is generated by adding the decoded error image to the decoded fractal image. In addition, the lower-resolution, quarter-size output image is given automatically by the iterated function scheme without extra effort. Other advantages of the scheme are that the high-resolution layer is generated from the error image, which covers both the bandwidth loss and the coding error of the lower-resolution layer, and that no sophisticated classification procedure is needed. A series of computer simulation experiments is conducted and the results are presented to illustrate the merit of the scheme. The hybrid fractal coding method is then extended to process motion sequences as well. A new scheme is proposed for motion vector detection and motion compensation, judiciously combining the techniques of fractal compression and block matching. The advantage of this scheme is that it improves the performance of the motion compensation while keeping the overall computational complexity low for each frame. The simulation results on realistic video conference image sequences support the superiority of the proposed method in terms of reproduced picture quality and compression ratio.
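To show only the structure of the hybrid pipeline, here is a hedged sketch in which the fractal layer is stood in for by a quarter-size block average followed by upsampling; it is not a fractal codec, and the DCT step size is illustrative. The point is where the DCT-coded residual fits and how the two layers are recombined at the decoder.

import numpy as np
from scipy.fft import dctn, idctn

def quarter_size_approximation(image):
    """Stand-in for the fractal layer: 2x2 block average, then upsample."""
    small = image.reshape(image.shape[0] // 2, 2,
                          image.shape[1] // 2, 2).mean(axis=(1, 3))
    return np.kron(small, np.ones((2, 2))), small

image = np.random.rand(64, 64) * 255
approx, quarter = quarter_size_approximation(image)

residual = image - approx                              # error image
coeffs = np.round(dctn(residual, norm="ortho") / 8.0)  # coarse DCT coding
decoded_residual = idctn(coeffs * 8.0, norm="ortho")

full_res = approx + decoded_residual                   # full-size reconstruction
print("quarter-size layer:", quarter.shape,
      "| max error:", np.abs(full_res - image).max())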
Model
Digital Document
Publisher
Florida Atlantic University
Description
The objective of this dissertation is to develop effective algorithms for texture characterization, segmentation and labeling that operate selectively to label image textures, using the Gabor representation of signals. These representations are an analog of the spatial-frequency tuning characteristics of visual cortex cells. Of all spatial/spectral signal representations, the Gabor function provides optimal joint resolution between the two domains. A discussion of spatial/spectral representations focuses on the Gabor function and the biological analogy between it and the simple cells of the striate cortex. A simulation generates examples of the use of the Gabor filter as a line detector with synthetic data. Simulations are then presented using Gabor filters for real texture characterization. The Gabor filter's spatial and spectral attributes are selectively chosen, based on information from a scale-space image, in order to maximize the resolution of the characterization process. A variation of probabilistic relaxation that exploits the Gabor filter's spatial and spectral attributes is devised and used to force a consensus of the filter responses for texture characterization. We then perform segmentation of the image using the concept of isolation of low-energy states within an image. This iterative smoothing algorithm, operating as a Gabor filter post-processing stage, depends on a line-process discontinuity threshold. The discontinuity threshold is selected from the modes of the histogram of the relaxed Gabor filter responses, using probabilistic relaxation to detect the significant modes. We test our algorithm on simple synthetic and real textures, then use a more complex natural texture image to test the entire algorithm. Limitations on textural resolution are noted, as well as limitations on the resolution of the image segmentation process.
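A minimal sketch of the basic building block follows (assumed, not the dissertation code): a single real Gabor filter, a Gaussian envelope modulated by a cosine at a chosen orientation and spatial wavelength, applied to a synthetic image containing a vertical bar, in the spirit of the line-detector simulation mentioned above.

import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size=21, sigma=4.0, theta=0.0, wavelength=8.0):
    """Real Gabor kernel: Gaussian envelope times an oriented cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

image = np.zeros((64, 64))
image[:, 30:34] = 1.0                     # a vertical bar (line-detector test)
response = fftconvolve(image, gabor_kernel(theta=0.0), mode="same")
print("peak response column:", np.argmax(np.abs(response).mean(axis=0)))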
Model
Digital Document
Publisher
Florida Atlantic University
Description
In our society, large volumes of documents are exchanged on a daily basis. Since documents can easily be scanned, modified and reproduced without any loss in quality, unauthorized use and modification of documents is of major concern. An authentication watermark embedded into a document as an invisible, fragile mark can be used to detect illegal document modification. However, the authentication watermark can only be used to determine whether documents have been tampered with, and additional protection may be needed to prevent unauthorized use and distribution of those documents. A solution to this problem is a two-level, multipurpose watermark. The first-level watermark is an authentication mark used to detect document tampering, while the second-level watermark is a robust mark, which identifies the legitimate owner and/or user of a specific document. This dissertation introduces a new adaptive two-level multipurpose watermarking scheme suitable for binary document images, such as scanned text, figures, engineering and road maps, architectural drawings, music scores, and handwritten text and sketches. This watermarking scheme uses uniform quantization and overlapped embedding to add two watermarks, one robust and the other fragile, into a binary document image. The two embedded watermarks serve different purposes. The robust watermark carries document owner or document user identification, and the fragile watermark confirms document authenticity and helps detect document tampering. Both watermarks can be extracted without accessing the original document image. The proposed watermarking scheme adaptively selects an image partitioning block size to optimize the embedding capacity, the image permutation key to minimize watermark detection error, and the size of the local neighborhood in which modification-candidate pixels are scored to minimize visible distortion of watermarked documents. Modification-candidate pixels are scored using a novel, objective metric called the Structural Neighborhood Distortion Measure (SNDM). Experimental results confirm that this watermarking scheme, which embeds watermarks by modifying image pixels based on their SNDM scores, creates less visible document distortion than watermarking schemes that base watermark embedding on any other published pixel-scoring method. Document tampering is detected successfully, and the robust watermark can be detected even after document tampering renders the fragile watermark undetectable.
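The sketch below illustrates only the general embedding idea, under stated assumptions: candidate pixels in a binary image block are scored by a simple local-neighborhood disturbance count (a crude stand-in, not the dissertation's SNDM), and the least disruptive pixel is flipped so that the block's parity carries one watermark bit.

import numpy as np

def disturbance_score(image, r, c):
    """Count 8-neighbors differing from pixel (r, c); low = less visible flip."""
    patch = image[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
    return int(np.sum(patch != image[r, c]))

def embed_bit(block, bit):
    """Force the block's pixel-sum parity to equal `bit` by flipping one pixel."""
    if int(block.sum()) % 2 == bit:
        return block
    scores = [(disturbance_score(block, r, c), r, c)
              for r in range(block.shape[0]) for c in range(block.shape[1])]
    _, r, c = min(scores)
    block = block.copy()
    block[r, c] ^= 1
    return block

block = np.random.default_rng(3).integers(0, 2, (8, 8), dtype=np.uint8)
marked = embed_bit(block, bit=1)
print("parity after embedding:", int(marked.sum()) % 2)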
Model
Digital Document
Publisher
Florida Atlantic University
Description
This dissertation presents the results of research that led to the development of a complete, fully functional image search and retrieval system with relevance feedback capabilities, called MUSE (MUltimedia SEarch and Retrieval Using Relevance Feedback). Two different models for searching for a target image using relevance feedback have been proposed, implemented, and tested. The first model uses a color-based feature vector and employs a Bayesian learning algorithm that updates the probability of each image in the database being the target based on the user's actions. The second model uses cluster analysis techniques, a combination of color-, texture-, and edge (shape)-based features, and a novel approach to learning the user's goals and the relevance of each feature for a particular search. Both models follow a purely content-based image retrieval paradigm. The search process is based exclusively on image contents automatically extracted during the (off-line) feature extraction stage. Moreover, both models minimize the number and complexity of required user actions, in contrast with the complexity of the underlying search and retrieval engine. Results of experiments show that both models exhibit good performance for moderate-size, unconstrained databases and that a combination of the two outperforms either of them individually, which is encouraging. In the process of developing this dissertation, we also implemented and tested several combinations of image features and similarity measures. The results of these tests, performed under the query-by-example (QBE) paradigm, served as a reference in choosing which features to use in relevance feedback mode and confirmed the difficulty of encoding an understanding of image similarity into a combination of features and distances without human assistance. Most of the code written during the development of this dissertation has been encapsulated into a multifunctional prototype that combines image searching (with or without an example), browsing, and viewing capabilities and serves as a framework for future research on the subject.
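In the spirit of the first model described above, and only as a hedged sketch rather than MUSE's actual code, the snippet below shows a Bayesian-style relevance feedback round: each database image carries a probability of being the target, and the probabilities are reweighted by how close each image's feature vector lies to the image the user marked as relevant.

import numpy as np

def update_probabilities(probs, features, selected_idx, sigma=0.5):
    """Bayesian-style update: likelihood falls off with feature distance."""
    dists = np.linalg.norm(features - features[selected_idx], axis=1)
    likelihood = np.exp(-(dists ** 2) / (2 * sigma ** 2))
    posterior = probs * likelihood
    return posterior / posterior.sum()

rng = np.random.default_rng(4)
features = rng.random((100, 16))                  # e.g. color-based feature vectors
probs = np.full(100, 1 / 100)                     # uniform prior over the database
probs = update_probabilities(probs, features, selected_idx=42)
print("top candidates:", np.argsort(probs)[::-1][:5])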