Image processing

Model
Digital Document
Publisher
Florida Atlantic University
Description
The retrieval of digital images is hindered by the semantic gap: the disparity between a user's high-level interpretation of an image and the information that can be extracted from the image's physical properties. Content-based image retrieval systems are particularly vulnerable to the semantic gap because they rely on low-level visual features to describe image content. The gap can be narrowed by including high-level, user-generated information. High-level descriptions are better able to capture the semantic meaning of image content, but collecting them is not always practical. Thus, both content-based and human-generated information are considered in this work. A content-based method of retrieving images using a computational model of visual attention was proposed, implemented, and evaluated. This work is based on a study of contemporary research in vision science, particularly computational models of bottom-up visual attention. The use of computational models of visual attention to detect salient-by-design regions of interest in images is investigated. The method is then refined to detect objects of interest in broad image databases whose contents are not necessarily salient by design. An interface for image retrieval, organization, and annotation that is compatible with the attention-based retrieval method has also been implemented. It can simultaneously execute querying by image content, by keyword, and by collaborative filtering. The user is central to the design and evaluation of the system. A game was developed to evaluate the entire system, which comprises the user, the user interface, and the retrieval methods.
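A bottom-up attention model of the kind referenced above is typically built from center-surround contrasts: a location is salient when its local feature response differs from its surroundings. The NumPy sketch below illustrates the general idea only, not the system's actual implementation; the intensity-only feature and the window sizes are assumptions. It computes a crude saliency map as the absolute difference between a fine and a coarse local mean:

```python
import numpy as np

def box_blur(img, k):
    """Mean filter with a (2k+1) x (2k+1) window, edge-padded."""
    p = np.pad(img.astype(float), k, mode="edge")
    acc = np.zeros(img.shape, dtype=float)
    for dy in range(2 * k + 1):
        for dx in range(2 * k + 1):
            acc += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return acc / (2 * k + 1) ** 2

def intensity_saliency(img, center=1, surround=4):
    """Center-surround contrast: |fine local mean - coarse local mean|,
    normalized to [0, 1]. Regions that stand out from their neighborhood
    score high; uniform regions score near zero."""
    s = np.abs(box_blur(img, center) - box_blur(img, surround))
    rng = s.max() - s.min()
    return (s - s.min()) / rng if rng > 0 else s
```

A full Itti-Koch-style model would repeat this across color, intensity, and orientation channels at several scales and fuse the resulting conspicuity maps; this sketch keeps a single intensity channel only.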
Model
Digital Document
Publisher
Florida Atlantic University
Description
Perceptual video coding has been a promising area in recent years. Increases in compression ratios have been reported by applying foveated video coding techniques, in which the region of interest (ROI) is selected using a computational attention model. However, most approaches to perceptual video coding use only visual features and ignore the auditory component. Recent physiological studies have demonstrated that auditory stimuli affect our visual perception. In this work, we validate some of those physiological tests using complex video sequences. We designed and developed a web-based tool for video quality measurement. After conducting several experiments, we observed that, in general, the reaction time to detect video artifacts was higher when the video was presented with audio. We also observed that emotional information in the audio guides human attention to particular ROIs, and that sound frequency changes the perception of spatial frequency in still images.
Model
Digital Document
Publisher
Florida Atlantic University
Description
To facilitate the development, discussion, and advancement of the relatively new subfield of Artificial Intelligence focused on generating narrative content, the author has developed a pattern language for generating narratives, along with a new categorization framework for narrative generation systems. Emphasis is placed on generating the Fabula of the story (the ordered sequence of events that make up the plot). Approaches to narrative generation are classified into one of three categories, and a pattern is presented for each approach. Enhancement patterns that can be used in conjunction with one of the core patterns are also identified. In total, nine patterns are identified: three core narratology patterns, four Fabula patterns, and two extension patterns. These patterns will be useful to software architects designing a new generation of narrative generation systems.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Fine-scale urban land cover information is important for a number of applications, including urban tree canopy mapping, green space analysis, and urban hydrologic modeling. Land cover information has traditionally been extracted from satellite or aerial images using automated image classification techniques, which classify pixels into different categories of land cover based on their spectral characteristics. However, in fine spatial resolution images (4 meters or better), the high degree of within-class spectral variability and between-class spectral similarity of many types of land cover leads to low classification accuracy when pixel-based, purely spectral classification techniques are used. Object-based classification methods, which involve segmenting an image into relatively homogeneous regions (i.e. image segments) prior to classification, have been shown to increase classification accuracy by incorporating the spectral (e.g. mean, standard deviation) and non-spectral (e.g. texture, size, shape) information of image segments for classification. One difficulty with the object-based method, however, is that a segmentation parameter (or set of parameters), which determines the average size of segments (i.e. the segmentation scale), is difficult to choose. Some studies use one segmentation scale to segment and classify all types of land cover, while others use multiple scales because different types of land cover typically vary in size. In this dissertation, two multi-scale object-based classification methods were developed and tested for classifying high resolution images of Deerfield Beach, FL and Houston, TX. These multi-scale methods achieved higher overall classification accuracies and Kappa coefficients than single-scale object-based classification methods.
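The core of the object-based step described above, computing spectral and non-spectral features per image segment and then classifying segments rather than pixels, can be sketched as follows. This is a minimal illustration and not the dissertation's method: the class names, thresholds, and the use of segment size as the only non-spectral feature are all assumptions.

```python
import numpy as np

def segment_features(img, labels):
    """Per-segment spectral features (mean, std) plus one non-spectral
    feature (size in pixels). `labels` assigns a segment id to each pixel."""
    feats = {}
    for seg in np.unique(labels):
        px = img[labels == seg].astype(float)
        feats[int(seg)] = {"mean": px.mean(), "std": px.std(), "size": px.size}
    return feats

def classify_segments(feats, bright=0.5, min_size=20):
    """Toy rule set over segment features (hypothetical classes)."""
    out = {}
    for seg, f in feats.items():
        if f["mean"] > bright:
            out[seg] = "impervious" if f["size"] >= min_size else "building"
        else:
            out[seg] = "vegetation"
    return out
```

A multi-scale variant would run the segmentation at several scales and assign each land cover type to the scale that matches its typical object size, classifying large classes from coarse segments and small classes from fine ones.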
Model
Digital Document
Publisher
Florida Atlantic University
Description
The efforts addressed in this thesis refer to assaying the extent of local features in 2D images for the purpose of recognition and classification. The approach compares a test image against a template in binary format. It is a bioinformatics-inspired approach, pursued and presented as the deliverables of this thesis summarized below:
1. By applying the so-called 'Smith-Waterman (SW) local alignment' and 'Needleman-Wunsch (NW) global alignment' approaches of bioinformatics, a test 2D image in binary format is compared against a reference image so as to recognize the differential features that reside locally in the images being compared.
2. The SW- and NW-based binary comparison involves converting the one-dimensional sequence alignment procedure (traditionally used for molecular sequence comparison in bioinformatics) to operate on 2D image matrices.
3. The relevant computational algorithms are implemented as MATLAB™ code.
4. The test images considered are real-world bio-/medical images, synthetic images, microarrays, biometric fingerprints (thumb impressions), and handwritten signatures.
Based on the results, conclusions are enumerated and inferences are made, with directions for future studies.
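The Needleman-Wunsch step in item 1 above is a standard dynamic-programming alignment. The sketch below is a textbook formulation in Python rather than the thesis's MATLAB code; the row-by-row 2D extension and the scoring values are assumptions. It scores a global alignment between two binary pixel rows and sums row scores to compare two binary images:

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of sequences a and b (Needleman-Wunsch)."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * gap
    for j in range(1, m + 1):
        D[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            D[i][j] = max(D[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          D[i - 1][j] + gap,     # gap in b
                          D[i][j - 1] + gap)     # gap in a
    return D[n][m]

def image_alignment_score(img_a, img_b):
    """Row-wise extension to 2D binary images of equal height."""
    return sum(needleman_wunsch(ra, rb) for ra, rb in zip(img_a, img_b))
```

Smith-Waterman differs only in clamping each cell at zero and returning the maximum cell, which localizes the best-matching sub-region instead of forcing an end-to-end alignment.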
Model
Digital Document
Publisher
Florida Atlantic University
Description
Video identification, or copy detection, is a challenging problem that is becoming increasingly important with the popularity of online video services. The problem addressed in this thesis is the identification of a given video clip within a given set of videos: for a query video, the system returns all instances of that video in the data set. The identification system uses video signatures based on video tomography. A robust, low-complexity video signature is designed and implemented. The nature of the signature makes it invariant to the most common video transformations. Signatures are generated for video shots rather than individual frames, resulting in a compact signature of 64 bytes per shot. Signatures are matched using a simple Euclidean distance metric. The results show that videos can be identified with 100% recall and over 93% precision. The experiments included several transformations of the videos.
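Matching such signatures reduces to nearest-neighbor search under Euclidean distance. Below is a minimal NumPy sketch of that matching step; the distance threshold is an assumed parameter, not a value from the thesis.

```python
import numpy as np

def match_signatures(query, database, threshold):
    """Return indices of database signatures within `threshold` Euclidean
    distance of `query`. Each signature is a fixed-length feature vector
    (64 bytes per shot in the scheme described above)."""
    db = np.asarray(database, dtype=float)
    q = np.asarray(query, dtype=float)
    dists = np.linalg.norm(db - q, axis=1)   # one distance per shot
    return np.flatnonzero(dists <= threshold)
```

Because signatures are short and the metric is a plain L2 norm, comparing a query against the whole database is a single vectorized operation, and recall versus precision trades off through the threshold.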
Model
Digital Document
Publisher
Florida Atlantic University
Description
Video signature techniques based on tomography images address the problem of video identification. The method relies on temporal segmentation and sampling strategies to build and determine the unique elements that form the signature. In this thesis, an extension of these methods is presented. First, a new feature extraction method, derived from the previously proposed sampling pattern, is implemented and tested, resulting in a highly distinctive set of signature elements. Second, a robust temporal video segmentation system replaces the original method, determining shot changes more accurately. Under an exhaustive set of tests, the system achieved 99.58% recall, 100% precision, and 99.35% prediction precision.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Professional imaging systems, particularly motion picture cameras, usually employ larger photosites and lower pixel counts than many amateur cameras. This results in the desirable characteristics of improved dynamic range, signal-to-noise ratio, and sensitivity. However, high-performance optics often have frequency response characteristics that exceed the Nyquist limit of the sensor, which, if not properly addressed, results in aliasing artifacts in the captured image. Most contemporary still and video cameras employ various optically birefringent materials as optical low-pass filters (OLPFs) to minimize aliasing artifacts in the image. Most OLPFs are designed as optical elements with a fixed frequency response that does not change even if the frequency responses of the other elements of the capturing system are altered. An extended evaluation of currently used birefringent-based OLPFs is provided. In this work, the author proposed and demonstrated the use of a parallel optical window positioned between a lens and a sensor as an OLPF. Controlled rotations of the optical window about the X and Y axes during the image exposure manipulate the system's point-spread function (PSF). Consequently, changing the PSF affects some portions of the frequency components contained in the image formed on the sensor. The system's frequency response is evaluated when various window functions, such as rectangular, triangular, Tukey, Gaussian, and Blackman-Harris, are used to shape the lens's PSF. In addition to the ability to change the PSF, this work demonstrated that the PSF can be manipulated dynamically, which allowed the PSF to be modified to counteract any alteration of the other optical elements of the capturing system. Several instances are presented in the dissertation in which it is desirable to change the characteristics of an OLPF in a controlled way.
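The benefit of tapered window functions such as the Tukey window mentioned above is visible directly in the frequency domain: compared with a rectangular window, a taper trades a wider main lobe for much lower sidelobes, i.e. less energy leaking past the intended cutoff. The NumPy sketch below uses the standard signal-processing definitions and is not the optical implementation from the dissertation:

```python
import numpy as np

def tukey(n, alpha=0.5):
    """Tukey (tapered-cosine) window: flat center, cosine-tapered edges.
    alpha=0 gives a rectangular window; alpha=1 gives a Hann window."""
    if alpha <= 0:
        return np.ones(n)
    x = np.linspace(0.0, 1.0, n)
    w = np.ones(n)
    left = x < alpha / 2
    w[left] = 0.5 * (1 + np.cos(2 * np.pi / alpha * (x[left] - alpha / 2)))
    w[left[::-1]] = w[left][::-1]            # mirror the taper on the right
    return w

def peak_sidelobe_db(w, pad=4096):
    """Peak sidelobe level of the window's magnitude spectrum, in dB."""
    spec = np.abs(np.fft.rfft(w, pad))
    spec /= spec[0]
    i = 1
    while i < len(spec) - 1 and spec[i + 1] < spec[i]:
        i += 1                               # walk past the main lobe
    return 20 * np.log10(spec[i:].max())
```

For a rectangular window the peak sidelobe sits near -13 dB, while a Hann taper (Tukey with alpha=1) pushes it below -31 dB, which is why shaping the PSF with such windows suppresses residual aliasing energy.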
Model
Digital Document
Publisher
Florida Atlantic University
Description
Digital video is used widely in a variety of applications such as entertainment, surveillance, and security. The large amount of video in surveillance and security applications requires systems capable of processing video to automatically detect and recognize events, alleviating the load on human operators and enabling preventive action when events are detected. The main objective of this work is the analysis of the computer vision techniques and algorithms used to perform automatic detection of events in video sequences. This thesis presents a surveillance system based on optical flow and background subtraction to detect events through motion analysis, using an event probability zone definition. Advantages, limitations, capabilities, and possible alternative solutions are also discussed. The result is a system capable of detecting objects moving in a direction opposing a predefined condition, or running in the scene, with precision greater than 50% and recall greater than 80%.
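The background-subtraction component of such a system can be sketched with a running-average background model: each pixel's background estimate is updated slowly, and pixels that deviate from it are flagged as moving. This is a minimal NumPy illustration; the learning rate and threshold are assumed values, and the thesis's optical-flow direction analysis and event probability zones are not reproduced here.

```python
import numpy as np

def foreground_masks(frames, alpha=0.05, thresh=0.2):
    """Running-average background subtraction.

    frames: sequence of 2D grayscale arrays with values in [0, 1].
    Returns one boolean foreground mask per frame after the first;
    True marks pixels that deviate from the background model."""
    bg = frames[0].astype(float)
    masks = []
    for f in frames[1:]:
        f = f.astype(float)
        masks.append(np.abs(f - bg) > thresh)   # detect before updating
        bg = (1 - alpha) * bg + alpha * f       # slow background update
    return masks
```

An event detector would then estimate optical flow inside the foreground regions to obtain motion direction and speed, raising an alarm when the motion contradicts the predefined condition (e.g. movement opposing the allowed direction) inside an event probability zone.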