Image transmission

Model
Digital Document
Publisher
Florida Atlantic University
Description
In this thesis we describe a local-neighborhood-pixel-based adaptive algorithm to track image features, both spatially and temporally, over a sequence of monocular images. The algorithm assumes no a priori knowledge about the image features to be tracked, or about the relative motion between the camera and the 3-D objects. The features to be tracked are selected by the algorithm itself and correspond to the peaks of a '2-D intensity correlation surface' constructed from a local neighborhood in the first image of the sequence to be analyzed. Any kind of motion, i.e., all six DOF (translation and rotation), can be tolerated, subject to the algorithm's pixels-per-frame motion limits. No subpixel computations are necessary. Exploiting constraints of temporal continuity, the algorithm uses simple and efficient predictive tracking over multiple frames. Trajectories of features on multiple objects can also be computed. The algorithm tolerates a slow, continuous change in the D.C. brightness level of the feature's pixels. Another important aspect of the algorithm is an adaptive feature-matching threshold that accounts for changes in the relative brightness of neighboring pixels. To demonstrate the feature-tracking algorithm and test its accuracy, we show how it has been used to extract the Focus of Expansion (FOE) and compute the time-to-contact using real image sequences of unstructured, unknown environments. In both applications, information from multiple frames is used.
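The core matching step described above, building a 2-D similarity surface over a local neighborhood and taking its peak as the new feature position, can be sketched as follows. This is a minimal illustration using a negative-SSD surface and whole-pixel search, not the thesis's exact adaptive algorithm; all function names and parameters here are hypothetical.

```python
import numpy as np

def correlation_surface(frame, template, top, left, search):
    """Slide `template` over a (2*search+1)^2 window of candidate
    positions around (top, left) and record a similarity score at each."""
    th, tw = template.shape
    surface = np.full((2 * search + 1, 2 * search + 1), -np.inf)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + th > frame.shape[0] or x + tw > frame.shape[1]:
                continue  # candidate window falls outside the image
            patch = frame[y:y + th, x:x + tw].astype(float)
            # Negative sum of squared differences: the peak is the best match.
            surface[dy + search, dx + search] = -np.sum((patch - template) ** 2)
    return surface

def track(frame, template, top, left, search=4):
    """Return the whole-pixel position of the surface peak (no subpixel step)."""
    surf = correlation_surface(frame, template.astype(float), top, left, search)
    dy, dx = np.unravel_index(np.argmax(surf), surf.shape)
    return top + dy - search, left + dx - search
```

A feature with a sharply peaked surface is a good candidate for tracking; a flat or ridge-like surface signals an ambiguous feature that a selection step would reject.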
Model
Digital Document
Publisher
Florida Atlantic University
Description
Recent advances in integrated circuit technology have resulted in the manufacture of analog-to-digital converters with a logarithmic transfer characteristic that can operate at video rates. This thesis shows that applying this nonlinearity to video signals during digitization yields a system that more closely matches the human visual system. The appropriate background is presented, including image reproduction, psychophysical aspects of perception, quantization error, and contrast sensitivity. This background is then related to logarithmic processing, with respect to minimizing contouring and improving the fidelity of detail in saturated colors. Finally, logarithmic processing was simulated using an image processor and prototype logarithmic analog-to-digital converters, and the results of this demonstration are presented.
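The benefit of a logarithmic transfer characteristic can be illustrated numerically: uniform code steps in log(signal) give a constant *relative* step size, which roughly matches the eye's contrast sensitivity (Weber's law). The sketch below compares linear and logarithmic quantization of a signal normalized to [0, 1]; the function names and the 40 dB dynamic-range assumption are illustrative, not the thesis's actual converter characteristic.

```python
import numpy as np

def linear_quantize(signal, bits):
    """Uniform quantization of a signal normalized to [0, 1]."""
    levels = 2 ** bits
    return np.round(signal * (levels - 1)) / (levels - 1)

def log_quantize(signal, bits, dynamic_range_db=40.0):
    """Quantize uniformly in log(signal): code values are equally spaced
    in the log domain, so the relative step size is constant."""
    levels = 2 ** bits
    floor = 10 ** (-dynamic_range_db / 20)  # smallest representable level
    s = np.clip(signal, floor, 1.0)
    code = np.round(np.log(s / floor) / np.log(1 / floor) * (levels - 1))
    return floor * (1 / floor) ** (code / (levels - 1))
```

At the same bit depth, the logarithmic characteristic spends its codes where the eye is most sensitive to relative change, which is why it reduces visible contouring in dark regions.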
Model
Digital Document
Publisher
Florida Atlantic University
Description
The Interlaced Pixel Delta (IPD) Video Codec is a real-time video compression and decompression engine. It is specifically designed for video-phone or video-conferencing applications that run under very low-bandwidth network conditions. The example network used throughout this dissertation is the Internet, where users are typically connected at transmission speeds of 33.3 K bits per second or less. To accomplish this goal, the IPD codec must achieve very high compression ratios. This feat is further complicated by the fact that the IPD codec is to be fully realized in software in order to be considered a viable solution for the average Internet user. The demonstrated test results show that the IPD codec is capable of achieving these ambitious goals. The IPD compressor operates in a pipelined manner. Each stage in the IPD compression pipeline has its own complexities and challenges, which are individually addressed in detail. The ultimate goal of the IPD compressor is to maintain a constant compression ratio that is sufficiently high to allow bi-directional video communication across low-bandwidth transmission lines. These compression ratios must be achieved using a software compressor and decompressor. Strict CPU utilization requirements must be met by the IPD codec in order for it to operate in real time. The IPD compressor defines a unique video interlacing scheme to sample the pixels that comprise the incoming video frames. The properties of the interlacing schemes aid the video compressor in achieving high compression ratios. Later, in the decompression stage, the IPD decompressor uses the properties of the interlacing schemes to reverse the sampling process and recover the original picture quality. The IPD compressor also employs a custom variation of the error diffusion algorithm in its color reduction phase.
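The dissertation's color-reduction phase uses a custom variation of error diffusion; as a reference point, here is the textbook Floyd-Steinberg version for a grayscale image, which such variations build on. The quantization error at each pixel is spread onto not-yet-visited neighbors so that the average intensity is preserved even though each pixel snaps to a small palette.

```python
import numpy as np

def floyd_steinberg(image, levels=4):
    """Classic Floyd-Steinberg error diffusion to `levels` gray levels
    (a stand-in for the IPD codec's custom variation)."""
    img = image.astype(float).copy()
    h, w = img.shape
    step = 255.0 / (levels - 1)
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = round(old / step) * step  # nearest palette level
            img[y, x] = new
            err = old - new
            # Push the quantization error onto unvisited neighbors.
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return np.clip(np.round(img), 0, 255).astype(np.uint8)
```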
A pixel delta algorithm is used to build a new frame from a previous frame. The pixel delta algorithm defines a unique bitmask representation of the pixel locations that are flagged for refresh; these locations will be used to build a subsequent frame. The bitmask representation of pixel locations is further compressed using a variation of the Huffman compression algorithm. The IPD compressor builds an IPD delta frame containing a header, the compressed bitmask of pixel locations flagged for change, and the compressed pixel intensity values used to build a new frame from the previous one. The IPD decompressor also operates in a pipelined manner and likewise has strict requirements with respect to CPU utilization. The IPD decompressor applies several image processing algorithms to the video output stream in order to enhance the visual quality of the reconstructed output video frames. Custom test programs are used to derive and validate the algorithms presented in this dissertation. A working prototype of the complete IPD codec is also presented to aid in the visual analysis of the final video picture quality.
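The bitmask idea above can be sketched in a few lines: flag the pixels that changed beyond a threshold, pack the flags one bit per pixel, and let the decoder patch only those locations of the previous frame. This is a minimal illustration with hypothetical names and threshold; the Huffman-style step that further compresses the bitmask, and the IPD header, are omitted.

```python
import numpy as np

def delta_bitmask(prev, curr, threshold=8):
    """Flag pixels whose intensity changed by more than `threshold`.
    Returns (packed_bitmask, changed_values); the decoder patches only
    the flagged locations of the previous frame."""
    changed = np.abs(curr.astype(int) - prev.astype(int)) > threshold
    packed = np.packbits(changed.ravel())  # 1 bit per pixel
    return packed, curr[changed]

def apply_delta(prev, packed, values):
    """Rebuild the new frame: unpack the bitmask and overwrite the
    flagged pixel locations with the transmitted intensity values."""
    h, w = prev.shape
    changed = np.unpackbits(packed)[: h * w].reshape(h, w).astype(bool)
    out = prev.copy()
    out[changed] = values
    return out
```

When few pixels change between frames, the packed bitmask is mostly zero bytes, which is exactly the kind of redundancy a follow-on entropy coder exploits.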
Model
Digital Document
Publisher
Florida Atlantic University
Description
The field of video transcoding has been evolving over the past ten years. The need for transcoding of video files has greatly increased because new standards are incompatible with old ones. This thesis takes the method of using machine learning for video transcoding mode decisions and discusses ways to improve the process of generating the algorithm for implementation in different video transcoders. The transcoding methods used decrease the complexity of the mode decision inside the video encoder. Methods that automate and improve the results are also discussed and implemented in two different transcoders: H.263 to VP6, and MPEG-2 to H.264. Both of these transcoders show a complexity reduction of almost 50%. Video transcoding is important because the number of video standards keeps increasing, while devices can usually decode only one specific codec.
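The idea of a learned mode decision can be illustrated with a one-node stand-in: learn a single threshold on a macroblock feature (e.g., residual variance) that best separates SKIP from coded decisions collected from a reference encoder. Real systems of this kind train fuller decision trees on several macroblock statistics; everything below, including names and the choice of feature, is a hypothetical sketch.

```python
import numpy as np

def train_stump(features, labels):
    """Learn the threshold on a 1-D feature that best reproduces the
    reference encoder's mode decisions (0 = SKIP, 1 = code the block).
    A one-node stand-in for the decision trees used in ML transcoding."""
    order = np.argsort(features)
    f, y = features[order], labels[order]
    best_t, best_acc = f[0], 0.0
    for t in f:
        pred = (f > t).astype(int)
        acc = (pred == y).mean()
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def predict_mode(variance, threshold):
    """Cheap replacement for an exhaustive mode search."""
    return 1 if variance > threshold else 0
```

The complexity saving comes from replacing an exhaustive rate-distortion search over all modes with a few feature computations and comparisons per macroblock.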
Model
Digital Document
Publisher
Florida Atlantic University
Description
H.264/AVC encoder complexity is mainly due to the variable block sizes used in Intra and Inter frames. This makes H.264/AVC very difficult to implement, especially for real-time applications and mobile devices. The current technological challenge is to preserve the compression capacity and quality that H.264 offers while reducing the encoding time and, therefore, the processing complexity. This thesis applies machine-learning techniques to video encoding mode decisions and investigates ways to improve the process of generating more general, low-complexity H.264/AVC video encoders. The proposed H.264 encoding method decreases the complexity of the mode decision for Inter frames. Results show an average complexity reduction of at least 150% and an average PSNR increase of at most 0.6 dB for different kinds of videos and formats.
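The PSNR figures quoted above are computed from the mean squared error between the reference and encoded frames; for completeness, here is the standard definition (the function name is illustrative, the formula is the conventional one for 8-bit video):

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames.
    Identical frames give infinite PSNR; lower MSE gives higher PSNR."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10 * np.log10(peak ** 2 / mse)
```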