Digital video

Model
Digital Document
Publisher
Florida Atlantic University
Description
In order to effectively transport digital compressed video over Broadband Integrated Services Digital Networks (B-ISDN) with Asynchronous Transfer Mode (ATM), the characteristics of the video source traffic must be understood. The nature of the video traffic depends primarily on the source, the content of the video, and the coding algorithm that removes redundancies for efficient transmission over networks. In this study, video conference data encoded with a subband coding scheme, the Digital Video Compression System (DVCS), is analyzed to determine its characteristics. Several video traffic sources are multiplexed through an ATM network node with limited capacity, and the performance of this environment is evaluated through simulation. Simulation results are presented for the performance measures under varying traffic and network conditions.
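As a rough illustration of the multiplexing experiment described above, the following Python sketch pushes several variable-bit-rate sources through a finite-buffer node and reports the cell loss ratio. The per-frame cell statistics, buffer size, and link rate are hypothetical placeholders, not parameters or trace data from the study.

```python
import random

# Minimal slotted-time sketch of N video sources multiplexed through an
# ATM node with a finite output buffer. The per-frame cell counts below
# are hypothetical placeholders, not actual DVCS trace statistics.

N_SOURCES = 10          # multiplexed video conference sources
BUFFER_CELLS = 200      # node buffer capacity (cells)
SERVICE_RATE = 350      # cells the link can serve per frame slot
N_SLOTS = 10_000        # simulated frame slots

def cells_per_frame():
    # Placeholder VBR model: lognormal burstiness around ~30 cells/frame.
    return max(0, int(random.lognormvariate(3.4, 0.4)))

queue, lost, offered = 0, 0, 0
for _ in range(N_SLOTS):
    arrivals = sum(cells_per_frame() for _ in range(N_SOURCES))
    offered += arrivals
    admitted = min(arrivals, BUFFER_CELLS - queue)  # drop what overflows
    lost += arrivals - admitted
    queue = max(0, queue + admitted - SERVICE_RATE)

print(f"cell loss ratio: {lost / offered:.4f}")
```

Sweeping N_SOURCES or BUFFER_CELLS in such a loop is one simple way to reproduce the kind of loss-versus-load curves a multiplexing study reports.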
Model
Digital Document
Publisher
Florida Atlantic University
Description
XYZ Video Compression denotes a video compression algorithm that operates in three dimensions, without the overhead of motion estimation. The smaller overhead of this algorithm compared to MPEG and other "standards-based" compression algorithms using motion estimation suggests its suitability for real-time applications. The demonstrated results of compressing standard motion video benchmarks suggest that XYZ Video Compression is not only a faster algorithm but also achieves superior compression ratios. The algorithm is based on the three-dimensional Discrete Cosine Transform (DCT). Pixels are organized as 8 x 8 x 8 cubes by taking 8 x 8 squares from 8 consecutive frames. A fast three-dimensional transform is applied to each cube, generating 512 DCT coefficients. The energy-packing property of the DCT concentrates the energy of the cube into a few coefficients. The DCT coefficients are quantized to maximize the energy concentration at the expense of introducing a user-determined level of error. A method of adaptive quantization is described that generates optimal quantizers based on statistics gathered over the 8 consecutive frames. The sensitivity of the human eye to the various DCT coefficients is used to modify the quantizers, creating a "visually equivalent" cube with still greater energy concentration. Experiments are described that justify the choice of Human Visual System factors folded into the quantization step. The quantized coefficients are then encoded into a data stream using an entropy coding method based on the statistics of the quantized coefficients. The bitstream generated by entropy coding represents the compressed data of the 8 motion video frames and typically achieves about 50:1 compression at 5% error. The decoding process is the reverse of the encoding process: the bitstream is decoded to recover blocks of quantized DCT coefficients, the coefficients are dequantized, and the Inverse Discrete Cosine Transform is applied to each cube to recover pixel data suitable for display. The elegance of this technique lies in its simplicity, which lends itself to inexpensive implementation of both encoder and decoder. Finally, a real-time implementation of the XYZ Compressor/Decompressor is discussed, with experiments run to determine the effectiveness of the implementation.
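A minimal sketch of the cube transform is shown below, using SciPy's general-purpose dctn in place of the fast transform described above; the uniform quantizer step is a stand-in for the adaptive, HVS-weighted quantizers of the dissertation.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Sketch of the XYZ cube transform: take an 8x8 block from each of 8
# consecutive frames, apply a 3-D DCT, quantize, then invert. The
# uniform quantizer step is a placeholder for the adaptive quantizers.

frames = np.random.randint(0, 256, size=(8, 64, 64)).astype(np.float64)

x, y = 0, 0                              # top-left of one 8x8 spatial block
cube = frames[:, y:y+8, x:x+8]           # 8x8x8 pixel cube (512 samples)

coeffs = dctn(cube, norm="ortho")        # 512 DCT coefficients

step = 16.0                              # placeholder uniform step size
quantized = np.round(coeffs / step)      # energy packing -> many zeros

# Decoder side: dequantize and invert the 3-D DCT to recover pixels.
recovered = idctn(quantized * step, norm="ortho")

print(f"nonzero coefficients: {np.count_nonzero(quantized)}/512")
```

The count of surviving nonzero coefficients gives a quick feel for the energy-packing property: most of the 512 coefficients quantize to zero, and it is those zero runs that the entropy coder exploits.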
Model
Digital Document
Publisher
Florida Atlantic University
Description
In the current communications age, the capabilities of mobile devices are increasing. Mobiles are capable of communicating at data rates of hundreds of Mbps on 4G networks, enabling playback of rich multimedia content comparable to that of Internet and television networks. However, mobile networks need to be spectrum-efficient to be affordable to users. Multimedia Broadcast Multicast Service (MBMS) is a wireless broadcasting standard being drafted to enable multimedia broadcast while remaining spectrum-efficient. Hybrid video coding techniques facilitate low-bitrate transmission but introduce dependencies across frames. Because the mobile environment is error-prone, no error correction technique can guarantee error-free transmission, and such errors propagate, resulting in quality degradation. With numerous mobiles sharing the broadcast session, any error resilient scheme must account for heterogeneous device capabilities and channel conditions. Current research on wireless video broadcasting focuses on network-based techniques such as Forward Error Correction (FEC) and retransmissions, which add bandwidth overhead. There is a need for innovative error resilient techniques that make the video codec robust with minimal bandwidth overhead. This dissertation introduces novel techniques in the area of MBMS systems. First, robust video structures are proposed in the Periodic Intra Frame based Prediction (PIFBP) and Periodic Anchor Frame based Prediction (PAFBP) schemes. In these schemes, the intra frames or anchor frames serve as the reference frames for prediction over the GOP period. The intermediate frames are independent of one another, so errors in such frames do not propagate, resulting in error resilience. In prior art, the intra block rate is adapted to the channel characteristics for error resilience. This scheme has been generalized to multicasting to address a group of users sharing the same session: the average packet loss is used to determine the intra block rate, improving the performance of the overall group and striving for consistent performance. Also, the inherent diversity in the broadcasting session can be used to its advantage. Mobile devices capable of accessing a WLAN during the broadcast form an ad hoc network on the WLAN to recover lost packets. New error recovery schemes are proposed, and their performance comparison is presented.
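The prediction structure behind PIFBP can be sketched in a few lines: every frame in a GOP references only the periodic intra frame, so intermediate frames never depend on each other. The GOP length below is an illustrative choice, not one taken from the dissertation.

```python
# Sketch of the Periodic Intra Frame based Prediction (PIFBP) structure:
# within each GOP the intra frame is the sole reference, so intermediate
# frames are mutually independent and errors in them do not propagate.

GOP = 8  # hypothetical GOP length

def reference_frame(n):
    """Index of the frame that frame n is predicted from (None = intra)."""
    if n % GOP == 0:
        return None                # periodic intra frame: no reference
    return (n // GOP) * GOP        # predict only from the GOP's intra frame

for n in range(12):
    ref = reference_frame(n)
    kind = "I" if ref is None else "P"
    print(f"frame {n:2d} ({kind})  reference: {ref}")
```

Contrast this with a conventional IPPP chain, where frame n references frame n-1 and a single loss corrupts every frame until the next intra refresh.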
Model
Digital Document
Publisher
Florida Atlantic University
Description
When motion occurs in a scene, video quality degrades due to motion smear, which results in a loss of contrast in the image. The characteristics of the human visual system during smooth pursuit eye movements differ from those when the eye fixates on an object, such as a video screen, during motion. Smooth pursuit eye movements dominate in the presence of dynamic stimuli. During smooth pursuit eye movements, the contrast sensitivity for increasing target velocities shifts toward lower spatial frequencies, and the sensitivity for low spatial frequencies during motion is higher than in the stationary case. This dissertation proposes a method to improve the perceptual quality of video using a temporal enhancement prefiltering technique based on the characteristics of Smooth Pursuit Eye Movements (SPEM). The resulting technique closely matches the characteristics of the human visual system (HVS). When motion occurs, the eye tracks the moving targets in a scene rather than fixating on any portion of the scene; hence, psychophysical studies of smooth pursuit eye movements were used as the basis for designing the temporal filters. Experimental results show that temporal enhancement improves quality by increasing the apparent sharpness of the image sequence. This dissertation presents a survey of research on how motion affects image quality at the camera lens and in the human eye, and uses that research to develop a temporal enhancement technique that improves the quality of video degraded by motion.
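A minimal sketch of temporal enhancement prefiltering follows: an unsharp mask applied along the time axis boosts the temporal detail that motion smear attenuates. The 3-tap kernel and gain are illustrative; the dissertation derives its filters from smooth-pursuit contrast sensitivity data, which is not reproduced here.

```python
import numpy as np

def temporal_enhance(frames, gain=0.5):
    """Temporal unsharp mask on a (T, H, W) grayscale sequence.

    Kernel and gain are illustrative placeholders, not the SPEM-derived
    filter parameters from the dissertation.
    """
    lowpass = np.empty_like(frames)
    lowpass[0], lowpass[-1] = frames[0], frames[-1]
    # 3-tap temporal average [1, 2, 1] / 4 as the low-pass component.
    lowpass[1:-1] = (frames[:-2] + 2 * frames[1:-1] + frames[2:]) / 4.0
    # Add back amplified temporal detail (unsharp masking in time).
    return np.clip(frames + gain * (frames - lowpass), 0, 255)

seq = np.random.rand(16, 64, 64) * 255   # stand-in for a real sequence
enhanced = temporal_enhance(seq)
print(enhanced.shape)
```

Because the filter runs before encoding, it acts as a prefilter: the apparent sharpness gain is baked into the frames the codec sees, with no decoder-side change required.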
Model
Digital Document
Publisher
Florida Atlantic University
Description
Perceptual video coding has been a promising area in recent years. Increases in compression ratios have been reported by applying foveated video coding techniques in which the region of interest (ROI) is selected using a computational attention model. However, most approaches to perceptual video coding use only visual features, ignoring the auditory component. Recent physiological studies have demonstrated that auditory stimuli affect our visual perception. In this work, we validate some of those physiological tests using complex video sequences. We designed and developed a web-based tool for video quality measurement. After conducting several experiments, we observed that, in general, the reaction time to detect video artifacts was higher when the video was presented with audio information. We also observed that emotional information in the audio guides human attention to particular ROIs, and that sound frequency changes the perception of spatial frequency in still images.
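To make the foveated coding background concrete, here is a small illustrative sketch of a foveation map: quantization strength grows with distance from the attended ROI, so bits concentrate where attention (visual or audio-guided) is directed. The frame size, ROI location, and falloff constant are arbitrary placeholders.

```python
import numpy as np

# Illustrative foveation map for perceptual coding: coarser quantization
# farther from the region of interest (ROI). All constants are arbitrary.

H, W = 288, 352
roi_y, roi_x = 144, 176                   # hypothetical attended point

yy, xx = np.mgrid[0:H, 0:W]
dist = np.hypot(yy - roi_y, xx - roi_x)   # distance from the ROI center

q_base, falloff = 8.0, 0.05
q_map = q_base * (1.0 + falloff * dist)   # per-pixel quantization strength

print(f"quantizer range: {q_map.min():.1f} to {q_map.max():.1f}")
```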
Model
Digital Document
Publisher
Florida Atlantic University
Description
Digital video is being used widely in a variety of applications such as entertainment, surveillance, and security. The large amount of video in surveillance and security applications requires systems capable of processing video to automatically detect and recognize events, alleviating the load on human operators and enabling preventive action when events are detected. The main objective of this work is the analysis of computer vision techniques and algorithms used to perform automatic detection of events in video sequences. This thesis presents a surveillance system based on optical flow and background subtraction concepts that detects events through motion analysis, using an event probability zone definition. Advantages, limitations, capabilities, and possible alternative solutions are also discussed. The result is a system capable of detecting objects moving in a direction opposing a predefined condition, or running in the scene, with precision greater than 50% and recall greater than 80%.
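A minimal sketch of such a pipeline, using OpenCV's stock background subtractor and Farneback dense optical flow, is shown below. The input file, zone, allowed direction, and thresholds are hypothetical placeholders, not the thesis's actual configuration.

```python
import cv2
import numpy as np

# Sketch: background subtraction isolates moving objects, dense optical
# flow gives their direction, and both are checked inside an event zone.
# Input file, zone, allowed direction, and thresholds are placeholders.

cap = cv2.VideoCapture("surveillance.mp4")   # hypothetical input
bg = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

ALLOWED = np.array([1.0, 0.0])               # permitted motion direction
ZONE = (slice(100, 300), slice(100, 500))    # event probability zone

ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    mask = bg.apply(frame)                   # foreground (moving) pixels
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    fg = mask[ZONE] > 0
    if fg.any():
        v = flow[ZONE][fg].mean(axis=0)      # mean motion in the zone
        speed = np.linalg.norm(v)
        # Opposing direction (negative dot product) or running (high speed).
        if speed > 1e-3 and (v @ ALLOWED / speed < -0.5 or speed > 8.0):
            print("event detected")
    prev_gray = gray
```

Counting true and false detections of this kind against ground-truth annotations is how precision and recall figures like those reported above are obtained.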