Video compression

Model
Digital Document
Publisher
Florida Atlantic University
Description
Lower prices of video sensors, security concerns and the need for better and faster
algorithms to extract high level information from video sequences are all factors which
have stimulated research in the area of automated video surveillance systems. In the
context of security the analysis of human interrelations and their environment provides
hints to proactively identify anomalous behavior. However, human detection is a
necessary component in systems where the automatic extraction of higher level
information, such as recognizing individuals' activities, is required. The human detection
problem is one of classification. In general, motion, appearance and shape are the
classification approaches a system can employ to perform human detection. Techniques
representative of these approaches, such as periodic motion detection, skin color
detection and MPEG-7 shape descriptors are implemented in this work. An infrastructure
that allows data collection for such techniques was also implemented.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Recently, multimedia applications and their use have grown dramatically in
popularity, in large part due to mobile device adoption by the consumer market.
Applications such as video conferencing have gained popularity. These applications
and others have a strong video component that uses the mobile device’s resources. These
resources include processing time, network bandwidth, memory use, and battery life.
The goal is to reduce the need for these resources by reducing the complexity of the
coding process. Mobile devices offer unique characteristics that can be exploited for
optimizing video codecs. The combination of small display size, video resolution, and
human vision factors, such as acuity, allows encoder optimizations that will not (or
only minimally) impact subjective quality. The focus of this dissertation is optimizing video services in mobile environments. Industry has begun migrating from H.264 video coding to the more resource-intensive but more compression-efficient High Efficiency Video Coding (HEVC). However, there has been no proper evaluation and optimization of HEVC for mobile environments.
Subjective quality evaluations were performed to assess relative quality between H.264
and HEVC. This will allow for better use of device resources and migration to new
codecs where it is most useful. Complexity of HEVC is a significant barrier to adoption
on mobile devices and complexity reduction methods are necessary. Optimal use of
encoding options is needed to maximize quality and compression while minimizing
encoding time. Methods for optimizing coding mode selection for HEVC were
developed. Complexity of HEVC encoding can be further reduced by exploiting the
mismatch between the resolution of the video, resolution of the mobile display, and the
ability of the human eyes to acquire and process video under these conditions. The
perceptual optimizations developed in this dissertation use the properties of spatial
(visual acuity) and temporal information processing (motion perception) to reduce the
complexity of HEVC encoding. A unique feature of the proposed methods is that they
reduce encoding complexity and encoding time.
The proposed HEVC encoder optimization methods reduced encoding time by
21.7% and bitrate by 13.4% with insignificant impact on subjective quality evaluations.
These methods can easily be implemented today within HEVC.
Model
Digital Document
Publisher
Florida Atlantic University
Description
The classic methods for indexing image and video databases rely on either keywords or analysis of color distribution. In recent years, JPEG and MPEG have become standards for image and video compression, respectively. One of the basic operations of JPEG and MPEG is the Discrete Cosine Transform (DCT). The human visual system is known to be very dependent on spatial frequency. The DCT provides a good approximation of the spatial frequencies in an image to which the human eye is sensitive. We take advantage of this property of the DCT in indexing image and video databases. However, the two-dimensional DCT yields 64 coefficients per block of 8 x 8 pixels, which is too many to process for fast indexing. We use only the first DCT coefficient, called the DC coefficient, to represent each 8 x 8 block of transformed data. This representation yields satisfactory indexing results.
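The DC-coefficient idea in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the use of a plain list-of-rows grayscale image, and the mean-absolute-difference comparison are all assumptions made for illustration. It relies only on the fact that, with the JPEG-style DCT normalization, the (0, 0) coefficient of an 8 x 8 block equals the sum of its pixels divided by 8, so the other 63 coefficients never need to be computed.

```python
import math

BLOCK = 8  # JPEG/MPEG block size


def dct2_coefficient(block, u, v):
    """One coefficient of the 2-D DCT-II of an n x n block (JPEG-style scaling).

    Included only to show that the (0, 0) term matches dc_coefficient().
    """
    n = len(block)
    cu = math.sqrt(0.5) if u == 0 else 1.0
    cv = math.sqrt(0.5) if v == 0 else 1.0
    total = 0.0
    for x in range(n):
        for y in range(n):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                      * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
    return (2.0 / n) * cu * cv * total


def dc_coefficient(block):
    """DC term computed directly: sum of the pixels divided by the block size."""
    return sum(map(sum, block)) / len(block)


def dc_signature(image, n=BLOCK):
    """Hypothetical index entry: one DC coefficient per complete n x n block."""
    rows, cols = len(image), len(image[0])
    signature = []
    for r in range(0, rows - rows % n, n):
        for c in range(0, cols - cols % n, n):
            block = [row[c:c + n] for row in image[r:r + n]]
            signature.append(dc_coefficient(block))
    return signature


def signature_distance(a, b):
    """Assumed similarity measure: mean absolute difference of DC values."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

A query image would be reduced to its signature once and compared against the stored signatures, with smaller distances ranking higher. For already-compressed JPEG or MPEG data the DC values could in principle be read from the bitstream rather than recomputed from pixels, which this sketch does not attempt.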
Model
Digital Document
Publisher
Florida Atlantic University
Description
To observe the effects of satellite transmission on video compression technology designed at FAU's Imaging Systems Lab, an interface was designed to accept data directly from a video encoder or a 16 GByte RAID storage device. The design uses a Xilinx XC4005E field programmable gate array. The interface connects to a high speed enhanced parallel port at the computer backplane. Data stored on the computer via the interface is transferred at a T1 rate through the ACTS T1-VSAT satellite link. In loop-back mode the data is stored, then evaluated.
Model
Digital Document
Publisher
Florida Atlantic University
Description
In order to effectively transport digital compressed video over Broadband Integrated Services Digital Networks (B-ISDN) with Asynchronous Transfer Mode (ATM), the characteristics of the video source traffic should be understood. The nature of the video traffic depends primarily on the source, the content of the video and the coding algorithm that removes redundancies for efficient transmission over networks. In this study, video conference data encoded using a subband coding scheme, Digital Video Compression System (DVCS), is analyzed to determine its characteristics. Several video traffic sources are multiplexed through an ATM network node with limited capacity and the performance of this environment is evaluated by using a simulation technique. The simulation results are presented for the performance measures for varying traffic and network conditions.
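The multiplexing experiment described in this abstract can be pictured with a small slot-based sketch. This is not the simulator used in the study: the function name, the fixed-length time slots, the drop-tail buffer, and the per-slot cell counts are all assumptions chosen to keep the example short. Each source is given as a list of cells emitted per slot, and the node forwards at most a fixed number of cells per slot.

```python
def simulate_multiplexer(arrivals_per_source, service_rate, buffer_size):
    """Slot-based model of several sources feeding one finite-buffer node.

    arrivals_per_source: one list per source, cells emitted in each time slot
    service_rate: cells the output link can carry per slot (assumed constant)
    buffer_size: maximum number of cells the node can hold (drop-tail)
    """
    slots = len(arrivals_per_source[0])
    queue = offered = lost = max_queue = 0
    for t in range(slots):
        arriving = sum(source[t] for source in arrivals_per_source)
        offered += arriving
        queue += arriving
        if queue > buffer_size:            # buffer overflow: excess cells are dropped
            lost += queue - buffer_size
            queue = buffer_size
        max_queue = max(max_queue, queue)
        queue -= min(queue, service_rate)  # the link drains the queue
    return {
        "offered": offered,
        "lost": lost,
        "loss_ratio": lost / offered if offered else 0.0,
        "max_queue": max_queue,
    }


# Two bursty sources sharing a link that carries 2 cells per slot
stats = simulate_multiplexer([[3, 0, 3, 0], [3, 0, 3, 0]], service_rate=2, buffer_size=4)
```

Varying the number of sources, the burst pattern, the link rate, or the buffer size and recording the loss ratio gives the kind of performance measure under varying traffic and network conditions that the abstract refers to.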
Model
Digital Document
Publisher
Florida Atlantic University
Description
When motion occurs in a scene, the quality of video degrades due to motion smear, which results in a loss of contrast in the image. The characteristics of the human vision system when smooth pursuit eye movements occur are different from those when the eye fixates on an object such as a video screen during motion. Smooth pursuit eye movements dominate in the presence of dynamic stimuli. In the presence of smooth pursuit eye movements, the contrast sensitivity for increasing target velocities shifts toward lower spatial frequencies. The sensitivity for low spatial frequencies during motion is higher than for a stationary case. This dissertation proposes a method to improve the perceptual quality of video using a temporal enhancement prefiltering technique based on the characteristics of Smooth Pursuit Eye Movements (SPEM). The resulting technique closely matches the characteristics of the human visual system (HVS). When motion occurs, the eye tracks the moving targets in a scene as opposed to fixating on any portion of the scene. Hence, psychophysical studies of smooth pursuit eye movements were used as a basis to design the temporal filters. Results of experiments show that temporal enhancement results in improved quality by increasing the apparent sharpness of the image sequence. In this dissertation, a study of research describing how motion affects the image quality at the camera lens and the human eye is presented. This dissertation uses that research to develop a temporal enhancement technique to improve the quality of video degraded by motion.
Model
Digital Document
Publisher
Florida Atlantic University
Description
The field of video transcoding has been evolving throughout the past ten years. The need for transcoding of video files has greatly increased because new and upcoming standards are incompatible with older ones. This thesis takes the method of using machine learning for video transcoding mode decisions and discusses ways to improve the process of generating the algorithm for implementation in different video transcoders. The transcoding methods used decrease the complexity of the mode decision inside the video encoder. Methods that automate the process and improve results are also discussed and implemented in two different transcoders: H.263 to VP6, and MPEG-2 to H.264. Both of these transcoders have shown a complexity reduction of almost 50%. Video transcoding is important because the number of video standards has been increasing while devices usually can decode only one specific codec.