Kalva, Hari

Person Preferred Name
Kalva, Hari
Model
Digital Document
Publisher
Florida Atlantic University
Description
We often decide whether to go to a place based on how crowded it is.
Such decisions depend on conditions that are only known in real time. A
system that reports the actual number of people in a scene over time gives
users and agencies the information they need to make decisions about a
given location. This thesis presents a low-complexity system for human detection
and counting using public cameras, which usually offer poor image quality. Computer
vision techniques make it possible to give the user an estimate of the
number of people present. Different videos were studied with different
resolutions and camera positions. The best video result shows an error of 0.269%, while
the worst shows 8.054%. The results show that relatively inexpensive cameras streaming
video at a low bitrate can be used to develop large-scale people counting applications.
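The abstract does not say how the error percentages were computed. As a minimal sketch, assuming the error is the relative difference between the estimated and the actual head count, the calculation could look like this (the function name and the sample counts are hypothetical, not taken from the thesis):

```python
def counting_error_percent(estimated, actual):
    """Relative counting error in percent (assumed definition, not from the thesis)."""
    if actual <= 0:
        raise ValueError("actual count must be positive")
    return abs(estimated - actual) / actual * 100.0


# Hypothetical totals for one video; these are not results from the thesis.
error = counting_error_percent(estimated=1490, actual=1500)
print(f"{error:.3f}%")
```

Under this definition, an estimate that misses 10 people out of 1500 yields an error below 1%, the same order of magnitude as the best case reported above.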
Model
Digital Document
Publisher
Florida Atlantic University
Description
Real-time eye tracking systems with human-computer interaction mechanisms are being adopted to advance the user experience in smart devices and consumer electronic systems. Eye tracking systems measure eye gaze and pupil response non-intrusively. This research presents an analysis of pupil and gaze response to video structure and content. The experiments for this study involved presenting different video content to subjects and measuring eye response with an eye tracker. Results show that significant changes in video and scene cuts led to sharp pupil constrictions. User response to videos can provide insights that improve subjective quality assessment metrics. This research also presents an analysis of the pupil and gaze response to quality changes in videos. The results show pupil constrictions for noticeable changes in perceived quality and higher fixation/saccade ratios at lower quality. Using real-time eye tracking systems for video analysis and quality evaluation can open a new class of applications for consumer electronic systems.
Model
Digital Document
Publisher
Florida Atlantic University
Description
E-learning is transforming the delivery of education. Today, millions of students take
self-paced online courses. However, the complexity of the content and language often hinders
comprehension, and this, combined with the lack of immediate help from the instructor, leads
to weaker learning outcomes. The ability to predict difficult content in real time enables
e-learning systems to adapt content to each student's level of learning. The recent
introduction of low-cost eye trackers has opened a new class of applications based on eye
response. Eye tracking devices can record the eye response to a visual element or concept in
real time. The response, and the variations in eye response to the same concept over time,
may be indicative of the level of learning.
In this study, we analyzed reading patterns using an eye tracker and derived 12 eye
response features based on psycholinguistics, contextual information processing,
anticipatory behavior analysis, recurrence fixation analysis, and pupil response. We use
eye responses to predict the level of learning for a term/concept. One of the main
contributions is the spatio-temporal analysis of the eye response to a term/concept to derive
relevant first-pass (spatial) and reanalysis (temporal) eye response features. A
spatio-temporal model, built using these derived features, analyzes slide images, extracts
words (terms), maps the subject's eye response to words, and prepares a term-response map. A
parametric baseline classifier, trained with labeled data (term-response maps), classifies a
term/concept as novel (positive class) or familiar (negative class) using majority
voting. Using only first-pass features for prediction, the baseline classifier shows 61%
prediction accuracy; adding reanalysis features raises the baseline to 66.92% accuracy
for predicting difficult terms. However, not all proposed features respond in the same way
to learning difficulties across subjects, since reading is an individual
characteristic.
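The abstract gives no details of the majority-voting step. A minimal sketch, assuming each feature casts one vote by comparing its value with a per-feature threshold, might look like this (the feature names, thresholds, and values are hypothetical and chosen only for illustration):

```python
def majority_vote(feature_values, thresholds):
    """Label a term 'novel' when more than half of the features exceed their thresholds.

    Each feature casts one equally weighted vote. This is an assumed reading of
    the baseline classifier, not the thesis implementation.
    """
    votes = [feature_values[name] > thresholds[name] for name in thresholds]
    return "novel" if 2 * sum(votes) > len(votes) else "familiar"


# Hypothetical features and thresholds, for illustration only.
thresholds = {"first_fixation_ms": 250.0, "regressions": 1.0, "pupil_change": 0.05}
term = {"first_fixation_ms": 310.0, "regressions": 2.0, "pupil_change": 0.01}

print(majority_vote(term, thresholds))  # two of three features vote "novel"
```

Because every feature counts equally here, a feature that is uninformative for a particular reader carries as much weight as an informative one.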
Hence, we developed a non-parametric, feature-weighted linguistics classifier (FWLC),
which assigns weights to features based on their relevance. The FWLC achieves
a prediction accuracy of 90.54%, an increase of 23.62 percentage points over the baseline
and 29.54 percentage points over the first-pass variant of the baseline. Misclassifying a
novel term as familiar is the more costly error, because content adaptation relies on this
prediction. Hence, our primary goal is to increase the
prediction rate of novel terms while minimizing the cost of false predictions. Compared
with other frequently used machine learning classifiers, the FWLC
achieves the highest true positive rate (TPR) and the lowest ratio of false negative rate
(FNR) to false positive rate (FPR). The strong performance of the proposed spatio-temporal
eye response model in predicting levels of learning provides a solid foundation for
eye-response-driven adaptive e-learning.
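The rates named above follow the standard confusion-matrix definitions, with novel terms as the positive class. The sketch below shows how TPR, FNR, FPR, and the FNR/FPR ratio are computed; the counts are hypothetical and are not results from the thesis.

```python
def rates(tp, fn, fp, tn):
    """Return (TPR, FNR, FPR) from confusion-matrix counts.

    Positive class = novel term, negative class = familiar term.
    """
    tpr = tp / (tp + fn)  # novel terms correctly flagged as novel
    fnr = fn / (tp + fn)  # novel terms missed (predicted familiar)
    fpr = fp / (fp + tn)  # familiar terms wrongly flagged as novel
    return tpr, fnr, fpr


# Hypothetical counts for illustration only.
tpr, fnr, fpr = rates(tp=45, fn=5, fp=10, tn=40)
print(f"TPR={tpr:.2f} FNR={fnr:.2f} FPR={fpr:.2f} FNR/FPR={fnr / fpr:.2f}")
```

A low FNR/FPR ratio means the classifier's errors lean toward flagging familiar terms as novel rather than missing novel terms, which matches the cost argument made in the abstract.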
Model
Digital Document
Publisher
Florida Atlantic University
Description
We present work aimed at exploiting characteristics of the human visual
system to optimize video compression. Preliminary experiments that include temporal
and motion masking show significant bitrate savings compared to state-of-the-art
coding algorithms.
Model
Digital Document
Publisher
Florida Atlantic University
Description
More data is being created now than ever before, and it can take any
form: textual, multimedia, spatial, and so on. To process this data, several big data
processing platforms have been developed, including Hadoop, which is based on the MapReduce
model, and LexisNexis’ HPCC Systems.
In this thesis we evaluate the HPCC Systems framework with a special interest in
multimedia data analysis and propose a framework for multimedia data processing.
It is important to note that multimedia data encompasses a wide variety of data, including
but not limited to image, video, audio, and even textual data. In
developing a unified framework for such a wide variety of data, we have to consider
the computational complexity of dealing with the data. Preliminary results show that HPCC
can potentially reduce the computational complexity significantly.
Model
Digital Document
Publisher
Florida Atlantic University
Description
The emerging Scalable Video Coding (SVC) extends the H.264/AVC video coding standard with new tools designed to efficiently support temporal, spatial, and SNR scalability. In real-time multimedia systems, the coding performance of video encoders and decoders is limited by computational complexity. This thesis presents techniques to manage the computational complexity of H.264/AVC and SVC video encoders. These techniques aim to provide significant complexity savings as well as a framework for efficient use of SVC.
Model
Digital Document
Publisher
Florida Atlantic University
Description
Recently, multimedia applications and their use have grown dramatically in
popularity, in large part due to mobile device adoption by the consumer market.
Applications such as video conferencing have gained popularity. These applications
and others have a strong video component that uses the mobile device’s resources. These
resources include processing time, network bandwidth, memory use, and battery life.
The goal is to reduce the demand on these resources by reducing the complexity of the
coding process. Mobile devices offer unique characteristics that can be exploited for
optimizing video codecs. The combination of small display size, video resolution, and
human vision factors, such as acuity, allows encoder optimizations that do not (or only
minimally) affect subjective quality. The focus of this dissertation is optimizing video services in mobile environments. Industry has begun migrating from H.264 video coding to the more resource-intensive but more compression-efficient High Efficiency Video Coding (HEVC). However, there has been no proper evaluation and optimization of HEVC for mobile environments.
Subjective quality evaluations were performed to assess relative quality between H.264
and HEVC. These evaluations allow for better use of device resources and migration to new
codecs where it is most useful. The complexity of HEVC is a significant barrier to adoption
on mobile devices, and complexity reduction methods are necessary. Optimal use of
encoding options is needed to maximize quality and compression while minimizing
encoding time. Methods for optimizing coding mode selection for HEVC were
developed. The complexity of HEVC encoding can be further reduced by exploiting the
mismatch between the resolution of the video, the resolution of the mobile display, and the
ability of the human eye to acquire and process video under these conditions. The
perceptual optimizations developed in this dissertation use the properties of spatial
(visual acuity) and temporal (motion perception) information processing to reduce the
complexity of HEVC encoding. A unique feature of the proposed methods is that they
reduce both encoding complexity and encoding time.
The proposed HEVC encoder optimization methods reduced encoding time by
21.7% and bitrate by 13.4% with insignificant impact on subjective quality evaluations.
These methods can easily be implemented today within HEVC.
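The mismatch between display resolution and visual acuity mentioned above can be illustrated with the standard visual-angle calculation. This is a sketch under stated assumptions rather than the method used in the dissertation: the display density, the viewing distance, and the reference figure of about 60 pixels per degree (one arcminute per pixel, a common approximation of normal acuity) are all assumed here.

```python
import math


def pixels_per_degree(ppi, viewing_distance_in):
    """Pixels covered by one degree of visual angle at the given viewing distance."""
    # Width on the screen, in inches, of one degree of visual angle.
    inches_per_degree = 2.0 * viewing_distance_in * math.tan(math.radians(0.5))
    return inches_per_degree * ppi


# Hypothetical phone display: 326 pixels per inch viewed from 12 inches.
ppd = pixels_per_degree(ppi=326, viewing_distance_in=12)
ACUITY_PPD = 60.0  # assumed limit: roughly one arcminute per pixel

print(f"{ppd:.1f} pixels per degree")
print("finer than the assumed acuity limit:", ppd > ACUITY_PPD)
```

When the display packs more pixels into a degree than the eye is assumed to resolve, some spatial detail cannot be perceived, which is the kind of headroom a perceptual encoder optimization can exploit.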
Model
Digital Document
Publisher
Florida Atlantic University
Description
The main goal of video coding algorithms is to achieve high compression efficiency while
maintaining the quality of the compressed signal at the highest level. The human visual
system is the ultimate receiver of the compressed signal and the final judge of its quality.
This dissertation presents work toward an optimal video compression algorithm based on the
characteristics of our visual system. By modeling phenomena such as backward temporal
masking and motion masking, we developed algorithms that are implemented in
state-of-the-art video encoders. The result of using our algorithms is visually lossless
compression with improved efficiency, as verified by standard subjective quality and
psychophysical tests. Savings in bitrate compared to the High Efficiency Video Coding / H.265
reference implementation are up to 45%.
Model
Digital Document
Publisher
Florida Atlantic University
Description
In the current communications age, the capabilities of mobile devices are increasing. Mobile devices can communicate at data rates of hundreds of Mbps on 4G networks, which enables playback of rich multimedia content comparable to internet and television networks. However, mobile networks need to be spectrum-efficient to be affordable to users. Multimedia Broadcast Multicast Systems (MBMS) is a wireless broadcasting standard that is being drafted to enable multimedia broadcast while remaining spectrum-efficient. Hybrid video coding techniques facilitate low-bitrate transmission but introduce dependencies across frames. Because the mobile environment is error prone, no error correction technique can guarantee error-free transmission. Such errors propagate, degrading quality. With numerous mobile devices sharing a broadcast session, any error-resilient scheme should account for heterogeneous device capabilities and channel conditions. Current research on wireless video broadcasting focuses on network-based techniques such as FEC and retransmissions, which add bandwidth overhead. There is a need for innovative error-resilient techniques that make the video codec robust with minimal bandwidth overhead.
This dissertation introduces novel techniques for MBMS systems. First, robust video structures are proposed in the Periodic Intra Frame based Prediction (PIFBP) and Periodic Anchor Frame based Prediction (PAFBP) schemes. In these schemes, the intra frames or anchor frames serve as reference frames for prediction during the GOP period. The intermediate frames are independent of one another; errors in such frames do not propagate, which provides error resilience. In prior art, the intra block rate is adapted based on channel characteristics for error resilience. This scheme has been generalized to multicasting to address a group of users sharing the same session: the average packet loss is used to determine the intra block rate. This improves the performance of the overall group and strives for consistent performance. The inherent diversity in the broadcasting session can also be used to advantage. Mobile devices capable of accessing a WLAN during the broadcast form an ad hoc network over the WLAN to recover lost packets. New error recovery schemes are proposed, and a comparison of their performance is presented.
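The abstract states that the average packet loss of the group determines the intra block rate but does not give the mapping. As a minimal sketch, assuming the rate is simply the average reported loss clamped to a fixed range, the idea could be expressed as follows (the function name, the clamping bounds, and the loss reports are hypothetical):

```python
def intra_block_rate(packet_loss_rates, min_rate=0.02, max_rate=0.5):
    """Fraction of blocks to intra-code, driven by the group's average packet loss.

    The identity mapping and the clamping bounds are assumptions made for
    illustration; the dissertation's actual mapping is not given in the abstract.
    """
    if not packet_loss_rates:
        raise ValueError("need at least one receiver report")
    avg_loss = sum(packet_loss_rates) / len(packet_loss_rates)
    return min(max(avg_loss, min_rate), max_rate)


# Hypothetical loss reports from three receivers sharing one broadcast session.
reports = [0.01, 0.05, 0.12]
print(f"intra block rate: {intra_block_rate(reports):.3f}")
```

Using the group average rather than the worst receiver keeps the bitrate overhead moderate for everyone, at the cost of weaker protection for the receivers with the highest loss.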