Context-based Image Concept Detection and Annotation

Model

Digital Document

Publisher

Florida Atlantic University

Description

Scene understanding attempts to produce a textual description of the visible and
latent concepts in an image in order to convey the real meaning of the scene. Concepts
are objects, events, or relations depicted in an image. To recognize concepts, the
decisions of an object detection algorithm must be extended from visual
similarity to semantic compatibility. Semantically relevant concepts convey the
most consistent meaning of the scene.
Object detectors analyze visual properties (e.g., pixel intensities, texture, color
gradient) of sub-regions of an image to identify objects. The initially assigned
object names must be examined further to ensure that they are compatible with each
other and with the scene. By enforcing inter-object dependencies (e.g., co-occurrence,
spatial and semantic priors) and object-to-scene constraints as background
information, a concept classifier predicts the most semantically consistent set of
names for the discovered objects. The additional background information that describes
concepts is called context.
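
The refinement described above can be illustrated with a minimal sketch. It is not the dissertation's model: it assumes each image region comes with a few candidate names and detector probabilities, that context is a simple pairwise co-occurrence table, and that the two are combined by adding log scores. The function name, the `context_weight` parameter, and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import log


def most_consistent_labels(candidates, cooccurrence, context_weight=1.0):
    """Choose one name per region by combining detector and context scores.

    candidates:   one dict per region, {name: detector probability}
    cooccurrence: {frozenset({a, b}): compatibility in (0, 1]} (assumed given)
    """
    floor = 1e-6  # assumed compatibility for pairs missing from the table
    best_labels, best_score = None, float("-inf")
    # Exhaustive search over joint labelings; only feasible for tiny examples.
    for labels in product(*(c.keys() for c in candidates)):
        # Visual evidence: sum of log detector probabilities.
        score = sum(log(c[name]) for c, name in zip(candidates, labels))
        # Context: sum of log co-occurrence compatibilities over region pairs.
        for i in range(len(labels)):
            for j in range(i + 1, len(labels)):
                pair = frozenset((labels[i], labels[j]))
                score += context_weight * log(cooccurrence.get(pair, floor))
        if score > best_score:
            best_labels, best_score = labels, score
    return list(best_labels)


# Hypothetical detector output for two regions of one image.
candidates = [
    {"boat": 0.55, "car": 0.45},
    {"road": 0.90, "water": 0.10},
]
# Hypothetical co-occurrence priors between names.
cooccurrence = {
    frozenset(("car", "road")): 0.80,
    frozenset(("boat", "water")): 0.90,
    frozenset(("boat", "road")): 0.05,
    frozenset(("car", "water")): 0.05,
}

visual_only = most_consistent_labels(candidates, cooccurrence, context_weight=0.0)
with_context = most_consistent_labels(candidates, cooccurrence)
```

With the context term switched off, the sketch keeps the visually strongest names ("boat", "road"); with it switched on, the co-occurrence prior overrides the weak visual preference and yields "car", "road". Spatial priors and object-to-scene constraints could be added as further score terms in the same way.
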
In this dissertation, a framework for building context-based concept detection is
presented that uses a combination of multiple contextual relationships to refine the
results of the underlying feature-based object detectors and produce the most
semantically compatible concepts.
Besides being unable to capture semantic dependencies, object detectors are
impaired by the high dimensionality of the feature space.
Variations in the image (i.e., quality, pose, articulation, illumination, and occlusion)
can also produce low-quality visual features that reduce the accuracy of the detected
concepts.
The object detectors used in the context-based framework experiments in this
study are based on state-of-the-art generative and discriminative graphical
models. Graphical models make it easy to describe the relationships between model
variables and to characterize their dependencies precisely. The generative
context-based implementations are extensions of Latent Dirichlet Allocation, a
leading topic modeling approach that is very effective at reducing the
dimensionality of the data. The discriminative context-based approach extends
Conditional Random Fields, which allow efficient and precise construction of the
model by specifying and including only the cases that are related to it and
influence it.
The dataset used for training and evaluation is MIT SUN397. The experiments show
an overall 15% increase in annotation accuracy and a 31% improvement in the
semantic saliency of the annotated concepts.

Member of

FAU Theses and Dissertations