Hurtado, Jose Luis

Relationships
Member of: Graduate College
Person Preferred Name
Hurtado, Jose Luis
Model
Digital Document
Publisher
Florida Atlantic University
Description
Effective decision support plays a vital role in people's daily lives, as well as
in the work of professional practitioners such as health care providers. Without
correct information and timely derived knowledge, a decision is often suboptimal and
may result in significant financial loss or compromised performance. In this
dissertation, we study text mining and topic modeling and propose to use text mining
methods, in combination with topic models, to discover knowledge from texts widely
available from a variety of sources, such as research publications, news, and medical
diagnosis notes, and further employ the discovered knowledge to assist social and
medical decision support. Examples of such decisions include hospital patient
readmission prediction, which is part of a national initiative for health care cost
reduction; academic research topic discovery and trend modeling; and social
preference modeling for friend recommendation in social networks.
To carry out text mining, our research, in Chapter 3, first focuses on
single-document analysis to investigate textual stylometric features for user
profiling and recognition. Our research confirms that, by using properly designed
features, it is possible to identify the author of an article, using a number of
sample articles written by that author as the training data. This study serves as the
basis for asserting that text mining is a powerful tool for capturing knowledge in
texts for better decision making.
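
As a minimal sketch of what stylometric authorship attribution can look like, the Python below computes a few common surface features and assigns a text to the author whose sample articles are closest on average. The feature set, the function-word list, and the nearest-centroid rule are illustrative assumptions; they are not the features or the classifier designed in the dissertation.

```python
import re
from collections import Counter

# Hypothetical, very small function-word list chosen for illustration only.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "it", "is")


def stylometric_features(text):
    """Return a dict of simple style features for one document."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    total = len(words) or 1  # avoid division by zero on empty input
    counts = Counter(words)
    features = {
        "avg_word_len": sum(len(w) for w in words) / total,
        "avg_sentence_len": len(words) / (len(sentences) or 1),
        "type_token_ratio": len(counts) / total,
    }
    for fw in FUNCTION_WORDS:
        features["freq_" + fw] = counts[fw] / total
    return features


def distance(a, b):
    """Euclidean distance between two feature dicts (features are unscaled)."""
    return sum((a[k] - b[k]) ** 2 for k in a) ** 0.5


def attribute(text, author_samples):
    """Pick the author whose mean feature vector is nearest to the text.

    author_samples maps an author name to a list of sample articles.
    """
    target = stylometric_features(text)
    best_author, best_dist = None, None
    for author, samples in author_samples.items():
        vecs = [stylometric_features(s) for s in samples]
        centroid = {k: sum(v[k] for v in vecs) / len(vecs) for k in target}
        d = distance(target, centroid)
        if best_dist is None or d < best_dist:
            best_author, best_dist = author, d
    return best_author
```

In practice the features would likely need scaling and a far larger feature set; the sketch only shows the overall shape of training on an author's sample articles and attributing a new one.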
In Chapter 4, we advance our research from single documents to documents
with interdependency relationships, and propose to model and predict citation
relationships between documents. Given a collection of documents with known linkage
relationships, our research discovers effective features to train prediction models
and predicts the likelihood that two documents are linked by a citation relationship.
This study helps accurately model social network linkage relationships, and can be
used to assist effective decision making for friend recommendation in social
networking and reference recommendation in scientific writing.
In Chapter 5, we advance a topic discovery and trend prediction principle
to discover meaningful topics from a data collection, and further model the
evolution trend of each topic. By proposing techniques to discover topics from text,
and using temporal correlations between trends for prediction, our techniques can be
used to summarize a large collection of documents as meaningful topics, and further
forecast the popularity of a topic in the near future. This study can help design
systems to discover popular topics in social media, and further assist resource
planning and scheduling based on the discovered topics and their evolution trends.
In Chapter 6, we apply both text mining and topic modeling to the
medical domain for effective decision making. The goal is to discover knowledge from
medical notes to predict the risk of a patient being re-admitted in the near future.
Our research focuses on the challenge that re-admitted patients are only a small
portion of the patient population, although they cause significant financial loss. As
a result, the datasets are highly imbalanced, which often results in poor accuracy
for decision making. Our research proposes using latent topic modeling to carry out
localized sampling, and combining models trained from multiple copies of sampled data
for accurate prediction. This study can be directly used to assist hospital
re-admission assessment for early warning and decision support.
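
The idea of combining models trained on multiple sampled copies of imbalanced data can be sketched as follows. This is a simplified stand-in: it draws plain random undersamples of the majority class rather than the topic-model-based localized sampling the dissertation proposes, it uses a toy nearest-centroid classifier, and it assumes each patient is already represented as a numeric feature vector. All function names are hypothetical.

```python
import random


def centroid(rows):
    """Mean vector of a non-empty list of equal-length numeric rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_nearest_centroid(pos, neg):
    """Toy base learner: returns a function mapping a vector to 1 or 0."""
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: 1 if sq_dist(x, cp) < sq_dist(x, cn) else 0


def train_balanced_ensemble(pos, neg, n_models=5, seed=0):
    """Train one model per balanced sample of the data.

    pos holds the rare class (e.g. re-admitted patients), neg the common one.
    Each model sees every rare example plus an equally sized random draw
    from the common class (assumption: plain random undersampling).
    """
    rng = random.Random(seed)
    k = min(len(pos), len(neg))
    models = []
    for _ in range(n_models):
        neg_sample = rng.sample(neg, k)
        models.append(train_nearest_centroid(pos, neg_sample))
    return models


def predict(models, x):
    """Majority vote over the ensemble."""
    votes = sum(m(x) for m in models)
    return 1 if votes * 2 > len(models) else 0
```

Each model is trained on a balanced view of the data, so no single model is dominated by the majority class, and the vote smooths out the information lost in any one sample.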
The text mining and topic modeling techniques investigated in the dissertation
can be applied to many other domains involving texts and social relationships,
towards pattern- and knowledge-based effective decision making.
Model
Digital Document
Publisher
Florida Atlantic University
Description
We survey and compare the different major mechanisms for embedding the relational database language SQL in object-oriented programming languages such as Java and C#, with regard to how much impedance mismatch these embeddings suffer. Here impedance mismatch refers to clarity and performance difficulties that arise because of the nature of the embedding. Because of the central position in the information technology industry of object-oriented programs that access SQL-based relational database systems, reducing impedance mismatch is generally recognized in that industry as an important practical problem. We argue for the suitability of SQL as a database language, and hence for the desirability of keeping SQL as the view provided by a SQL embedding. We make the case that SQLJ, a SQL embedding for Java in which it appears that Java directly supports SQL commands, is the kind of SQL embedding that suffers the least impedance mismatch, when compared with call-level interfaces and object-relational mappings. We propose extensions to SQLJ that would reduce its impedance mismatch even further.
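
To make the notion of impedance mismatch in a call-level interface concrete, here is a small sketch using Python's standard `sqlite3` module as a stand-in for call-level interfaces such as JDBC. The `employee` table, its columns, and the `high_earners` function are hypothetical and not taken from the surveyed work.

```python
import sqlite3


def high_earners(conn, min_salary):
    """Query through a call-level interface.

    The SQL text is an opaque string to the host language, so a misspelled
    table or column name is only detected when the statement is executed,
    and result rows must be unpacked into host-language values by hand.
    """
    cur = conn.execute(
        "SELECT name, salary FROM employee WHERE salary >= ? ORDER BY name",
        (min_salary,),
    )
    return [{"name": name, "salary": salary} for name, salary in cur.fetchall()]


# Hypothetical in-memory table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ana", 120.0), ("Bo", 80.0)],
)
```

An embedding in the SQLJ style would instead let the SQL statement appear directly in the host program, where a translator can check it against the schema before the program runs.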