iDEA: DREXEL LIBRARIES E-REPOSITORY AND ARCHIVES
Generative topic modeling in image data mining and bioinformatics studies
Details
Title
Generative topic modeling in image data mining and bioinformatics studies
Author(s)
Chen, Xin
Advisor(s)
Hu, Xiaohua
Keywords
Information science; Data mining; Bioinformatics
Date
2012-12
Publisher
Drexel University
Thesis
Ph.D., Information Systems -- Drexel University, 2012
Abstract
Probabilistic topic models have been developed for applications in various domains, such as text mining, information retrieval, computer vision, and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model is proposed for co-existing image features and annotations, in which new latent variables are introduced to allow for more flexible sampling of word topics and visual topics. A perspective hierarchical Dirichlet process (pHDP) model is proposed for modeling user-tagged images; it associates image features with image tags and incorporates the user’s perspectives into the image tag generation process. It is also shown that, in mining large-scale text corpora of natural language descriptions, the relations between semantic visual attributes and object categories can be encoded as Must-Links and Cannot-Links, which can be represented by a Dirichlet-Forest prior. Novel generative topic models are also introduced to meta-genomics studies. The experimental results show that the generative topic model can be used to model the taxon abundance information obtained by the homology-based approach and to study the microbial core. They also show that latent topic modeling can be used to characterize core and distributed genes within a species and to correlate similarities between genes and their functions. A further study on the functional elements derived from the non-redundant CDs catalogue shows that the configuration of functional groups encoded in the gene-expression data of meta-genome samples can be inferred by applying probabilistic topic modeling to functional elements. Furthermore, an extended HDP model is introduced to infer functional basis from detected enterotypes.
The latent topics estimated from human gut microbial samples are supported by recent discoveries in fecal microbiota studies, which demonstrates the effectiveness of the proposed models.
URI
http://hdl.handle.net/1860/3969
In Collections
Theses, Dissertations, and Projects