

Please use this identifier to cite or link to this item: http://hdl.handle.net/1860/2013

Title: Data mining and predictive modeling of biomolecular network from biomedical literature databases
Authors: Hu, Xiaohua
Wu, Daniel
Keywords: Biomolecular Network;Semisupervised Learning;Scale-Free Network;Information Extraction;Biological Complexes (Communities)
Issue Date: Apr-2007
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Citation: IEEE/ACM Transactions on Computational Biology and Bioinformatics, 4(2): pp. 251-263.
Abstract: In this paper, we present Bio-IEDM (Biomedical Information Extraction and Data Mining), a novel approach that integrates text mining and predictive modeling to analyze biomolecular networks derived from biomedical literature databases. Our method consists of two phases. In phase 1, we describe an efficient semisupervised learning approach that automatically extracts biological relationships, such as protein-protein and protein-gene interactions, from biomedical literature databases in order to construct the biomolecular network. The method learns patterns from a few user-supplied seed tuples and then uses the discovered patterns to extract new tuples from the literature. The derived biomolecular network forms a large scale-free network graph. In phase 2, we present a novel clustering algorithm that analyzes the biomolecular network graph to identify biologically meaningful subnetworks (communities). The clustering algorithm takes the characteristics of scale-free network graphs into account and is based on the local density of each vertex and its neighborhood functions, which allows it to find more meaningful clusters at different density levels. The experimental results indicate that our approach is very effective in extracting biological knowledge from a huge collection of biomedical literature. The integration of data mining and information extraction provides a promising direction for analyzing biomolecular networks.
URI: http://dx.doi.org/10.1109/TCBB.2007.070211
Appears in Collections:Faculty Research and Publications (IST)
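
Illustrative sketch (not taken from the paper): the two phases described in the abstract can be pictured roughly as below. This is a minimal Python sketch under stated assumptions, not the authors' Bio-IEDM implementation. Phase 1 is approximated by a simple bootstrapping loop in which a "pattern" is just the literal text found between the two entities of a known tuple. Phase 2 is reduced to a single local-density score per vertex, assumed here to be the fraction of possible edges present among a vertex and its neighbors. All function names, the pattern representation, and the density definition are hypothetical.

```python
import re
from collections import defaultdict


def learn_patterns(sentences, tuples):
    """Assumed pattern form: the literal text between the two entities of a known tuple."""
    patterns = set()
    for a, b in tuples:
        rx = re.compile(re.escape(a) + r"(.+?)" + re.escape(b))
        for s in sentences:
            m = rx.search(s)
            if m:
                patterns.add(m.group(1))
    return patterns


def extract_tuples(sentences, patterns):
    """Apply each learned pattern to every sentence and collect the entity pairs it links."""
    found = set()
    for p in patterns:
        rx = re.compile(r"(\w+)" + re.escape(p) + r"(\w+)")
        for s in sentences:
            for m in rx.finditer(s):
                found.add((m.group(1), m.group(2)))
    return found


def bootstrap(sentences, seeds, max_rounds=5):
    """Alternate pattern learning and tuple extraction until no new tuples appear."""
    tuples = set(seeds)
    for _ in range(max_rounds):
        new = extract_tuples(sentences, learn_patterns(sentences, tuples))
        if new <= tuples:  # nothing new was extracted, so stop
            break
        tuples |= new
    return tuples


def build_graph(tuples):
    """Undirected adjacency sets built from the extracted interaction tuples."""
    graph = defaultdict(set)
    for a, b in tuples:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def local_density(graph, v):
    """Assumed density: edges present among v and its neighbors / edges possible."""
    nodes = graph[v] | {v}
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(1 for a in nodes for b in graph[a] if b in nodes) / 2
    return edges / (n * (n - 1) / 2)
```

Starting from a single seed tuple such as ("RAD51", "BRCA2"), calling bootstrap on a handful of sentences lets a pattern learned in one round expose tuples that yield further patterns in the next, which is the semisupervised idea the abstract describes. A real system would need entity recognition, pattern generalization, and confidence scoring, none of which is modeled here.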

Files in This Item:

File: 2007005058.pdf
Size: 3.75 MB
Format: Adobe PDF

Items in iDEA are protected by copyright, with all rights reserved, unless otherwise indicated.

