Relation-based document retrieval for biomedical literature databases
Please use this identifier to cite or link to this item:
http://hdl.handle.net/1860/919
| Title: | Relation-based document retrieval for biomedical literature databases |
| Authors: | Zhou, Xiaohua; Hu, Xiaohua; Lin, Xia; Han, Hyoil; Zhang, Xiaodan |
| Issue Date: | Apr-2006 |
| Publisher: | Springer Verlag |
| Citation: | Proceedings of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining, April 9-12, 2006, Singapore (Lecture Notes in Computer Science 3918: http://www.springerlink.com/link.asp?id=t85674vqg783). Retrieved 6/26/2006 from http://www.ischool.drexel.edu/faculty/thu/My%20Publication/Conference-papers/DASFFA06.pdf. |
| Abstract: | In this paper, we explore the use of term relations in information retrieval
for precision-focused biomedical literature search. A relation is defined as
a pair of terms that are semantically and syntactically related to each
other. Unlike the traditional “bag-of-words” model for documents, our model
represents a document by a set of sense-disambiguated terms and their binary
relations. Since document-level co-occurrence of two terms does not, in many cases,
mean that the document addresses their relationship, the direct use of relations
may improve the precision of very specific searches, e.g. searching for documents
that mention genes regulated by Smad4. For this purpose, we develop a generic
ontology-based approach to extract terms and their relations; a prototype IR
system supporting relation-based search is then built for Medline abstract
search. We then use this novel IR system to improve the retrieval results of all
official runs in TREC-2004 Genomics Track. The experiment shows promising
performance of relation-based IR. The mean P@100 (the precision of the top
100 documents) over all 50 topics is raised from 26.37% (the P@100 of the best
run is 42.10%) to 53.69%, while recall is kept at an acceptable level of
44.31%. The experiment also shows the expressiveness of relations for the representation
of information needs, especially in the area of biomedical literature
full of various biological relations. |
| URI: | http://hdl.handle.net/1860/919 |
| Appears in Collections: | Faculty Research and Publications (IST) |
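The contrast the abstract draws between document-level co-occurrence and explicit term relations, and the P@100 measure it reports, can be illustrated with a minimal sketch. This is not the authors' system: the document layout, the function names (make_doc, cooccurrence_match, relation_match, precision_at_k), and the toy term identifiers are all assumptions made for illustration, and the ontology-based extraction step is not modeled at all.

```python
# Illustrative sketch only: a toy document model, not the paper's implementation.

def make_doc(doc_id, terms, relations):
    """Build a hypothetical document record: a set of terms plus a set of
    unordered term pairs (binary relations) assumed to be already extracted."""
    return {
        "id": doc_id,
        "terms": set(terms),
        "relations": {frozenset(pair) for pair in relations},
    }


def cooccurrence_match(doc, term_a, term_b):
    """Bag-of-words style test: both terms appear somewhere in the document."""
    return term_a in doc["terms"] and term_b in doc["terms"]


def relation_match(doc, term_a, term_b):
    """Relation-based test: the document records the two terms as related."""
    return frozenset((term_a, term_b)) in doc["relations"]


def precision_at_k(ranked_ids, relevant_ids, k):
    """P@k: fraction of the top k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


# Toy data: both documents mention the two terms, but only d1 relates them.
docs = [
    make_doc("d1", ["smad4", "gene_x"], [("smad4", "gene_x")]),
    make_doc("d2", ["smad4", "gene_x"], []),
]

by_cooccurrence = [d["id"] for d in docs if cooccurrence_match(d, "smad4", "gene_x")]
by_relation = [d["id"] for d in docs if relation_match(d, "smad4", "gene_x")]
```

In this toy setting the co-occurrence test returns both documents while the relation test returns only the one that actually links the two terms, which is the precision gain the abstract argues for. P@100 as reported in the abstract corresponds to calling precision_at_k with k=100 on a ranked result list.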