Drexel University Home Pagewww.drexel.edu DREXEL UNIVERSITY LIBRARIES HOMEPAGE >>
iDEA DREXEL ARCHIVES >>

iDEA: Drexel E-repository and Archives > Drexel Theses and Dissertations > Drexel Theses and Dissertations > Feasibility of using citations as document summaries

Please use this identifier to cite or link to this item: http://hdl.handle.net/1860/288

Title: Feasibility of using citations as document summaries
Authors: Hand, Jeff
Keywords: Information science;Content analysis (Communication);Abstracting
Issue Date: 26-May-2004
Abstract: The purpose of this research is to establish whether it is feasible to use citations as document summaries. People are good at creating and selecting summaries and are generally the standard for evaluating computer generated summaries. Citations can be characterized as concept symbols or short summaries of the document they are citing. Similarity metrics have been used in retrieval and text summarization to determine how alike two documents are. Similarity metrics have never been compared to what human subjects think are similar between two documents. If similarity metrics reflect human judgment, then we can mechanize the selection of citations that act as short summaries of the document they are citing. The research approach was to gather rater data comparing document abstracts to citations about the same document and then to statistically compare those results to several document metrics; frequency count, similarity metric, citation location and type of citation. There were two groups of raters, subject experts and non-experts. Both groups of raters were asked to evaluate seven parameters between abstract and citations: purpose, subject matter, methods, conclusions, findings, implications, readability, and understandability. The rater was to identify how strongly the citation represented the content of the abstract, on a five point likert scale. Document metrics were collected for frequency count, cosine, and similarity metric between abstracts and associated citations. In addition, data was collected on the location of the citations and the type of citation. Location was identified and dummy coded for introduction, method, discussion, review of the literature and conclusion. Citations were categorized and dummy coded for whether they refuted, noted, supported, reviewed, or applied information about the cited document. The results show there is a relationship between some similarity metrics and human judgment of similarity.
URI: http://dspace.library.drexel.edu/handle/1860/288
Appears in Collections:Drexel Theses and Dissertations

Files in This Item:

File Description SizeFormat
hand_jeff_thesis.pdf1.16 MBAdobe PDFView/Open
View Statistics

Items in iDEA are protected by copyright, with all rights reserved, unless otherwise indicated.

 

Valid XHTML 1.0! iDEA Software Copyright © 2002-2010  Duraspace - Feedback