Title: Retrieval using document structure and annotations
Abstract: Successful retrieval of information from text collections requires effective use of the information present in a collection. The structure of documents in the collection, and the relationships among elements within a document and across documents, carry important information about the meaning of these elements. For example, the words present in the title of a web page may contain important clues about that page's content. The text of a link to the web page may also be an important indicator of the page's content.
Researchers have long recognized that structure can be an important indicator of relevance. Yet the majority of prior work is limited to experiments on small test collections and to evaluation on a single retrieval task. These limitations hamper the generality of the conclusions. The recent construction of large and diverse test collections gives us the opportunity to reconsider the general task of retrieval in collections with structure.
This dissertation draws on three retrieval tasks to identify important properties of retrieval systems supporting the use of structure and annotations. We investigate known-item finding of web pages, retrieval of elements from XML articles, and retrieval of answer-bearing sentences as a component of a question-answering system. The retrieval model, an adaptation of the Inference Network model, clarifies the query language and simplifies the process of smoothing using multiple representations. The experiments in this dissertation show state-of-the-art results for these tasks and also provide novel insights into the shape of the parameter space when using mixtures of language models. Our experiments with question answering further show how semantic predicates automatically annotated on a collection can be used to improve a system's ability to retrieve answer-bearing sentences.
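As an illustration of what a mixture of language models over multiple document representations can look like, the following minimal sketch scores a query against a web page represented by its title, body, and incoming link text, smoothed with a collection model. It is not the dissertation's Inference Network adaptation. The function names, the representation names, and the weights are hypothetical choices made for this example only.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum-likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def mixture_log_likelihood(query, representations, collection, weights):
    """Log-likelihood of a query under a linear mixture of language models.

    representations: name -> token list (assumed here: title, body, anchor)
    collection:      token list used for the background model
    weights:         name -> mixture weight, plus a "collection" entry;
                     the weights are assumed to sum to 1
    """
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    models = {name: mle(toks) for name, toks in representations.items()}
    background = mle(collection)
    score = 0.0
    for term in query:
        # Interpolate the per-representation estimates with the background.
        p = weights["collection"] * background.get(term, 0.0)
        for name, model in models.items():
            p += weights[name] * model.get(term, 0.0)
        if p == 0.0:
            return float("-inf")  # term unseen everywhere, even in the collection
        score += math.log(p)
    return score


# Toy data, invented for illustration.
page_a = {
    "title": ["retrieval", "models"],
    "body": ["language", "models", "for", "retrieval"],
    "anchor": ["retrieval", "page"],
}
page_b = {
    "title": ["cooking", "tips"],
    "body": ["retrieval", "of", "recipes"],
    "anchor": ["recipes"],
}
collection = [t for page in (page_a, page_b) for toks in page.values() for t in toks]
weights = {"title": 0.3, "body": 0.3, "anchor": 0.2, "collection": 0.2}

score_a = mixture_log_likelihood(["retrieval"], page_a, collection, weights)
score_b = mixture_log_likelihood(["retrieval"], page_b, collection, weights)
```

In this toy setup the page whose title and link text contain the query term receives the higher score. The mixture weights are the kind of parameters whose space the abstract refers to; the values used here are arbitrary.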
Publication Year: 2010
Publication Date: 2010-01-01
Language: en
Type: article
Access and Citation
Cited By Count: 5