Title: Information retrieval through textual annotations
Abstract: Many well-known information retrieval models rank documents using scores derived from the query and document text. Some other models integrate query-independent evidence, such as indegree or URL structure, typically as prior probabilities. This thesis proposes a model based on a third category of evidence, external textual annotations. The model considers both the degree of match between queries and annotations and the degree of association between annotations and documents. The annotation retrieval model accommodates anchortext as well as additional sources: click-associated queries, folksonomy ‘tags’ and microblogs. The characteristics of these types of annotations and their potential for use in information retrieval are studied. In previous work, annotations have been appended to the document text or treated as a field of the document. A new index structure, designed to support the efficient evaluation of queries with the new model, is evaluated. Secondary storage and query processing resources required for the annotations are reduced while preserving result quality. A useful source of annotations is anchortext, but anchortext can be gathered only after links have been created. Click-associated queries and folksonomy tags also take time to accumulate and may not provide up-to-the-minute evidence. However, microblog posts, by their nature, relate to contemporary events and may provide a good method of finding new documents on the web. A method of collecting and conducting a new-web search system is proposed. The contributions of the thesis are as follows. First, an investigation into the characteristics of the different types of annotations, particularly folksonomy tags and microblogs. Second, a model and investigation of a practical data structure for utilising annotation data in a search system. Third, an evaluation method for real-world retrieval systems. Finally, a method of collecting evidence for new-web search based on annotations.
Publication Year: 2012
Publication Date: 2012-01-01
Language: en
Type: dissertation
Access and Citation
Cited By Count: 1