Abstract: As participants in this CLEF evaluation campaign, our first objective is to propose and evaluate various indexing and search strategies for the CHiC corpus in order to compare retrieval effectiveness across different IR models. Our second objective is to measure the relative merit of various stemming strategies for the French and English monolingual tasks in the cultural heritage (CH) context. Our third objective is to assess the effectiveness of query translation methods in bilingual retrieval. To do so, we evaluated the CHiC test collections using Okapi and various IR models derived from the Divergence from Randomness (DFR) paradigm, together with the dtu-dtn vector-space model. We also evaluated different pseudo-relevance feedback approaches. In the bilingual task, we searched the English corpus using the French and German topics, with two different translations for each of them. For both English and French, we find that word-based indexing with our light stemming procedure yields better retrieval effectiveness than the other strategies. When stemming was omitted, the performance variations were relatively small, yet for the French corpus the results were better than with a light stemmer. At the bilingual level, the results show that combining translation resources gives better results than using a single source.
Publication Year: 2012
Publication Date: 2012-01-01
Language: en
Type: article
Access and Citation
Cited By Count: 4