Title: Evaluating a Cross-Language Semantically Enriched Search Engine
Abstract: This paper addresses the problem of a user who can read or use documents written in a given language but is not fluent enough in that language to choose the query terms needed to find them. The design of Cross-Language Information Retrieval systems dates back to 1969, when Gerard Salton enhanced his SMART system to retrieve documents in multiple languages, English and Spanish; however, the translation process is still considered a challenging problem. This paper evaluates a Cross-Language search engine that uses Natural Language Processing techniques to improve the search for documents available in two languages, English and Spanish. The research is implemented and evaluated on a real platform, HyperManyMedia, at Western Kentucky University. The implementation of the Cross-Language search engine combines (1) a Thesaurus-based Approach and (2) a Corpus-based Approach. In the Thesaurus-based Approach, we use a simple bilingual listing of terms, phrases, concepts, and subconcepts, in which the hierarchical structure of the ontology defines the relationships between concepts and subconcepts. We also use a specific terminology that captures the domain of E-learning; these terms are associated with college names, course names, and lecture names, which are presented in both languages. In the Corpus-based Approach, we use the Term Vector Translation approach; the goal is to find statistical information about term usage across the two languages, using techniques that map sets of term weights from English to Spanish and vice versa.
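The Term Vector Translation idea mentioned in the abstract, mapping a set of term weights from one language to the other, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the lexicon entries, the translation weights, and the function name `translate_term_vector` are all hypothetical, chosen only to show the shape of the mapping.

```python
from collections import defaultdict

# Hypothetical bilingual lexicon (an assumption for illustration):
# English term -> {Spanish term: translation weight}.
LEXICON = {
    "course": {"curso": 1.0},
    "lecture": {"conferencia": 0.6, "clase": 0.4},
    "history": {"historia": 1.0},
}


def translate_term_vector(vector, lexicon):
    """Map a weighted source-language term vector into the target language."""
    translated = defaultdict(float)
    for term, weight in vector.items():
        # Terms missing from the lexicon are carried over unchanged,
        # which keeps proper nouns and untranslatable terms searchable.
        candidates = lexicon.get(term, {term: 1.0})
        for target, prob in candidates.items():
            # Spread the source weight across the candidate translations.
            translated[target] += weight * prob
    return dict(translated)


# An English query expressed as term weights.
query = {"history": 2.0, "lecture": 1.0}
spanish_query = translate_term_vector(query, LEXICON)
```

In this sketch an ambiguous term such as "lecture" spreads its weight over several Spanish candidates rather than committing to one translation. In a corpus-based setting those weights would presumably be estimated from term-usage statistics rather than written by hand.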
Publication Year: 2010
Publication Date: 2010-01-01
Language: en
Type: article
Indexed In: crossref
Access and Citation
Cited By Count: 2