Title: Subject classification in the Oxford English Dictionary
Abstract: The Oxford English Dictionary is a valuable source of lexical information and a rich testing ground for mining highly structured text. Each entry is organized into a hierarchy of senses, which include definitions, labels and cited quotations. Subject labels distinguish the subject classification of a sense; for example, they signal how a word may be used in anthropology, music or computing. Unfortunately, subject labeling in the dictionary is incomplete. To overcome this incompleteness, we attempt to classify the senses (i.e., definitions) in the dictionary by their subjects, using the citations as an information guide. We report on four different approaches: k nearest neighbors, a standard classification technique; term weighting, an information retrieval method dealing with text; naive Bayes, a probabilistic method; and expectation maximization, an iterative probabilistic method. Experimental performance of these methods is compared based on standard classification metrics.
Publication Year: 2002
Publication Date: 2002-11-14
Language: en
Type: article
Indexed In: Crossref
Access and Citation
Cited By Count: 13