Title: Naive (Bayes) at forty: The independence assumption in information retrieval
Abstract: The naive Bayes classifier, currently experiencing a renaissance ] in machine learning, has long been a core technique in information retrieval. We review some of the variations of naive Bayes models used for text retrieval and classification, focusing on the distributional assumptions made about word occurrences in documents.