Title: Hierarchical Clustering in Medical Document Collections: the BIC-Means Method
Abstract: Hierarchical clustering of text collections is a key problem in document management and retrieval. In partitional hierarchical clustering, which is more efficient than its agglomerative counterpart, the entire collection is split into clusters and the individual clusters are further split until a heuristically motivated termination criterion is met. In this paper, we define the BIC-means algorithm, which applies the Bayesian Information Criterion (BIC) as a domain-independent termination criterion for partitional hierarchical clustering. We evaluate the effectiveness of BIC-means in clustering and retrieval on medical document collections, and we propose a dynamic version of the BIC-Means algorithm for adapting an existing clustering solution to document additions.
Publication Year: 2010
Publication Date: 2010-04-01
Language: en
Type: article
Access and Citation
Cited By Count: 12