Title: Interobserver agreement: Cohen's kappa coefficient does not necessarily reflect the percentage of patients with congruent classifications.
Abstract: A widely accepted approach to evaluating interrater reliability for categorical responses involves the rating of n subjects by at least 2 raters. Frequently, there are only 2 response categories, such as a positive or negative diagnosis. The same approach is commonly used to assess the concordance of classifications by 2 diagnostic methods. Depending on whether one uses the raw percent agreement or the agreement corrected for that expected by chance, i.e. Cohen's kappa coefficient, one can obtain quite different values. This short communication demonstrates that Cohen's kappa coefficient of agreement between 2 raters or 2 diagnostic methods based on binary (yes/no) responses does not parallel the percentage of patients with congruent classifications. Therefore, it may be of limited value in assessing increases in interrater reliability due to an improved diagnostic method. The percentage of patients with congruent classifications is easier to interpret clinically but does not account for the percentage of agreement expected by chance. We therefore recommend presenting both the percentage of patients with congruent classifications and Cohen's kappa coefficient with 95% confidence limits.
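The abstract's central point can be illustrated with a small numerical sketch. The two 2x2 tables below are hypothetical counts chosen for illustration, not data from the article, and the function name `agreement_stats` is likewise an assumption. The confidence limits use a simple large-sample approximation to the standard error of kappa, which may differ from the method the authors applied.

```python
from math import sqrt


def agreement_stats(a, b, c, d):
    """Percent agreement, Cohen's kappa and approximate 95% limits for a 2x2 table.

    a = both raters positive, b = rater 1 positive only,
    c = rater 2 positive only, d = both raters negative.
    Assumes chance agreement is below 1 (otherwise kappa is undefined).
    """
    n = a + b + c + d
    p_o = (a + d) / n                      # observed proportion of agreement
    p_yes1 = (a + b) / n                   # rater 1 proportion of positives
    p_yes2 = (a + c) / n                   # rater 2 proportion of positives
    # Agreement expected by chance from the marginal proportions
    p_e = p_yes1 * p_yes2 + (1 - p_yes1) * (1 - p_yes2)
    kappa = (p_o - p_e) / (1 - p_e)
    # Simple large-sample approximation to the standard error (an assumption here)
    se = sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return p_o, kappa, (kappa - 1.96 * se, kappa + 1.96 * se)


# Hypothetical tables: both have 90 of 100 patients classified congruently.
balanced = agreement_stats(40, 5, 5, 50)   # positives and negatives about equally common
skewed = agreement_stats(85, 5, 5, 5)      # positives dominate

for label, (p_o, kappa, (lo, hi)) in (("balanced", balanced), ("skewed", skewed)):
    print(f"{label}: agreement={p_o:.0%}, kappa={kappa:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

Under these assumed counts, both tables show 90% congruent classifications, yet kappa is about 0.80 for the balanced table and about 0.44 for the skewed one. This is why reporting both measures together gives a fuller picture than either alone.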
Publication Year: 1997
Publication Date: 1997-03-01
Language: en
Type: article
Indexed In: PubMed
Access and Citation
Cited By Count: 30