Title: Influencing clinicians and healthcare managers: can ROC be more persuasive?
Abstract: Receiver Operating Characteristic (ROC) analysis provides a reliable and cost-effective tool for measuring performance without the need for full clinical trials. However, when ROC analysis shows that performance is statistically superior in one condition compared with another, it is difficult to relate this result to effects in practice, or even to determine whether it is clinically significant. In this paper we present two concurrent analyses, ROC methods alongside single-threshold recall rate data, and suggest that reporting both provides complementary information. Four mammographers read 160 difficult cases (41% malignant) twice, once with and once without prior mammograms. Lesion location and probability of malignancy were reported for each case and analyzed using JAFROC. Concurrently, each participant chose recall or return to screen for each case. JAFROC analysis showed that the presence of prior mammograms improved performance (p<.05). Single-threshold data showed a trend towards a 26% increase in the number of false positive recalls without prior mammograms (p=.056). If this trend were present throughout the NHS Breast Screening Programme, then discarding prior mammograms would correspond to an increase in recall rate from 4.6% to 5.3%, and 12,414 extra women recalled annually for assessment. Whilst ROC methods account for all possible thresholds of recall and have higher power, providing a single-threshold example of false positive, false negative, and recall rates when reporting results could be more influential for clinicians. This paper discusses whether this is a useful additional method of presenting data, or whether it is misleading and inaccurate.
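
As a rough illustration of how the programme-level figures relate to one another, the arithmetic can be sketched as follows. This is a back-of-envelope check under stated assumptions, not the authors' calculation: the function names are hypothetical, the two recall rates are the rounded values quoted in the abstract, and the screened population is not given in the abstract but is back-calculated here from the quoted 12,414 extra recalls.

```python
def extra_recalls(n_screened: float, base_rate: float, new_rate: float) -> float:
    """Additional women recalled when the recall rate rises from base_rate to new_rate."""
    return n_screened * (new_rate - base_rate)


def implied_screened(extra: float, base_rate: float, new_rate: float) -> float:
    """Screened population implied by a given number of extra recalls (assumption: linear scaling)."""
    return extra / (new_rate - base_rate)


# Figures quoted in the abstract; the rates are rounded to one decimal place.
BASE_RATE = 0.046      # recall rate with prior mammograms
NEW_RATE = 0.053       # projected recall rate without prior mammograms
EXTRA_RECALLS = 12_414 # extra women recalled annually

# Not stated in the abstract: derived here purely for illustration.
n_implied = implied_screened(EXTRA_RECALLS, BASE_RATE, NEW_RATE)

print(f"Implied women screened per year: {n_implied:,.0f}")
print(f"Relative rise in recall rate: {NEW_RATE / BASE_RATE - 1:.1%}")
```

Because the quoted rates are rounded, the implied screened population is only approximate; the unrounded rates would be needed to recover the exact figure.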