RSNA 2007 

Abstract Archives of the RSNA, 2007


SSE01-03

Calculating Mammography Performance at the Examination versus Finding Level Significantly Affects Outcome Measures

Scientific Papers

Presented on November 26, 2007
Presented as part of SSE01: Breast Imaging (Mammography)

Participants

Elizabeth S. Burnside MD, MPH, Presenter: Nothing to Disclose
Jagpreet Chhatwal MS, Abstract Co-Author: Nothing to Disclose
Kazuhiko Shinki, Abstract Co-Author: Nothing to Disclose
Katherine Anne Shaffer MD, Abstract Co-Author: Nothing to Disclose
Lonie R. Salkowski MD, Abstract Co-Author: Nothing to Disclose
Jason P. Fine, Abstract Co-Author: Nothing to Disclose
Oguzhan Alagoz PhD, Abstract Co-Author: Nothing to Disclose
et al, Abstract Co-Author: Nothing to Disclose

PURPOSE

Conventional mammography performance analysis is conducted at the examination level, whereas radiologists interpret largely at the finding level when deciding whether to biopsy. We sought to determine how the level of analysis used to calculate mammography practice performance metrics influences outcome measures.

METHOD AND MATERIALS

The institutional review board approved our study and determined it was exempt from requiring informed consent. We used structured reports from 45,845 mammography examinations on 17,784 patients from 4/5/1999 to 2/9/2004, capturing 61,930 total mammography findings. Each mammography finding was coded using BI-RADS categories. Using the National Mammography Database format, these data were matched with the Wisconsin Cancer Reporting System, which served as the reference standard. We aggregated all of the reported findings for each mammography examination into an overall examination BI-RADS code using the priority order recommended by the ACR: 5 > 4 > 0 > 3 > 2 > 1. BI-RADS categories 0, 4 and 5 were considered positive. We compared sensitivity and specificity between the examination and finding levels with the z-test. To account for correlations within patients, we calculated standard errors with 250 bootstrap samples.
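The aggregation rule and the positivity threshold described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names (`aggregate_exam_birads`, `is_positive`, `sensitivity_specificity`) and the input layout (parallel lists of booleans) are hypothetical, and only BI-RADS codes 0 through 5 are handled because those are the ones named in the abstract.

```python
# Priority order for collapsing finding-level BI-RADS codes into one
# examination-level code, as stated in the abstract: 5 > 4 > 0 > 3 > 2 > 1.
BIRADS_PRIORITY = [5, 4, 0, 3, 2, 1]

# Categories treated as a positive call, as stated in the abstract.
POSITIVE_CATEGORIES = {0, 4, 5}


def aggregate_exam_birads(finding_codes):
    """Return the highest-priority BI-RADS code among an exam's findings.

    Hypothetical helper. Codes outside 0-5 raise ValueError.
    """
    if not finding_codes:
        raise ValueError("an examination needs at least one finding")
    # The code with the smallest position in BIRADS_PRIORITY wins.
    return min(finding_codes, key=BIRADS_PRIORITY.index)


def is_positive(code):
    """True when a BI-RADS code counts as a positive call."""
    return code in POSITIVE_CATEGORIES


def sensitivity_specificity(calls, truths):
    """Compute (sensitivity, specificity) from parallel boolean lists.

    calls[i]  -- True if unit i (an exam or a finding) was called positive
    truths[i] -- True if unit i matched a cancer in the reference standard
    Assumes at least one cancer and one non-cancer unit are present.
    """
    pairs = list(zip(calls, truths))
    tp = sum(1 for c, t in pairs if c and t)
    fn = sum(1 for c, t in pairs if not c and t)
    tn = sum(1 for c, t in pairs if not c and not t)
    fp = sum(1 for c, t in pairs if c and not t)
    return tp / (tp + fn), tn / (tn + fp)
```

For example, an examination with findings coded 2, 0 and 3 aggregates to category 0 and is therefore positive, while one with findings coded 1 and 2 aggregates to category 2 and is negative. The same `sensitivity_specificity` function can be applied once to examination-level units and once to finding-level units to obtain the two sets of metrics being compared.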

RESULTS

Sensitivity was significantly better at the examination level than at the finding level (88.9% versus 85.4%, P = 0.011). In contrast, specificity was significantly worse at the examination level than at the finding level (86.6% versus 88.0%, P < 0.001).
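One way a comparison like this could be computed is a patient-level (cluster) bootstrap followed by a z-test, sketched below for sensitivity. The abstract does not give the exact procedure, so everything here is an assumption: the data layout (one dict per patient with hypothetical keys `"exams"` and `"findings"`, each a list of `(positive_call, cancer)` pairs), the choice to bootstrap the difference in sensitivity, and the two-sided normal P value. Only the resampling of whole patients and the default of 250 replicates follow the text.

```python
import random
from statistics import NormalDist, stdev


def sensitivity(pairs):
    """Sensitivity from (positive_call, cancer) pairs; None if no cancers."""
    calls_on_cancers = [call for call, cancer in pairs if cancer]
    if not calls_on_cancers:
        return None
    return sum(calls_on_cancers) / len(calls_on_cancers)


def sensitivity_difference(patients):
    """Examination-level minus finding-level sensitivity, or None."""
    exam_pairs = [pair for pt in patients for pair in pt["exams"]]
    finding_pairs = [pair for pt in patients for pair in pt["findings"]]
    exam_sens = sensitivity(exam_pairs)
    finding_sens = sensitivity(finding_pairs)
    if exam_sens is None or finding_sens is None:
        return None
    return exam_sens - finding_sens


def cluster_bootstrap_z(patients, n_boot=250, seed=0):
    """Return (difference, standard error, z, two-sided p).

    Patients are resampled with replacement as whole units, so all exams
    and findings from one patient stay together in each replicate.
    """
    observed = sensitivity_difference(patients)
    if observed is None:
        raise ValueError("no cancers in the data")
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample = rng.choices(patients, k=len(patients))
        diff = sensitivity_difference(resample)
        if diff is not None:  # skip replicates that contain no cancers
            diffs.append(diff)
    if len(diffs) < 2:
        raise ValueError("too few usable bootstrap replicates")
    se = stdev(diffs)
    if se == 0:
        raise ValueError("bootstrap standard error is zero")
    z = observed / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return observed, se, z, p
```

Resampling patients rather than individual examinations or findings is what keeps the within-patient correlation intact in each replicate. The same structure would apply to specificity by swapping in a specificity function.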

CONCLUSION

The conventional method of calculating performance metrics at the examination rather than the finding level appears to inflate sensitivity at the cost of specificity. A proportion of positive examinations may identify findings that do not correspond to a subsequently diagnosed breast cancer. Such a case would be counted as a true positive at the examination level but more appropriately as a false negative at the finding level.

CLINICAL RELEVANCE/APPLICATION

Further study of mammography audit methodology is warranted to expose possible biases. Given these results, methods used to calculate performance metrics should be explicitly documented.

Cite This Abstract

Burnside, E, Chhatwal, J, Shinki, K, Shaffer, K, Salkowski, L, Fine, J, Alagoz, O, et al. Calculating Mammography Performance at the Examination versus Finding Level Significantly Affects Outcome Measures. Radiological Society of North America 2007 Scientific Assembly and Annual Meeting, November 25 - November 30, 2007, Chicago IL. http://archive.rsna.org/2007/5007703.html