RSNA 2019

Abstract Archives of the RSNA, 2019


SSM07-01

Can AI Outperform a Junior Resident? Comparison of Deep Neural Network to First-Year Radiology Residents for Identification of Pneumothorax

Wednesday, Dec. 4 3:00PM - 3:10PM Room: N228



Awards
Trainee Research Prize - Resident

Participants
Paul H. Yi, MD, Baltimore, MD (Presenter) Nothing to Disclose
Tae Kyung Kim, Baltimore, MD (Abstract Co-Author) Nothing to Disclose
Alice Yu, MD, Baltimore, MD (Abstract Co-Author) Nothing to Disclose
Bradford Bennett, MD, Baltimore, MD (Abstract Co-Author) Nothing to Disclose
John Eng, MD, Cockeysville, MD (Abstract Co-Author) Nothing to Disclose
Cheng Ting Lin, MD, Baltimore, MD (Abstract Co-Author) Nothing to Disclose

For information about this presentation, contact:

pyi10@jhmi.edu

PURPOSE

To develop a deep learning system for identification of pneumothorax and compare its performance to that of two 1st-year radiology residents.

METHOD AND MATERIALS

We obtained 112,120 frontal chest radiographs (CXRs) from the NIH ChestX-ray 14 database, of which 4360 cases (4%) had been labeled as pneumothorax by natural language processing. We used 111,494 CXRs to train and validate a ResNet-152 deep convolutional neural network (DCNN), pretrained on ImageNet, to identify pneumothorax. DCNN testing was performed on a hold-out set of 602 CXRs (176 with pneumothorax and 426 without), whose ground truth was determined by re-interpretation by a cardiothoracic radiologist with 5 years of post-fellowship experience; images were presented at 1024 x 1024 resolution and included a mix of subtle and more obvious pneumothoraces. Two 1st-year radiology residents (PGY-2) independently evaluated the same 602 test CXRs for the presence of pneumothorax using a 6-point Likert scale reflecting confidence levels ranging from low to intermediate to high. Receiver operating characteristic (ROC) curves were generated for the DCNN and the 2 residents, and the area under the curve (AUC) was calculated to evaluate test performance. AUCs were compared using the DeLong nonparametric method (significance defined as p<0.05).
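
As an illustration of how an AUC can be obtained from ordinal confidence ratings such as the Likert scores described above, the following is a minimal sketch, not the authors' code. It uses the pairwise (Mann-Whitney) formulation of the empirical AUC; the function name and the toy data are hypothetical and are not taken from the study. The DeLong comparison of correlated AUCs is not reproduced here.

```python
from itertools import product


def auc_from_ratings(labels, ratings):
    """Empirical ROC AUC: the fraction of (positive, negative) case pairs in
    which the positive case received the higher rating, counting ties as 0.5."""
    pos = [r for y, r in zip(labels, ratings) if y == 1]
    neg = [r for y, r in zip(labels, ratings) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = pneumothorax, 0 = none,
# ratings on a 6-point confidence scale.
labels = [1, 1, 1, 0, 0, 0, 0]
ratings = [6, 4, 3, 3, 2, 1, 1]
toy_auc = auc_from_ratings(labels, ratings)
print(f"toy AUC = {toy_auc:.3f}")
```

The same function applies unchanged to continuous DCNN output probabilities, since only the rank ordering of the scores matters for the AUC.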

RESULTS

The best-performing DCNN achieved an AUC of 0.841 for identification of pneumothorax at a rate of 1980 images/minute. In contrast, both 1st-year residents achieved significantly higher AUCs of 0.942 and 0.905 (p<0.01 for both compared to the DCNN; Figure 1), but at a slower rate of 2 images/minute.
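
To put the reported interpretation rates in perspective, the short calculation below is a sketch that takes the rates above at face value. The resident rate is presumably rounded, so the ratio is only approximate. The variable names are illustrative; the 602-image count is the hold-out set size stated in the methods.

```python
# Rates as reported in the results; the resident rate is presumably rounded.
dcnn_rate = 1980      # images per minute
resident_rate = 2     # images per minute
n_test = 602          # size of the hold-out test set

speedup = dcnn_rate / resident_rate          # ratio of interpretation rates
dcnn_minutes = n_test / dcnn_rate            # time for the DCNN to read the set
resident_minutes = n_test / resident_rate    # time for one resident to read it

print(f"speed ratio: {speedup:.0f}x")
print(f"DCNN: {dcnn_minutes * 60:.0f} s, resident: {resident_minutes / 60:.1f} h")
```

With these rounded figures the ratio comes to about 990, that is, on the order of 1000x: roughly 18 seconds for the DCNN versus about 5 hours for a resident to read the full test set.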

CONCLUSION

Our DCNN for pneumothorax identification achieved significantly lower test AUC than two 1st-year radiology residents. However, the DCNN was able to interpret images >1000x as fast. Further work is warranted to compare the relative performance of AI to radiologists of varying experience levels, and to weigh the relative benefits of image interpretation speed against accuracy, particularly for use in time-sensitive settings like the Emergency Department.

CLINICAL RELEVANCE/APPLICATION

1st-year radiology residents outperformed a deep learning system for pneumothorax detection, but the deep learning system interpreted images >1000x faster.
