Retrospective Detection of Breast Malignancies with Deep Learning in Clinically Negative, Prior Screening Mammograms

Tuesday, Dec. 3 9:25AM - 9:35AM Room: Arie Crown Theater

FDA Discussions may include off-label uses.

Abdul Rahman Diab, Cambridge, MA (Presenter) Employee, DeepHealth, Inc
Jiye G. Kim, PhD, Cambridge, MA (Abstract Co-Author) Employee, DeepHealth, Inc
Mack K. Bandler, MD, Medford, OR (Abstract Co-Author) Nothing to Disclose
A. Gregory Sorensen, MD, Belmont, MA (Abstract Co-Author) Employee, DeepHealth, Inc; Board member, IMRIS Inc; Board member, Siemens AG; Board member, Fusion Healthcare Staffing; Board member, DFB Healthcare Acquisitions, Inc; Board member, inviCRO, LLC
William Lotter, PhD, Cambridge, MA (Abstract Co-Author) Officer, DeepHealth Inc



To evaluate the ability of a deep learning model to detect breast cancer in clinically negative, prior screening mammograms of breast cancer patients.


Screening full-field digital mammography (FFDM) exams from 2011 to 2017 were retrospectively collected under an IRB-approved protocol. Women with a malignancy found on either screening or diagnostic x-ray mammography were identified, and their previous screening mammograms that had been interpreted as normal (BI-RADS 1 or 2) and performed 9 months to 3 years before the index cancer diagnosis were collected. These 'prior' screening mammograms were assigned the label 'malignant'. In addition, screening exams interpreted as BI-RADS 1 or 2, each followed by at least one additional screening exam also interpreted as BI-RADS 1 or 2, were assigned the label 'normal'. The resulting full set consisted of 328 'malignant' cases and 13,540 'normal' cases. For evaluation on this dataset, we used a top-scoring deep learning model from the Digital Mammography DREAM Challenge, which was not trained on data from this institution. The receiver operating characteristic (ROC) curve and the corresponding area under the curve (AUC) were computed.
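The abstract does not state how the AUC was computed. As a minimal sketch, assuming the model outputs one malignancy score per exam, the AUC can be obtained from the rank (Mann-Whitney) statistic: the probability that a randomly chosen 'malignant' exam scores higher than a randomly chosen 'normal' exam, with ties counted as one half. The function name and the toy scores below are hypothetical and are not the study's code or data.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC via the Mann-Whitney U statistic (ties count as 1/2).

    pos_scores: model scores for exams labeled 'malignant' (hypothetical input).
    neg_scores: model scores for exams labeled 'normal' (hypothetical input).
    """
    # Pool all scores, remembering which ones belong to the positive class.
    labeled = sorted([(s, 1) for s in pos_scores] + [(s, 0) for s in neg_scores])
    n = len(labeled)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and labeled[j][0] == labeled[i][0]:
            j += 1
        # Tied scores share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in labeled[i:j])
        i = j
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Toy example with made-up scores: 5 of the 6 positive/negative pairs
# are ordered correctly.
toy_auc = auc_from_scores([0.9, 0.8], [0.1, 0.2, 0.85])
```

The rank formulation gives the same value as integrating the empirical ROC curve, and it scales to a dataset of this size (328 versus 13,540 exams) because it needs only one sort.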


The model achieved an AUC of 0.70 (+/- 0.03). At an operating point of 88.9% specificity - the mean radiologist level according to the Breast Cancer Surveillance Consortium - the model achieved a sensitivity of 35%. Thus, at a recall rate consistent with clinical practice, the model detected 35% of cancer cases using the prior exams. Using this threshold, the earliest cancer detected was in a screening exam 730 days prior to the index diagnosis.
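How the operating point was selected is not described in the abstract. One plausible reading, sketched below under that assumption, is to set the score threshold so that the target fraction of 'normal' exams falls at or below it, and then report the fraction of 'malignant' exams scoring above it. The function name, the flagging rule (score strictly greater than the threshold), and the toy scores are all hypothetical.

```python
import math


def sensitivity_at_specificity(pos_scores, neg_scores, target_specificity):
    """Return (threshold, achieved specificity, sensitivity).

    An exam is flagged when its score is strictly greater than the threshold.
    The threshold is the smallest observed 'normal' score that leaves at
    least `target_specificity` of the 'normal' exams unflagged.
    """
    neg_sorted = sorted(neg_scores)
    # Number of 'normal' exams that must stay unflagged, clamped to a valid index.
    k = math.ceil(target_specificity * len(neg_sorted))
    k = min(max(k, 1), len(neg_sorted))
    threshold = neg_sorted[k - 1]
    specificity = sum(s <= threshold for s in neg_scores) / len(neg_scores)
    sensitivity = sum(s > threshold for s in pos_scores) / len(pos_scores)
    return threshold, specificity, sensitivity


# Toy example with made-up scores: ten 'normal' exams and four 'malignant' exams.
normal_scores = [i / 10 for i in range(10)]
malignant_scores = [0.75, 0.95, 0.30, 0.65]
threshold, spec, sens = sensitivity_at_specificity(
    malignant_scores, normal_scores, target_specificity=0.8
)
```

With the study's data, the same call with `target_specificity=0.889` would correspond to the operating point reported above. Tied scores at the threshold can make the achieved specificity slightly higher than the target, which is why the function returns it alongside the sensitivity.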


At a specificity matching mean radiologist performance, a deep learning model detected malignancies in 35% of clinically negative prior screening exams of women later diagnosed with breast cancer.


AI-assisted screening mammography has the potential to help physicians detect breast malignancies earlier, which could ultimately improve prognosis.
