Participants
Eric Wu, Cambridge, MA (Presenter) Employee, DeepHealth, Inc
Kevin Wu, Cambridge, MA (Abstract Co-Author) Employee, DeepHealth, Inc
A. Gregory Sorensen, MD, Belmont, MA (Abstract Co-Author) Employee, DeepHealth, Inc; Board member, IMRIS Inc; Board member, Siemens AG; Board member, Fusion Healthcare Staffing; Board member, DFB Healthcare Acquisitions, Inc; Board member, inviCRO, LLC
William Lotter, PhD, Cambridge, MA (Abstract Co-Author) Officer, DeepHealth Inc
eric.wu@deep.health
PURPOSE
Machine learning has shown great promise for cancer detection in x-ray mammography; however, these approaches typically depend on large numbers of both malignant and normal examples. Data collection is challenging in screening applications, where normal examples greatly outnumber abnormal ones. This imbalance can cause overfitting and under-utilization of the available data, and thus hinder ultimate performance. Here, we explore generative adversarial networks (GANs) as a data augmentation strategy: we use them to synthesize and remove lesions in mammogram images, and add the resulting images to the original training set.
METHOD AND MATERIALS
We started with the Optimam Mammography Image Database, a publicly available full-field digital mammography (FFDM) dataset from the UK. We used 16000 images for training (800 with cancer ROIs), 2400 for validation (120 with cancer ROIs), and 6000 for testing (800 with cancer ROIs). We created a custom GAN model to synthesize lesions onto (5000 masses and 5000 calcifications), or remove lesions from (5000 normals), random patches cropped from mammograms. We then trained a ResNet-50 neural network model using a sampling proportion of 50% synthetic data and 50% real data, and evaluated its performance on entirely real data from the testing dataset. Performance was quantified using the area under the receiver operating characteristic curve (AUROC). To determine whether the synthetic data affected performance, we compared this model, trained on both real and synthetic data, to a baseline model trained only on real data.
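The 50% synthetic / 50% real sampling proportion can be illustrated with a minimal sketch in plain Python. This is not the authors' training code: the function name `mixed_batches`, its parameters, and the choice to draw each example independently (rather than fixing exact per-batch counts) are assumptions made for illustration.

```python
import random


def mixed_batches(real, synthetic, batch_size, n_batches,
                  synthetic_fraction=0.5, seed=0):
    """Yield training batches that mix real and synthetic patches.

    Hypothetical helper: each example in a batch is drawn from the
    synthetic pool with probability `synthetic_fraction`, otherwise from
    the real pool. Sampling is with replacement.
    """
    if not real or not synthetic:
        raise ValueError("both pools must be non-empty")
    rng = random.Random(seed)
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            # Pick the source pool first, then a random example from it.
            pool = synthetic if rng.random() < synthetic_fraction else real
            batch.append(rng.choice(pool))
        yield batch
```

Drawing the pool per example keeps the expected proportion at 50% regardless of how many real and synthetic patches exist, which matters when the two pools differ in size.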
RESULTS
The classifier trained on the GAN-augmented dataset achieved an AUROC of 0.853 on the test set of real data, compared to an AUROC of 0.829 for the model trained only on real data, a difference of 0.024 (p < 1e-8). Visual inspection of the GAN outputs suggests that the GAN is indeed capable of realistically inserting and removing lesions in the mammogram patches.
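For readers who want to reproduce this kind of comparison, the sketch below computes AUROC from its rank interpretation and estimates a paired bootstrap interval for the difference between two models scored on the same test cases. The abstract does not state which significance test produced the reported p-value, so the bootstrap here is an assumption, as are the function names and parameters.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a
    random negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_difference(labels, scores_a, scores_b,
                               n_resamples=1000, seed=0):
    """Return the observed AUROC difference (A minus B) and an
    approximate 95% percentile interval from a paired bootstrap."""
    observed = auroc(labels, scores_a) - auroc(labels, scores_b)
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_resamples:
        # Resample test cases with replacement, keeping both models'
        # scores paired on the same cases.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUROC is undefined without both classes
        a = auroc(y, [scores_a[i] for i in idx])
        b = auroc(y, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    low = diffs[int(0.025 * n_resamples)]
    high = diffs[int(0.975 * n_resamples) - 1]
    return observed, (low, high)
```

An interval that excludes zero is consistent with a real difference between the two models. The pairwise AUROC loop is quadratic in the number of cases, so a rank-based implementation would be preferable for a test set of several thousand images.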
CONCLUSION
Adding GAN-generated synthetic data to the training set improved performance over a model trained only on real data. This suggests that data augmentation with appropriately designed GANs could be a valuable method for improving the performance of AI-based cancer detection in mammograms.
CLINICAL RELEVANCE/APPLICATION
Improved classification accuracy of machine learning models applied to mammography increases their potential for effective clinical deployment.