Automatic Classification and Reporting of Multiple Common Thorax Diseases Using Chest Radiographs

Sunday, Nov. 25 11:55AM - 12:05PM Room: S406B

Xiaosong Wang, PhD, Bethesda, MD (Presenter) Nothing to Disclose
Yifan Peng, Bethesda, MD (Abstract Co-Author) Nothing to Disclose
Le Lu, PhD, Bethesda, MD (Abstract Co-Author) Nothing to Disclose
Zhiyong Lu, Bethesda, MD (Abstract Co-Author) Nothing to Disclose
Ronald M. Summers, MD, PhD, Bethesda, MD (Abstract Co-Author) Royalties, iCAD, Inc; Royalties, Koninklijke Philips NV; Royalties, ScanMed, LLC; Research support, Ping An Insurance Company of China, Ltd; Researcher, Carestream Health, Inc; Research support, NVIDIA Corporation


The chest radiograph is one of the most common radiological exams in daily clinical routine. Reporting thorax diseases from chest radiographs is often an entry-level task for radiologist trainees, but it remains a challenging job for learning-oriented machine intelligence. This difficulty stems from the shortage of large-scale, well-annotated medical image datasets and the lack of techniques that can mimic the high-level reasoning of human radiologists. In this work, we show that clinical free-text radiological reports can be utilized as a priori knowledge to tackle both problems.


We used a hospital-scale chest radiograph dataset consisting of 112,120 frontal-view radiographs of 30,805 patients. Fourteen disease labels observed in the images were mined from the associated reports using natural language processing techniques: atelectasis, cardiomegaly, effusion, infiltrate, mass, nodule, pneumonia, pneumothorax, consolidation, edema, emphysema, fibrosis, pleural thickening, and hernia. We propose a novel text-image embedding neural network (illustrated in the attached figure) for extracting distinctive image and text representations. Multilevel attention models are integrated into an end-to-end trainable architecture to highlight the meaningful text words and image regions. We first apply this combined convolutional and recurrent neural network (CNN-RNN) to classify each image using both image features and text embeddings from the associated report. Furthermore, we transform the framework into a radiograph reporting system by taking only images as input and turning the RNN into a generative model.
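The core idea of the attention-based text-image embedding can be sketched as follows. This is a minimal, hypothetical illustration in numpy, not the authors' actual implementation: the dimensions, the single learned query vector, and the random projection matrix are all placeholder assumptions standing in for the trained CNN backbone, RNN encoder, and multilevel attention modules described above.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax for attention weights.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(features, query):
    """Weight a set of feature vectors by their relevance to a query,
    then pool them into a single vector (soft attention)."""
    scores = features @ query            # (n,) relevance scores
    weights = softmax(scores)            # (n,) attention distribution
    return weights @ features, weights   # pooled (d,) vector, weights

rng = np.random.default_rng(0)
d = 16                                        # embedding size (hypothetical)
image_regions = rng.normal(size=(49, d))      # e.g. a flattened 7x7 CNN feature map
word_states = rng.normal(size=(20, d))        # RNN hidden states for 20 report words
query = rng.normal(size=d)                    # learned attention query (placeholder)

# Attend over image regions and report words separately.
img_vec, img_attn = attention_pool(image_regions, query)
txt_vec, txt_attn = attention_pool(word_states, query)

# Joint text-image embedding feeds a multi-label classifier
# over the 14 disease labels (sigmoid per label).
joint = np.concatenate([img_vec, txt_vec])
W = rng.normal(size=(14, 2 * d))              # placeholder classifier weights
probs = 1.0 / (1.0 + np.exp(-(W @ joint)))    # per-disease probabilities
```

The attention weights (`img_attn`, `txt_attn`) are what allow the system to highlight salient image regions and report words; in the reporting mode described above, the text branch is replaced by a generative RNN conditioned on the image embedding.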


The proposed framework achieves high accuracy (0.96 ± 0.03 in AUC) in disease classification using both images and reports on an unseen and hand-labeled dataset (OpenI, 3,643 images). When using only the images as input, the system also produces significantly improved results (0.80 ± 0.07 in AUC) compared to the state-of-the-art (0.74 ± 0.08), with a p-value of 0.0005. The figure shows sample classification results with generated reports (attended words in red).


We present a framework for the fully automated classification and reporting of common thorax diseases in chest radiographs and demonstrate its superior performance compared to the state-of-the-art.


The proposed multi-purpose CADx system can be applied to the automatic classification and reporting of common thoracic diseases, serving as a second opinion for radiologists.