RSNA 2011 

Abstract Archives of the RSNA, 2011


LL-INS-TH7A

What Is My Miss Rate? Automating Quality Assurance by Using Natural Language Processing to Calculate Radiologist Performance from Unstructured Reports in the RIS

Scientific Informal (Poster) Presentations

Presented on December 1, 2011
Presented as part of LL-INS-TH: Informatics

Participants

Bao H. Do MD, Presenter: Nothing to Disclose
Rashad Daker, Abstract Co-Author: Nothing to Disclose
Sandip Biswal MD, Abstract Co-Author: Co-founder, SiteOne Therapeutics Inc; Consultant, General Electric Company; Stockholder, Atreus Pharmaceuticals, Inc

PURPOSE

Practices for evaluating the accuracy of radiology interpretation range from personal record-keeping logs (manual) to group-based structured systems using standard forms (semi-automated). Efforts to close the "interpretation - feedback" loop are essential and even federally mandated (Mammography Quality Standards Act). However, manual systems can be time consuming and do not scale. The purpose of this work is to develop a natural language processor (NLP) that automatically extracts relevant metrics from unstructured reports to calculate radiologist performance statistics. The overall accuracy of radiologists in assessing osteoporosis by x-ray, using DEXA as the gold standard, is selected as a use case.

METHOD AND MATERIALS

We developed and validated an NLP to extract demineralization concepts (osteopenia and osteoporosis) from unstructured x-ray (primary interpretation) and DEXA (outcome data) reports from our institution between 1/1/2005 and 12/31/2008 (548,285 x-ray and 8,776 DEXA reports). The NLP was iteratively trained using 300 DEXA and pelvic x-ray reports to accept an unstructured x-ray or DEXA report as input and automatically classify x-ray reports as normal or abnormal bony mineralization and DEXA reports as normal, osteopenia, or osteoporosis. From these classifications, radiologist performance in assessing osteoporosis on pelvic x-rays, using DEXA as the gold standard, was automatically calculated. A radiologist manually reviewed all NLP interpretations.
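The calculation step described above, comparing each x-ray classification against its DEXA outcome, can be sketched as a simple tally. This is an illustrative sketch only, not the authors' implementation: the function name `tally_confusion`, the label strings, and the rule that either osteopenia or osteoporosis on DEXA counts as abnormal are all assumptions, and the example pairs are invented rather than study data.

```python
# Hypothetical label encodings; the abstract names the classes but not how they are stored.
XRAY_ABNORMAL = "abnormal"
# Assumption: either DEXA finding counts as abnormal mineralization.
DEXA_ABNORMAL = {"osteopenia", "osteoporosis"}


def tally_confusion(pairs):
    """Count TP/FP/TN/FN from (xray_label, dexa_label) pairs, with DEXA as the reference."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for xray_label, dexa_label in pairs:
        predicted = xray_label == XRAY_ABNORMAL   # radiologist called the x-ray abnormal
        actual = dexa_label in DEXA_ABNORMAL      # DEXA confirmed demineralization
        if predicted and actual:
            counts["tp"] += 1
        elif predicted:
            counts["fp"] += 1
        elif actual:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Invented labels for illustration only; these are not study data.
example_pairs = [
    ("abnormal", "osteoporosis"),
    ("normal", "normal"),
    ("normal", "osteopenia"),
    ("abnormal", "normal"),
]
print(tally_confusion(example_pairs))
```

In practice each pair would come from matching a patient's pelvic x-ray report to that patient's DEXA report; how the reports were matched is not stated in the abstract.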

RESULTS

1,910 pelvic x-ray reports were analyzed. The NLP correctly classified all reports (154 true positive, 1,756 true negative). 1,496 DEXA reports were analyzed. The NLP correctly classified 99.9% (1,495 / 1,496) of the reports. The sensitivity, specificity, positive predictive value, negative predictive value, and accuracy for extracting demineralization concepts were 99.9, 100, 100, 99.7, and 99.9%, respectively.
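For reference, the five statistics named above follow from the standard confusion-matrix definitions. The sketch below uses a hypothetical helper, `performance_metrics`, and applies it to the pelvic x-ray classification counts reported above (154 true positive, 1,756 true negative, with zero false positives and false negatives implied by "correctly classified all reports"). The full confusion matrix behind the pooled figures is not given in the abstract, so it is not reconstructed here.

```python
def performance_metrics(tp, fp, tn, fn):
    """Standard diagnostic-accuracy statistics from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Return NaN rather than raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Pelvic x-ray report classification counts as reported in the abstract.
xray = performance_metrics(tp=154, fp=0, tn=1756, fn=0)
for name, value in xray.items():
    print(f"{name}: {value:.1%}")

# DEXA report classification accuracy: 1,495 of 1,496 correct.
dexa_accuracy = 1495 / 1496
print(f"DEXA accuracy: {dexa_accuracy:.1%}")
```

With no misclassified x-ray reports, every statistic for that subset is 100%, and the single DEXA misclassification gives the 99.9% figure quoted above.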

CONCLUSION

We have developed and validated an NLP to automatically calculate a radiologist’s performance for a specific diagnosis (osteoporosis) using unstructured reports. However, applying NLP to calculate quality assurance statistics from unstructured reports in the RIS will be challenging for more lexically and syntactically complex outcome metrics (e.g., pathology reports).

CLINICAL RELEVANCE/APPLICATION

NLP may be useful for estimating radiologists’ performance by automatically extracting data from unstructured reports in the RIS.

Cite This Abstract

Do, B, Daker, R, Biswal, S, What Is My Miss Rate? Automating Quality Assurance by Using Natural Language Processing to Calculate Radiologist Performance from Unstructured Reports in the RIS. Radiological Society of North America 2011 Scientific Assembly and Annual Meeting, November 26 - December 2, 2011, Chicago IL. http://archive.rsna.org/2011/11034450.html