RSNA 2013 

Abstract Archives of the RSNA, 2013


LL-INE3174-TUA

Automatic Extraction of Patient Characteristics from Clinical Reports

Education Exhibits

Presented on December 3, 2013
Presented as part of LL-INS-TUA: Informatics - Tuesday Posters and Exhibits (12:15pm - 12:45pm)

Participants

Jean Garcia-Gathright, Presenter: Nothing to Disclose
Corey W. Arnold, Abstract Co-Author: Nothing to Disclose
Alex Anh-Tuan Bui MS, PhD, Abstract Co-Author: Nothing to Disclose

BACKGROUND

The extraction of specific data elements from unstructured free-text documents is a critical task for a range of clinical and research activities, including data mining and disease registry construction. To enable such applications in imaging-based domains, we have developed a set of natural language processing (NLP) annotators that automatically extract patient characteristics and populate a database with them. The use of this framework is demonstrated for lung cancer screening.

EVALUATION

Our input corpus comprises the entire set of medical reports for patients who have undergone a biopsy of an indeterminate lung nodule. We targeted several data elements, including location of tumor, biopsy results, family history of cancer, and smoking history. Extraction performance was evaluated against a manually annotated gold standard of 112 cases. Precision and recall were as high as 95% for certain data elements, such as location of tumor.
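As an illustration of how precision and recall can be computed for a single data element against a gold standard, the following Python sketch may help. The function name, the case identifiers, and the example values are hypothetical and are not taken from the study; exact string matching is an assumption, and the actual evaluation may have used a different matching criterion.

```python
def precision_recall(extracted, gold):
    """Compute precision and recall for one data element.

    Both arguments map a case id to the value recorded for that case.
    Cases for which no value was found are simply absent from the dict.
    """
    # A true positive is an extracted value that exactly matches the gold value.
    true_pos = sum(1 for case, value in extracted.items() if gold.get(case) == value)
    precision = true_pos / len(extracted) if extracted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not the 112-case gold standard described above.
gold = {
    "case1": "right upper lobe",
    "case2": "left lower lobe",
    "case3": "right middle lobe",
    "case4": "left upper lobe",
}
extracted = {
    "case1": "right upper lobe",
    "case2": "left lower lobe",
    "case3": "right lower lobe",  # wrong value: counts against precision and recall
}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example, two of three extracted values are correct and two of four gold values are recovered, so precision is about 0.67 and recall is 0.50.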

DISCUSSION

An investigation of the input corpus revealed that most of the data elements of interest were found in radiology reports, pathology reports, and oncology consultations. We found that rule-based logic was sufficient for very good annotation performance. Our framework was implemented in Apache UIMA (Unstructured Information Management Architecture) and includes mechanisms for database querying, section detection, and information extraction based on regular expressions.
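The combination of section detection and regular-expression extraction described above can be sketched roughly as follows. This is a minimal Python illustration and not the authors' UIMA annotators: the section headers, the patterns, the field names, and the sample report are all assumptions made for the example, and real reports would need far more robust rules.

```python
import re

# Hypothetical section headers; real report layouts vary by institution.
SECTION_HEADER = re.compile(
    r"^(SOCIAL HISTORY|FAMILY HISTORY|FINDINGS|IMPRESSION):\s*", re.MULTILINE
)

# Hypothetical patterns for two of the targeted data elements.
LOBE = re.compile(r"\b(right|left)\s+(upper|middle|lower)\s+lobe\b", re.IGNORECASE)
PACK_YEARS = re.compile(r"\b(\d+)[\s-]*pack[\s-]*years?\b", re.IGNORECASE)


def split_sections(report):
    """Return {header: body text} for each recognized section header."""
    sections = {}
    matches = list(SECTION_HEADER.finditer(report))
    for i, match in enumerate(matches):
        # A section body runs until the next header or the end of the report.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report)
        sections[match.group(1)] = report[match.end():end].strip()
    return sections


def extract(report):
    """Apply each pattern only within the section where it is expected."""
    sections = split_sections(report)
    result = {}
    lobe = LOBE.search(sections.get("FINDINGS", ""))
    if lobe:
        result["tumor_location"] = " ".join(lobe.groups()).lower() + " lobe"
    packs = PACK_YEARS.search(sections.get("SOCIAL HISTORY", ""))
    if packs:
        result["pack_years"] = int(packs.group(1))
    return result


# Invented sample text for illustration only.
report = """SOCIAL HISTORY: Former smoker, 30 pack-year history.
FAMILY HISTORY: Father with lung cancer.
FINDINGS: 1.4 cm nodule in the Right Upper lobe.
IMPRESSION: Indeterminate nodule; biopsy recommended.
"""

fields = extract(report)
print(fields)
```

Restricting each pattern to its expected section is one plausible reason section detection helps: a lobe mentioned under family history, for instance, would not be mistaken for the patient's own tumor location.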

CONCLUSION

The successful implementation of these annotators represents an important step in the analysis of unstructured clinical documents. The rules and regular expressions we have developed can be used to advance structured reporting templates and other free-text-based analyses. Future work also includes the implementation of interactive web-based visualizations of the extracted data to support integrated radiology/pathology reporting and tumor board meetings.

Cite This Abstract

Garcia-Gathright, J, Arnold, C, Bui, A, Automatic Extraction of Patient Characteristics from Clinical Reports. Radiological Society of North America 2013 Scientific Assembly and Annual Meeting, December 1 - December 6, 2013, Chicago IL. http://archive.rsna.org/2013/13016206.html