Abstract Archives of the RSNA, 2014
SSQ11-04
EXTraction of Numerical Data (EXTND): A Novel Tool to EXTEND Clinical Radiology Research Using Automated Numerical Data Collection
Scientific Papers
Presented on December 4, 2014
Presented as part of SSQ11: Informatics (Results and Reporting)
Tianrun Cai MD, Presenter: Nothing to Disclose
Kanako Kunishima Kumamaru MD, PhD, Abstract Co-Author: Nothing to Disclose
Amir Imanzadeh MD, Abstract Co-Author: Nothing to Disclose
Elizabeth George MD, Abstract Co-Author: Nothing to Disclose
Ruth M. Dunne MBBCh, Abstract Co-Author: Nothing to Disclose
Frank John Rybicki MD, PhD, Abstract Co-Author: Research Grant, Toshiba Corporation
Carlos J. Gonzalez Quesada MD, Abstract Co-Author: Nothing to Disclose
Zoha Hussain, Abstract Co-Author: Nothing to Disclose
Andetta Rotilla Hunsaker MD, Abstract Co-Author: Nothing to Disclose
Arash Bedayat MD, Abstract Co-Author: Nothing to Disclose
Rani S. Sewatkar MBBS, Abstract Co-Author: Nothing to Disclose
Numerical data (eg, blood pressure, heart rate) recorded in the Electronic Medical Record (EMR) are important in radiology clinical outcomes research. The purpose of this study was to develop and validate EXTND, a novel tool that automatically collects important numerical data by processing medical reports.
Software design
EXTND was written in-house using Python. Pattern matching, word segmentation, and lexical analysis were the main technologies used.
1: Standardize report format
2: Build a list of abbreviations using the Unified Medical Language System (UMLS)
3: Process each medical report using modules of the Natural Language Toolkit (NLTK) and search for relevant key words
4: Extract the numerical data that follow the key words and pass them to a set of functions that test validity in terms of normal ranges, value structures, and units
5: Collect validated numerical values
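
Steps 1 and 3 through 5 can be illustrated with a minimal sketch. This is not the EXTND source code: it swaps in plain regular expressions for the Natural Language Toolkit so that it runs with the standard library alone, and the key word lists, plausibility ranges, and function name are hypothetical values chosen for illustration, not taken from UMLS or from the abstract.

```python
import re

# Hypothetical key word list; the original tool built its abbreviations from UMLS.
KEYWORDS = {
    "heart_rate": ["heart rate", "hr", "pulse"],
    "temperature": ["temperature", "temp"],
    "oxygen_saturation": ["oxygen saturation", "o2 sat", "spo2"],
}

# Assumed plausibility ranges for validity testing (illustrative only).
VALID_RANGES = {
    "heart_rate": (20, 300),          # beats per minute
    "temperature": (85.0, 110.0),     # degrees Fahrenheit
    "oxygen_saturation": (50, 100),   # percent
}


def extract_values(report, parameter):
    """Return the validated numbers that follow a key word for `parameter`."""
    # Step 1: standardize the format (lower case, single spaces).
    text = " ".join(report.lower().split())
    low, high = VALID_RANGES[parameter]
    found = []
    for keyword in KEYWORDS[parameter]:
        # Step 3: find the key word; step 4: take the number that follows it,
        # allowing up to 10 non-digit characters in between.
        pattern = r"\b" + re.escape(keyword) + r"\b[^0-9]{0,10}?(\d+(?:\.\d+)?)"
        for match in re.finditer(pattern, text):
            value = float(match.group(1))
            # Step 4 (continued): reject values outside the plausible range.
            if low <= value <= high:
                found.append(value)
    # Step 5: the validated values are collected and returned.
    return found


report = "Vitals: HR 88, Temp 98.6 F, SpO2 97% on room air."
print(extract_values(report, "heart_rate"))
```

A real implementation would also need the value-structure and unit checks mentioned in step 4 (for example, the systolic/diastolic form of blood pressure), which this sketch omits.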
Software application
A total of 69,406 free-text medical records in the hospital EMR database were evaluated using EXTND. The records belonged to 2070 consecutive patients (08/2003-05/2010) with acute pulmonary embolism diagnosed by CT pulmonary angiography at a single, large teaching hospital. Heart rate, blood pressure, temperature, respiratory rate, and oxygen saturation measured at the time closest to the CT acquisition were collected for all patients.
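
Selecting the measurement closest in time to the CT acquisition can be sketched as follows. The abstract does not describe how this selection was implemented; the function name and the (timestamp, value) tuple layout are assumptions made for illustration.

```python
from datetime import datetime


def closest_measurement(measurements, ct_time):
    """Return the (timestamp, value) pair nearest in time to ct_time, or None.

    `measurements` is assumed to be a list of (datetime, value) tuples.
    """
    if not measurements:
        return None
    # Compare absolute time differences so readings before and after the
    # CT acquisition are treated alike.
    return min(measurements, key=lambda m: abs(m[0] - ct_time))
```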
Software validation
Manual review of 285 documents (from the 69,406 above) from a randomly selected sub-cohort of 149 patients was performed. The accuracy of EXTND was assessed using the manual EMR review as the reference standard.
EXTND acquired the data elements for all 2070 patients. Using the manual data as the reference standard, the positive predictive value (PPV) and sensitivity (with standard errors in parentheses) were as follows:
Parameter           PPV (SE)        Sensitivity (SE)
Heart rate          0.953 (0.016)   0.970 (0.013)
Blood pressure      0.911 (0.022)   1.000 (0)
Temperature         0.942 (0.021)   0.991 (0.009)
Respiratory rate    0.988 (0.021)   0.991 (0.009)
Oxygen saturation   0.938 (0.018)   0.943 (0.008)
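
For readers who want to reproduce metrics of this kind, the sketch below computes PPV and sensitivity from true positive, false positive, and false negative counts, each with a binomial standard error, sqrt(p(1 - p)/n). The abstract does not report the underlying counts or say how the standard errors were computed, so both the formula choice and the function names are assumptions.

```python
import math


def proportion_with_se(successes, total):
    """Return (p, se) for a proportion, using the binomial standard error."""
    p = successes / total
    se = math.sqrt(p * (1 - p) / total)
    return p, se


def ppv_and_sensitivity(tp, fp, fn):
    """Return ((ppv, se), (sensitivity, se)) from extraction counts.

    tp: values extracted correctly; fp: values extracted but wrong;
    fn: values present in the reference standard but missed.
    """
    ppv = proportion_with_se(tp, tp + fp)
    sensitivity = proportion_with_se(tp, tp + fn)
    return ppv, sensitivity
```

Under this formula a proportion of exactly 1 has a standard error of 0, which is consistent with the blood pressure sensitivity of 1.000 (0) in the table.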
EXTND is a novel tool with high accuracy in acquiring clinical numerical parameters that are important in pulmonary embolism outcomes research.
EXTraction of Numerical Data (EXTND), which was developed to extract key numerical metrics for pulmonary embolism research, can potentially be applied to other clinical radiology research.
Cai, T, Kumamaru, K, Imanzadeh, A, George, E, Dunne, R, Rybicki, F, Gonzalez Quesada, C, Hussain, Z, Hunsaker, A, Bedayat, A, Sewatkar, R, EXTraction of Numerical Data (EXTND): A Novel Tool to EXTEND Clinical Radiology Research Using Automated Numerical Data Collection. Radiological Society of North America 2014 Scientific Assembly and Annual Meeting, Chicago, IL.
http://archive.rsna.org/2014/14004027.html