Abstract Archives of the RSNA, 2011
Automated Capture of Pulmonary Embolism Spatial Location in Dictated Reports Using the ConText Algorithm
Presented on November 27, 2011
Richard Wilson MS, Presenter: Nothing to Disclose
Brian E Chapman PhD, Abstract Co-Author: Nothing to Disclose
Automated systems that rely on clinical records are routinely challenged by the imperfections, ambiguities, and complexity of human language. This pilot study aims to refine an algorithm that automatically extracts spatial location descriptions of positively identified pulmonary embolism (PE) findings from free-text reports. The study is motivated by our overarching project, which seeks to improve image search through disease- and similarity-driven models.
To automatically extract the spatial locations of PE findings, we used natural language processing techniques and a modification of an existing algorithm, ConText (developed by the co-author).
Our data set included 200 de-identified CT PE studies. The impression section of each of the 200 exams was evaluated by ConText, modified to also extract spatial locations. This pilot study focused on identifying lung locations such as ‘left superior lobe’ or ‘bilateral lobes.’
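As an illustration only, the kind of regular-expression matching described above can be sketched as follows. The pattern, the vocabulary lists, and the function name `extract_locations` are assumptions made for this sketch; they are not the expressions used in ConText, and the sketch does no negation handling or sentence splitting.

```python
import re

# Hypothetical vocabulary for this sketch; not taken from ConText.
LATERALITY = r"(?:left|right|bilateral)"
LEVEL = r"(?:upper|middle|lower|superior|inferior)"

# Laterality, an optional level qualifier, then a lobe or lung term.
LOCATION_RE = re.compile(
    rf"\b{LATERALITY}(?:\s+{LEVEL})?\s+(?:lobes?|lungs?)\b",
    re.IGNORECASE,
)


def extract_locations(sentence):
    """Return the lowercased location phrases found in one sentence."""
    return [m.group(0).lower() for m in LOCATION_RE.finditer(sentence)]


if __name__ == "__main__":
    print(extract_locations("Pulmonary embolism in the left superior lobe."))
    print(extract_locations("Emboli are seen in bilateral lobes."))
```

A pattern this rigid misses any phrasing outside its vocabulary and can capture only part of a longer description.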
Of the 200 reports, there were 129 instances of non-negated PEs. These 129 system responses were graded for accuracy, precision, recall, and F-measure. Pilot study results were as follows: overall accuracy, 0.577; precision, 0.651; recall, 0.444; and F-measure, 0.528. While initial results are low, lessons learned from this pilot run are encouraging.
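The relationship between these metrics can be checked with a short sketch. The F-measure is the harmonic mean of precision and recall, so the reported precision (0.651) and recall (0.444) imply an F-measure of about 0.528. The true-positive count of 28 used below is not stated in the abstract; it is an assumption inferred from the reported precision and recall together with the 15 false positives and 35 false negatives given in the error analysis.

```python
def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def precision_recall(tp, fp, fn):
    """Precision and recall from raw counts."""
    return tp / (tp + fp), tp / (tp + fn)


if __name__ == "__main__":
    # From the reported (rounded) precision and recall.
    print(round(f_measure(0.651, 0.444), 3))

    # From counts: fp=15 and fn=35 are reported; tp=28 is an inferred assumption.
    p, r = precision_recall(tp=28, fp=15, fn=35)
    print(round(p, 3), round(r, 3), round(f_measure(p, r), 3))
```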
Error analysis reveals key areas for algorithm improvement. Thirty-five of the 129 responses were false negatives, primarily due to regular expression or sentence-splitting failures and to spatial description phrases that were not represented in the development set. Of the 15 false positives, most stemmed not from a failure to capture spatial data but from a failure to capture the complete description because of complex sentence structure. Most of these errors can easily be corrected with regular expression modifications in ConText, which we expect to yield near-perfect results.
This pilot study demonstrated the ability of an automated system to extract spatial locations of pulmonary embolisms in free-text reports. While initial results were average, the lessons learned can significantly improve the algorithm.
Automated Capture of Pulmonary Embolism Spatial Location in Dictated Reports Using the ConText Algorithm. Radiological Society of North America 2011 Scientific Assembly and Annual Meeting, November 26 - December 2, 2011, Chicago, IL. http://archive.rsna.org/2011/11016603.html