RSNA 2014 

Abstract Archives of the RSNA, 2014


SSJ13-06

Determining Imaging Characteristics of KRAS Oncogene Mutations in Colon Cancer Using Word Frequency and Naive Bayes Analysis of Radiology Reports

Scientific Papers

Presented on December 2, 2014
Presented as part of SSJ13: Informatics (Business Analytics)

Participants

Siddharth Govindan MD, Presenter: Nothing to Disclose
Quanzheng Li PhD, Abstract Co-Author: Nothing to Disclose
Suvranu Ganguli MD, Abstract Co-Author: Research Grant, Merit Medical Systems, Inc; Consultant, Boston Scientific Corporation
Thomas Gregory Walker MD, Abstract Co-Author: Nothing to Disclose
Rahmi Oklu MD, PhD, Abstract Co-Author: Nothing to Disclose

PURPOSE

To apply word frequency analysis and a naive Bayes classifier to radiology reports in order to extract imaging descriptors that distinguish wild-type colon cancer patients from those with KRAS mutations.

METHOD AND MATERIALS

In this IRB-approved study, we compiled a SNaPshot mutation analysis dataset from 457 colon adenocarcinoma patients between March 2009 and December 2012. From this cohort, we analyzed the radiology reports (>32,000 reports) of 299 patients who either were wild type (147 patients) or had a KRAS mutation (152 patients). We wrote a computer program to determine the frequency of words within the wild-type and mutant group radiology reports and, using a naive Bayes classifier, determined the probability of a given word belonging to either group.
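The abstract does not give implementation details, so the following Python sketch is only one plausible reading of the per-word analysis described above. The tokenizer, the Laplace smoothing constant `alpha`, the equal default prior, and the function names are all assumptions, and the four toy report strings are invented for illustration; they are not study data.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a report and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def word_counts(reports):
    """Count how often each word occurs across all reports of one group."""
    counts = Counter()
    for report in reports:
        counts.update(tokenize(report))
    return counts


def posterior_kras(word, kras_counts, wt_counts, prior_kras=0.5, alpha=1.0):
    """Estimate P(KRAS group | word) with Bayes' rule.

    P(word | group) is a Laplace-smoothed relative frequency. The prior and
    the smoothing constant are assumptions, not values from the abstract.
    """
    vocab_size = len(set(kras_counts) | set(wt_counts))
    p_word_kras = (kras_counts[word] + alpha) / (
        sum(kras_counts.values()) + alpha * vocab_size
    )
    p_word_wt = (wt_counts[word] + alpha) / (
        sum(wt_counts.values()) + alpha * vocab_size
    )
    numerator = p_word_kras * prior_kras
    return numerator / (numerator + p_word_wt * (1.0 - prior_kras))


# Invented toy reports, for illustration only.
kras_reports = ["Innumerable confluent hepatic lesions.", "Numerous lesions."]
wt_reports = ["Few discrete lesions.", "No recurrent disease."]

kras_counts = word_counts(kras_reports)
wt_counts = word_counts(wt_reports)

for w in ("innumerable", "few", "lesions"):
    print(w, round(posterior_kras(w, kras_counts, wt_counts), 3))
```

A prior based on the cohort sizes reported above (152 of 299 patients with a KRAS mutation) could be passed as `prior_kras=152 / 299` instead of the equal default.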

RESULTS

Words that had a greater than 50% chance (range 56-58%) of being in the KRAS mutation group and the highest absolute probability difference compared to the wild-type group included: “several”, “innumerable”, “confluent”, and “numerous.” In contrast, words that had a greater than 50% chance (range 58-61%) of being in the wild-type group and the highest absolute probability difference included: “few”, “discrete”, and “[no] recurrent.”
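The selection step reported here (a group probability above 50%, then ranking by absolute probability difference) might be sketched in Python as follows. The abstract does not define the "absolute probability difference" precisely; this sketch assumes it is the absolute difference between a word's relative frequency in the two groups. The function name `rank_words` and all numbers in the toy dictionaries are hypothetical and are not results from the study.

```python
def rank_words(freq_kras, freq_wt, posterior_kras, top_n=3):
    """Split words by which group they favor, then rank each side.

    freq_kras / freq_wt: word -> relative frequency within that group.
    posterior_kras: word -> P(KRAS group | word).
    Ranking key (an assumption): |freq_kras[word] - freq_wt[word]|.
    """
    kras_side, wt_side = [], []
    for word, p in posterior_kras.items():
        diff = abs(freq_kras.get(word, 0.0) - freq_wt.get(word, 0.0))
        if p > 0.5:
            kras_side.append((diff, word))
        elif p < 0.5:
            wt_side.append((diff, word))
        # Words at exactly 0.5 favor neither group and are skipped.
    kras_side.sort(reverse=True)
    wt_side.sort(reverse=True)
    return (
        [word for _, word in kras_side[:top_n]],
        [word for _, word in wt_side[:top_n]],
    )


# Invented toy values, for illustration only.
toy_freq_kras = {"innumerable": 0.004, "few": 0.001, "liver": 0.010}
toy_freq_wt = {"innumerable": 0.001, "few": 0.005, "liver": 0.010}
toy_posterior = {"innumerable": 0.8, "few": 0.17, "liver": 0.5}

kras_words, wt_words = rank_words(toy_freq_kras, toy_freq_wt, toy_posterior)
print(kras_words, wt_words)
```

Ranking by frequency difference rather than by posterior alone keeps rare words, whose posteriors can be extreme on very few occurrences, from dominating the list.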

CONCLUSION

Words used in radiology reports, which have direct implications for disease course, tumor burden, and therapy, appear with differing frequency in patients with KRAS mutations versus wild-type colon adenocarcinoma. More importantly, the study suggests that there are likely characteristic imaging traits of mutant tumors.

CLINICAL RELEVANCE/APPLICATION

Probabilistic word analysis may be useful in identifying unique characteristics and disease course associated with mutated oncogenes. This type of analysis may be applied to radiology reports as well as other types of clinical notes.

Cite This Abstract

Govindan, S, Li, Q, Ganguli, S, Walker, T, Oklu, R, Determining Imaging Characteristics of KRAS Oncogene Mutations in Colon Cancer Using Word Frequency and Naive Bayes Analysis of Radiology Reports. Radiological Society of North America 2014 Scientific Assembly and Annual Meeting, Chicago IL. http://archive.rsna.org/2014/14015516.html