Authors: Berry de Bruijn, Colin Cherry, Svetlana Kiritchenko, Joel Martin, Xiaodan Zhu
Where published: JAMIA 2011 5
Abstract URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168309/
Objective: As clinical text mining continues to mature, its
potential as an enabling technology for innovations in
patient care and clinical research is becoming a reality. A
critical part of that process is rigorous benchmark testing of
natural language processing methods on realistic clinical
narrative. In this paper, the authors describe the design
and performance of three state-of-the-art text-mining
applications from the National Research Council of
Canada on evaluations within the 2010 i2b2 challenge.
Design: The three systems perform three key steps in
clinical information extraction: (1) extraction of medical
problems, tests, and treatments from discharge
summaries and progress notes; (2) classification of
assertions made on the medical problems; (3)
classification of relations between medical concepts.
Machine learning systems performed these tasks using
high-dimensional bags of features, derived from both
the text itself and from external sources: UMLS, cTAKES,
and Medline.
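To make the idea of a sparse bag of features concrete, the following is a minimal sketch and not the authors' actual feature set. The function name token_features, the feature templates, and the toy lexicon (a hypothetical stand-in for an external source such as UMLS) are all assumptions made for illustration.

```python
def token_features(tokens, i, lexicon=None):
    """Return the set of active (binary) features for tokens[i].

    Only features that fire are stored, so the representation stays
    sparse even when the overall feature space is very large.
    `lexicon` is a hypothetical mapping from a lowercased word to a
    list of category names, standing in for an external resource.
    """
    lexicon = lexicon or {}
    word = tokens[i].lower()

    # Features derived from the text itself.
    feats = {f"word={word}", f"suffix3={word[-3:]}"}
    prev_word = tokens[i - 1].lower() if i > 0 else "<s>"
    next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
    feats.add(f"prev={prev_word}")
    feats.add(f"next={next_word}")

    # Features derived from the (hypothetical) external lexicon.
    for category in lexicon.get(word, ()):
        feats.add(f"lexicon={category}")
    return feats


if __name__ == "__main__":
    toy_lexicon = {"pain": ["sign_or_symptom"]}  # illustrative only
    print(sorted(token_features(["chest", "pain", "resolved"], 1, toy_lexicon)))
```

Each token yields only a handful of active features, which is one common way to keep very large feature spaces tractable for a linear learner.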
Measurements: Performance was measured per
subtask, using micro-averaged F-scores, as calculated by
comparing system annotations with ground-truth
annotations on a test set.
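As an illustration of micro-averaging, the sketch below pools true positives, false positives, and false negatives over all classes before computing precision, recall, and F. It assumes exact-match comparison of annotations represented as tuples; the function name micro_f_score and the toy data are hypothetical, and the challenge's official scoring rules may differ in detail.

```python
def micro_f_score(gold, predicted):
    """Micro-averaged F-score over all classes.

    `gold` and `predicted` map a class name to a set of annotations
    (here, hypothetical (line, start, end) tuples). Counts are pooled
    across classes before precision and recall are computed.
    """
    tp = fp = fn = 0
    for label in set(gold) | set(predicted):
        g = gold.get(label, set())
        p = predicted.get(label, set())
        tp += len(g & p)   # annotations found in both
        fp += len(p - g)   # system-only annotations
        fn += len(g - p)   # missed gold annotations

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = {
        "problem": {(1, 0, 2), (1, 5, 6)},
        "test": {(2, 3, 3)},
        "treatment": {(3, 1, 2)},
    }
    predicted = {
        "problem": {(1, 0, 2)},
        "test": {(2, 3, 3), (2, 7, 8)},
        "treatment": {(3, 1, 2)},
    }
    print(micro_f_score(gold, predicted))  # 3 TP, 1 FP, 1 FN -> 0.75
```

Because counts are pooled, frequent classes weigh more heavily in the final score than rare ones, in contrast to macro-averaging, which averages per-class F-scores.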
Results: The systems ranked high among all submitted
systems in the competition, with the following F-scores:
concept extraction 0.8523 (ranked first); assertion
detection 0.9362 (ranked first); relationship detection
0.7313 (ranked second).
Conclusion: For all tasks, we found that the introduction
of a wide range of features was crucial to success.
Importantly, our choice of machine learning algorithms
allowed us to be versatile in our feature design, and to
introduce a large number of features without overfitting
and without encountering computing-resource
bottlenecks.