OBJECTIVE: Applications that extract medical information from electronic medical records (EMRs) face serious obstacles such as spelling errors, ambiguous abbreviations, and unrecognizable words. These obstacles hinder the identification of medical entities, relations, and events. We present an efficient EMR refinement system designed to support medical information extraction from EMRs, rather than traditional text error correction alone.
METHODS: The EMR refinement system was designed and implemented in the following steps: 1) build a domain-constrained dictionary database, 2) correct spelling errors in Korean-English EMR documents, and 3) resolve ambiguous abbreviations in the bilingual documents. The resulting EMR documents are machine readable and can be used in various applications, including information extraction.
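The three steps above can be illustrated with a minimal sketch. It is not the system described here: the dictionary entries, the abbreviation senses, the cue words, the function names, and the similarity cutoff are all hypothetical, and the sketch handles English tokens only, using fuzzy string matching from Python's standard `difflib` module in place of whatever correction method the system actually uses.

```python
from difflib import get_close_matches

# Step 1 (assumed): a tiny stand-in for the domain-constrained dictionary.
DOMAIN_TERMS = ["pneumonia", "hypertension", "diabetes", "fever"]

# Step 3 (assumed): each abbreviation maps candidate senses to cue words
# that tend to appear nearby in the record.
ABBREVIATIONS = {
    "cp": {
        "chest pain": {"angina", "cardiac", "dyspnea"},
        "cerebral palsy": {"spastic", "gait", "pediatric"},
    }
}


def correct_token(token, vocabulary=DOMAIN_TERMS, cutoff=0.8):
    """Step 2: replace a token with its closest dictionary term, if any."""
    lower = token.lower()
    if lower in vocabulary or lower in ABBREVIATIONS:
        return lower
    matches = get_close_matches(lower, vocabulary, n=1, cutoff=cutoff)
    # Leave the token untouched when nothing in the dictionary is close enough.
    return matches[0] if matches else token


def expand_abbreviation(token, context):
    """Step 3: pick the sense whose cue words overlap most with the context."""
    senses = ABBREVIATIONS.get(token.lower())
    if not senses:
        return token
    context_words = {word.lower() for word in context}
    best = max(senses, key=lambda sense: len(senses[sense] & context_words))
    # Without any supporting cue word, keep the abbreviation unresolved.
    return best if senses[best] & context_words else token


def refine(text):
    """Run spelling correction, then abbreviation resolution, on one line."""
    corrected = [correct_token(token) for token in text.split()]
    return " ".join(expand_abbreviation(token, corrected) for token in corrected)
```

For example, `refine("pnemonia CP angina")` corrects the misspelled term and resolves `CP` from the cue word `angina`. Leaving a token unchanged when no dictionary entry or cue word supports a change is a design choice of this sketch that favors precision over coverage.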
RESULT: The precision of the refinement system is 80.4% for spelling error correction and 94.7% for disambiguating abbreviations/acronyms.
CONCLUSION: We developed an EMR refinement system that corrects spelling errors and resolves ambiguous abbreviations as well as unrecognizable words. Our system can enhance the reliability of medical records and contribute to the development of further application systems in the fields of text mining and information extraction.