OBJECTIVE: The Clinical Document Architecture (CDA) is a standard for exchanging and sharing clinical documents among entities in the healthcare domain. As adoption spreads, the number of CDA documents will grow rapidly, and storing them will require large amounts of storage. The main goal of this study is to devise an efficient compression method optimized for CDA documents so that the storage requirement can be lowered.
METHODS: The proposed method builds on Xmill, a compressor designed for XML documents in general. Xmill requires human intervention to compress documents effectively, CDA documents in particular. Our method, CDACOM, automatically extracts type information from CDA documents to infer the data type of each value, assigns values of the same type to the same data container, and applies an encoder optimized for that type to each container, so that a better compression rate can be achieved.
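The container idea described in METHODS can be illustrated with a minimal sketch. This is not the CDACOM implementation: the regex-based `infer_type`, the three type names, the binary packing, the use of `zlib`, and the `SAMPLE` fragment are all assumptions made for illustration. The sketch also handles only data values and omits the structure information a real compressor would need to rebuild the document.

```python
import re
import struct
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical type rules; the real method derives types from the CDA document.
DATE_RE = re.compile(r"^[0-9]{8}$")             # e.g. YYYYMMDD timestamps
INT_RE = re.compile(r"^(0|[1-9][0-9]{0,17})$")  # fits in an unsigned 64-bit field
PACK_FORMAT = {"date": ">I", "int": ">Q"}       # assumed fixed-width encodings


def infer_type(value):
    """Guess a coarse data type for one value (date is checked before int)."""
    if DATE_RE.match(value):
        return "date"
    if INT_RE.match(value):
        return "int"
    return "text"


def build_containers(xml_text):
    """Group attribute values and element text by inferred type, in document order."""
    containers = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        values = list(elem.attrib.values())
        if elem.text and elem.text.strip():
            values.append(elem.text.strip())
        for value in values:
            if value:  # empty values are skipped in this sketch
                containers[infer_type(value)].append(value)
    return containers


def encode_container(kind, values):
    """Apply a type-specific encoding, then a general-purpose compressor."""
    if kind in PACK_FORMAT:
        raw = b"".join(struct.pack(PACK_FORMAT[kind], int(v)) for v in values)
    else:
        raw = "\0".join(values).encode("utf-8")  # NUL cannot occur in XML text
    return zlib.compress(raw, 9)


def decode_container(kind, blob):
    """Invert encode_container for one container."""
    raw = zlib.decompress(blob)
    if kind in PACK_FORMAT:
        width = 8 if kind == "date" else 0  # restore leading zeros in dates
        return [str(n).zfill(width)
                for (n,) in struct.iter_unpack(PACK_FORMAT[kind], raw)]
    return raw.decode("utf-8").split("\0") if raw else []


# Hypothetical CDA-like fragment, not taken from the study's data set.
SAMPLE = """<ClinicalDocument>
  <effectiveTime value="20120315"/>
  <patient><name>Jane Doe</name><birthTime value="19800101"/></patient>
  <observation><value unit="mg/dL">105</value></observation>
</ClinicalDocument>"""

if __name__ == "__main__":
    for kind, values in build_containers(SAMPLE).items():
        blob = encode_container(kind, values)
        print(kind, values, len(blob), "bytes")
```

Grouping values of one type together lets each container use a compact representation (here, fixed-width integers for dates and numbers) before the general-purpose compressor runs, which is the intuition behind container-based XML compression.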
RESULTS: Experiments on various types of CDA documents were performed to evaluate the effectiveness of CDACOM relative to Xmill. CDACOM outperformed Xmill, reducing the output file size by about 24.1% on average. When documents were combined and compressed together, the gap widened to about 50%.
CONCLUSION: The proposed compression method, CDACOM, is effective and promising. It should help lower the cost of transmitting and storing CDA documents and thereby expedite adoption of the standard in the healthcare domain.