When working with medical data encoded in classification systems such as SNOMED CT or the ICD variants, the primary challenge is converting and standardizing the data. Manual mapping by medical professionals offers the highest quality, but it may not be feasible given the large amounts of data in real-world datasets, in which case an alternative approach is needed. By breaking medical concepts down into their atomic attributes, we can align and map concepts from different classification systems, producing a comprehensive ontology compiled from various sources.
Our solution involves several steps. First, we decompose source concepts to capture their meaning as sets of attributes and relationships. Next, we represent each concept from both the source and target classifications in the same form, as a set of attributes and relationships. We then align concepts between classifications using a common hierarchy of attributes and proxy rules for different case groups. Finally, we create a comprehensive system that encompasses all concepts in a single hierarchy. This approach lets us map between medical classification systems more efficiently and accurately, reducing the need for manual mapping.
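The alignment step can be illustrated with a minimal Python sketch. It is not the implementation described here: the attribute names, the hierarchy, the concept codes, and the helper functions (`ancestors_or_self`, `subsumes`, `specificity`, `best_target`) are all hypothetical, each concept is assumed to carry at most one value per attribute, and the proxy rules for case groups are left out. The sketch only shows the general idea of picking the most specific target concept whose attributes are equal to, or ancestors of, the source concept's attributes in a shared hierarchy.

```python
from dataclasses import dataclass

# Hypothetical shared attribute hierarchy: child value -> parent value.
PARENT = {
    "left kidney": "kidney",
    "kidney": "urinary organ",
    "laparoscopic": "endoscopic",
    "excision": "removal",
}


def ancestors_or_self(value):
    """Return the value followed by its ancestors in the shared hierarchy."""
    chain = [value]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


@dataclass(frozen=True)
class Concept:
    code: str              # made-up identifier, not a real classification code
    attributes: frozenset  # set of (attribute_name, value) pairs


def subsumes(target, source):
    """True if each target attribute equals or generalizes the source's value."""
    src = dict(source.attributes)  # assumes one value per attribute name
    for name, value in target.attributes:
        if name not in src or value not in ancestors_or_self(src[name]):
            return False
    return True


def specificity(concept):
    """Score a concept: more attributes and deeper values score higher."""
    return sum(len(ancestors_or_self(v)) for _, v in concept.attributes)


def best_target(source, targets):
    """Pick the most specific target that subsumes the source, or None."""
    candidates = [t for t in targets if subsumes(t, source)]
    return max(candidates, key=specificity, default=None)


source = Concept("SRC-1", frozenset({
    ("site", "left kidney"), ("method", "excision"), ("approach", "laparoscopic"),
}))
targets = [
    Concept("TGT-A", frozenset({("site", "kidney"), ("method", "removal")})),
    Concept("TGT-B", frozenset({
        ("site", "kidney"), ("method", "excision"), ("approach", "endoscopic"),
    })),
    Concept("TGT-C", frozenset({
        ("site", "left kidney"), ("method", "excision"), ("approach", "open"),
    })),
]

match = best_target(source, targets)
print(match.code if match else "no target found")
```

In this toy data, `TGT-C` is rejected because its approach conflicts with the source, and `TGT-B` is preferred over `TGT-A` because it retains more of the source's detail. Because matching operates on attributes, a classification update would only require maintaining the attribute hierarchy rather than remapping every concept built from it.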
We have successfully implemented a sustainable mapping workflow for complex medical vocabularies such as ICD-10-PCS and ICD-O-3, as well as for OPS and LOINC. The approach has reduced the manual workload and intervention required after classification updates, because only individual attributes now need to be mapped rather than the many concepts constructed from them. The precision of the mapping process has also improved significantly, as our algorithm can find nuanced mapping targets in different hierarchy branches. Overall, the approach has enabled us to create comprehensive ontologies that encompass all concepts in a single hierarchy, supporting more efficient and accurate analysis of medical data.