Attribute-based Mapping for Medical Terms

Published: January 16, 2025

# Healthcare

# Data Science

# Big Data

# NLP

We have extensive experience automating mappings between medical coding and classification systems by computationally processing the semantic meaning of their concepts. That meaning can be extracted with NLP methods or taken directly from the classification model when it is explicitly available. Our successful applications include mappings from ICD-10-PCS, LOINC, and ICD-O-3 for the OHDSI OMOP Standardized Vocabularies.

Challenge

When medical data is encoded with classification systems such as SNOMED CT or the ICD variants, the primary challenge is converting and standardizing that data. Manual mapping by medical professionals offers unparalleled quality, but it is often not feasible given the volume of data in real-world datasets. In those cases, a different approach is necessary. By breaking medical concepts down into their atomic attributes, we can align concepts from different classification systems and build maps between them, resulting in a comprehensive ontology compiled from various sources.

Solution

Our solution involves four steps. First, we process source concepts to capture their meaning as sets of attributes and relationships. Next, we represent every concept from both the source and target classifications in that same form, as a set of attributes and relations. We then align concepts between classifications using a common hierarchy of attributes and proxy rules for different case groups. Finally, we create a comprehensive system that encompasses all concepts in a single hierarchy. This approach makes mapping between medical classification systems more efficient and accurate: it reduces the need for manual mapping and increases the precision of the resulting mappings.
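The alignment step can be illustrated with a minimal sketch. It is not our production algorithm: the attribute names, the small hierarchy, the concept codes (`SRC-1`, `T1` to `T4`), and the specificity score are all hypothetical and chosen only to show the idea of matching concepts through a shared attribute hierarchy.

```python
from dataclasses import dataclass

# Hypothetical shared attribute hierarchy: child value -> parent value.
PARENT = {
    "left kidney": "kidney",
    "right kidney": "kidney",
    "kidney": "urinary organ",
    "laparoscopic excision": "excision",
    "open excision": "excision",
    "excision": "procedure",
}


def ancestors(value):
    """Return the value itself followed by all of its ancestors."""
    chain = [value]
    while value in PARENT:
        value = PARENT[value]
        chain.append(value)
    return chain


@dataclass(frozen=True)
class Concept:
    code: str
    attributes: tuple  # tuple of (attribute_name, value) pairs

    def as_dict(self):
        return dict(self.attributes)


def subsumes(target, source):
    """True if every target attribute equals or generalizes the source value."""
    src = source.as_dict()
    for name, target_value in target.attributes:
        if name not in src or target_value not in ancestors(src[name]):
            return False
    return True


def specificity(target):
    """Illustrative score: deeper attribute values count as more specific."""
    return sum(len(ancestors(value)) for _, value in target.attributes)


def best_targets(source, targets):
    """Return the codes of the most specific targets that subsume the source."""
    candidates = [t for t in targets if subsumes(t, source)]
    if not candidates:
        return []
    top = max(specificity(t) for t in candidates)
    return [t.code for t in candidates if specificity(t) == top]


# Hypothetical source concept and target concepts.
source = Concept("SRC-1", (("site", "left kidney"),
                           ("method", "laparoscopic excision")))
targets = [
    Concept("T1", (("site", "kidney"), ("method", "excision"))),
    Concept("T2", (("site", "kidney"), ("method", "laparoscopic excision"))),
    Concept("T3", (("site", "right kidney"), ("method", "excision"))),
    Concept("T4", (("method", "procedure"),)),
]

print(best_targets(source, targets))
```

In this toy setup, `T3` is rejected because "right kidney" does not generalize "left kidney", and `T2` is preferred over `T1` and `T4` because it keeps more of the source's detail. A real implementation would also need the proxy rules for case groups mentioned above, which this sketch leaves out.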

Development Journey

Domain expertise in healthcare (general practice, internal medicine, pathology, neurology, pediatrics, psychiatry, anesthesiology and intensive care)
Extensive experience in standardized medical terminologies (SNOMED, LOINC, ICDs, HCPCS, CPT4, CVX, RxNorm, ATC, NDC) and medical data standards (OMOP CDM, FHIR, UMLS)
Data Science and Data Analytics
Natural Language Processing (NLP)
Relational Database Management (SQL-based)
OHDSI Software Tools (ATLAS, Athena, WhiteRabbit and Rabbit-in-a-Hat, Usagi)
Programming languages: Python, Java, SQL, R

Impact

We have successfully implemented a sustainable mapping workflow for complex medical vocabularies such as ICD-10-PCS and ICD-O-3, as well as for OPS and LOINC. Our approach has reduced the manual workload and intervention required when a classification is updated, because only individual attributes now need to be mapped rather than the many concepts constructed from them. The precision of the mapping process has also improved significantly, as our algorithm can find nuanced mapping targets in different hierarchy branches. Overall, the approach has enabled us to create comprehensive ontologies that encompass all concepts in a single hierarchy, allowing for more efficient and accurate analysis of medical data.
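The maintenance benefit can be sketched in a few lines. The example below is an illustration under stated assumptions, not our actual workflow: the attribute map, the source codes (`A1` to `A3`), and the attribute values are hypothetical. It shows why a new release only creates manual work for attribute values that have never been seen, not for every new concept.

```python
# Hypothetical attribute-level map, curated once:
# (attribute, source value) -> (attribute, target value).
ATTRIBUTE_MAP = {
    ("site", "KIDNEY_L"): ("site", "left kidney"),
    ("site", "KIDNEY_R"): ("site", "right kidney"),
    ("approach", "OPEN"): ("approach", "open"),
    ("approach", "PERC_ENDO"): ("approach", "laparoscopic"),
}


def translate(source_attributes):
    """Translate source attributes; return (translated, unmapped)."""
    translated, unmapped = {}, []
    for name, value in source_attributes.items():
        key = (name, value)
        if key in ATTRIBUTE_MAP:
            target_name, target_value = ATTRIBUTE_MAP[key]
            translated[target_name] = target_value
        else:
            unmapped.append(key)
    return translated, unmapped


def review_queue(release):
    """Collect the attribute values that still need a curator's attention."""
    queue = set()
    for attributes in release.values():
        _, unmapped = translate(attributes)
        queue.update(unmapped)
    return sorted(queue)


# Hypothetical releases of a source classification.
release_v1 = {
    "A1": {"site": "KIDNEY_L", "approach": "OPEN"},
}
release_v2 = {
    **release_v1,
    # New concept built entirely from already-mapped attribute values.
    "A2": {"site": "KIDNEY_R", "approach": "PERC_ENDO"},
    # New concept that introduces one unseen attribute value.
    "A3": {"site": "URETER_L", "approach": "OPEN"},
}

print(review_queue(release_v1))
print(review_queue(release_v2))
```

Here the second release adds two concepts, but only the single unseen value `URETER_L` lands in the review queue; `A2` is translated automatically from attributes that were mapped earlier.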
