OHDSI Medical Vocabulary Development

Data Analytics
Data Science

Every country's healthcare systems and institutions use their own locally applicable classifications. Without a single unified system, this multitude of vocabularies hinders patient data exchange and global healthcare big data analysis, and it slows down research. Researchers and healthcare providers clearly lack a common medical vocabulary model that meets all of their requirements. One of our clients decided to create a comprehensive model representing data from the different sources and ontologies used across medical science and the healthcare industry. Check out how we contributed to this project!


A standard vocabulary should contain a code set of unique identifiers, the respective nomenclature, a thesaurus (synonyms), and a taxonomy (classification). Moreover, all input data in standard medical vocabularies must be represented in the same data format, regardless of origin, so that it can be queried in a standardized manner. This task requires experts in medical ontology engineering with a deep understanding of context.
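To make the four components concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The table and column names follow the OMOP CDM vocabulary tables (concept, concept_synonym, concept_ancestor) in a heavily simplified form. The "DEMO" vocabulary, the concept IDs, the codes, and the helper names build_vocabulary and find_concepts are invented for illustration and do not come from the project itself.

```python
import sqlite3


def build_vocabulary() -> sqlite3.Connection:
    """Create an in-memory vocabulary with simplified OMOP-style tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE concept (
            concept_id    INTEGER PRIMARY KEY,  -- unique identifier (code set)
            concept_name  TEXT NOT NULL,        -- preferred name (nomenclature)
            vocabulary_id TEXT NOT NULL,        -- vocabulary the concept comes from
            concept_code  TEXT NOT NULL         -- code inside that vocabulary
        );
        CREATE TABLE concept_synonym (          -- thesaurus
            concept_id           INTEGER NOT NULL REFERENCES concept(concept_id),
            concept_synonym_name TEXT NOT NULL
        );
        CREATE TABLE concept_ancestor (         -- taxonomy
            ancestor_concept_id   INTEGER NOT NULL,
            descendant_concept_id INTEGER NOT NULL
        );
    """)
    # Illustrative rows only: IDs, codes, and the "DEMO" vocabulary are made up.
    conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?)", [
        (1, "Cardiovascular disease", "DEMO", "C-001"),
        (2, "Hypertension", "DEMO", "C-002"),
    ])
    conn.executemany("INSERT INTO concept_synonym VALUES (?, ?)", [
        (2, "High blood pressure"),
    ])
    conn.executemany("INSERT INTO concept_ancestor VALUES (?, ?)", [(1, 2)])
    conn.commit()
    return conn


def find_concepts(conn: sqlite3.Connection, term: str):
    """Return (concept_id, concept_name) rows whose name or synonym equals term,
    ignoring case."""
    return conn.execute("""
        SELECT DISTINCT c.concept_id, c.concept_name
        FROM concept c
        LEFT JOIN concept_synonym s ON s.concept_id = c.concept_id
        WHERE lower(c.concept_name) = lower(?)
           OR lower(s.concept_synonym_name) = lower(?)
        ORDER BY c.concept_id
    """, (term, term)).fetchall()


if __name__ == "__main__":
    vocabulary = build_vocabulary()
    print(find_concepts(vocabulary, "high blood pressure"))
```

Because every concept lives in the same tables regardless of where it originated, a single query shape works for a preferred name and for a synonym alike, which is the point of a uniform data format.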

  • We first defined criteria for data standardization and transformed the raw data sets to meet them, standardizing the medical terminology in the process.
  • Then we designed a user-friendly interface where transformed data can be searched and queried.
  • After that, we made it possible to browse and navigate hierarchies of classes and abstractions inherent in transformed data.
  • Finally, we provided a human-readable interpretation of prepared data values.
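The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the source systems ("HOSPITAL_A", "HOSPITAL_B"), their local codes, the concept IDs, and the function names standardize, ancestors, and describe are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Concept:
    concept_id: int
    name: str


# Hypothetical standard concepts (IDs invented for this sketch).
CONCEPTS = {
    1: Concept(1, "Cardiovascular disease"),
    2: Concept(2, "Hypertension"),
    3: Concept(3, "Essential hypertension"),
}

# Step 1: standardization criteria expressed as a mapping from
# (source system, local code) to a standard concept ID.
SOURCE_TO_STANDARD = {
    ("HOSPITAL_A", "HTN"): 2,
    ("HOSPITAL_B", "hbp-01"): 2,
    ("HOSPITAL_B", "ehtn"): 3,
}

# Step 3: taxonomy stored as child -> parent links.
PARENT = {2: 1, 3: 2}


def standardize(source: str, code: str) -> Optional[Concept]:
    """Step 2: look up the standard concept for a raw source code."""
    concept_id = SOURCE_TO_STANDARD.get((source, code))
    return CONCEPTS[concept_id] if concept_id is not None else None


def ancestors(concept_id: int) -> List[Concept]:
    """Step 3: walk up the hierarchy, nearest ancestor first."""
    result = []
    current = PARENT.get(concept_id)
    while current is not None:
        result.append(CONCEPTS[current])
        current = PARENT.get(current)
    return result


def describe(source: str, code: str) -> str:
    """Step 4: render a raw code as a human-readable line."""
    concept = standardize(source, code)
    if concept is None:
        return f"{source}:{code} -> unmapped"
    # Build the path from the root of the hierarchy down to the concept.
    path = [c.name for c in reversed(ancestors(concept.concept_id))]
    path.append(concept.name)
    return f"{source}:{code} -> {concept.name} (path: {' > '.join(path)})"


if __name__ == "__main__":
    print(describe("HOSPITAL_B", "ehtn"))
    print(describe("HOSPITAL_A", "XYZ"))
```

Two different local codes ("HTN" and "hbp-01") resolve to the same standard concept, so downstream queries no longer need to know which institution a record came from.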

As a result, we have a clear semantic structure that meets the requirements of the OMOP CDM data standard and the OHDSI (Observational Health Data Sciences and Informatics) vocabulary model. Moreover, the newly created medical ontology can be maintained and extended according to the customer's needs.

Tech Stack
  1. Domain Expertise in Healthcare (General Practice, Internal Medicine, Pathology, Neurology, Pediatrics, Psychiatry, Anesthesiology, Obstetrics and Gynecology)
  2. Data Analytics and ETL
  3. Medical Ontology Engineering (SQL-based)

This is how we ensured comprehensive coverage of massive heterogeneous datasets from around the world for further research and analysis.