Cohort Definition and Building

Published: January 16, 2025
# Healthcare
# Data Science
The OMOP Common Data Model (CDM) is widely recognized as the industry standard for observational health research. It provides a standardized data model that simplifies data integration and sharing across sources, enabling researchers to conduct studies at scale. ATLAS is a central research tool in the OMOP CDM ecosystem, offering a user-friendly interface for querying the data model and creating visualizations. Building well-defined cohorts is a critical first step in any OMOP CDM study: by selecting patients with specific characteristics or conditions, researchers keep their studies focused and relevant and can generate reliable evidence.

Challenge

Cohort definition and building in the OMOP CDM ecosystem is a complex, multistep process that depends on collaboration between domain experts and data engineers. It requires a deep understanding of the research question, the clinical domain, and the available data sources. Domain experts select the appropriate clinical concepts, create inclusion and exclusion criteria, and ensure that the cohort definition aligns with the research question. Data engineers need a thorough understanding of the OMOP CDM data model, including its tables and their relationships, as well as proficiency in SQL and familiarity with the ETL (extract, transform, load) process.
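To make the division of work concrete, the following is a minimal sketch of how a cohort definition might translate into SQL over OMOP CDM tables, run here through Python's built-in sqlite3 module. It is not how ATLAS generates cohorts. The table and column names (person, condition_occurrence, concept_ancestor) follow the OMOP CDM, but the tables are cut down to a few columns, and the concept IDs (1000, 1001, 2000), the sample patients, and the function names build_cohort and make_demo_db are hypothetical, made up for illustration.

```python
import sqlite3

# Assumption: simplified OMOP CDM tables with only the columns used here.
# Concept IDs 1000/1001/2000 are invented, not real vocabulary concepts.
COHORT_SQL = """
WITH concept_set AS (          -- target concept plus all of its descendants
    SELECT descendant_concept_id AS concept_id
    FROM concept_ancestor
    WHERE ancestor_concept_id = :include_id
),
index_events AS (              -- earliest qualifying condition per person
    SELECT co.person_id, MIN(co.condition_start_date) AS cohort_start_date
    FROM condition_occurrence co
    JOIN concept_set cs ON cs.concept_id = co.condition_concept_id
    GROUP BY co.person_id
)
SELECT ie.person_id, ie.cohort_start_date
FROM index_events ie
JOIN person p ON p.person_id = ie.person_id
WHERE CAST(substr(ie.cohort_start_date, 1, 4) AS INTEGER) - p.year_of_birth >= :min_age
  AND NOT EXISTS (             -- exclusion: prior record of the excluded condition
      SELECT 1 FROM condition_occurrence ex
      WHERE ex.person_id = ie.person_id
        AND ex.condition_concept_id = :exclude_id
        AND ex.condition_start_date < ie.cohort_start_date
  )
ORDER BY ie.person_id
"""

def build_cohort(conn, include_id, exclude_id, min_age):
    """Return (person_id, cohort_start_date) rows for the sketched definition."""
    params = {"include_id": include_id, "exclude_id": exclude_id, "min_age": min_age}
    return conn.execute(COHORT_SQL, params).fetchall()

def make_demo_db():
    """Create an in-memory database with a handful of invented patients."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
    CREATE TABLE concept_ancestor (ancestor_concept_id INTEGER,
                                   descendant_concept_id INTEGER);
    CREATE TABLE condition_occurrence (person_id INTEGER,
                                       condition_concept_id INTEGER,
                                       condition_start_date TEXT);
    INSERT INTO person VALUES (1, 1960), (2, 1975), (3, 2010), (4, 1980);
    -- concept_ancestor lists each concept as its own ancestor as well
    INSERT INTO concept_ancestor VALUES (1000, 1000), (1000, 1001);
    INSERT INTO condition_occurrence VALUES
        (1, 1000, '2020-03-01'), (1, 1001, '2019-06-15'),
        (2, 2000, '2018-01-01'), (2, 1001, '2021-02-10'),
        (3, 1000, '2022-05-05'),
        (4, 2000, '2023-01-01'), (4, 1000, '2022-07-07');
    """)
    return conn

if __name__ == "__main__":
    for row in build_cohort(make_demo_db(), 1000, 2000, 18):
        print(row)
```

In this sketch the domain expert's decisions are the three parameters (which concept to include, which to exclude, and the minimum age at index), while the data engineer's work is the query that joins the tables correctly. On the sample data, person 2 is dropped because of an earlier exclusion record and person 3 for being under 18 at the index date.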

Solution

Our team comprises senior OMOP CDM developers who have been working on the project since its inception. We have extensive experience participating in and leading development workgroups, and we have been involved in designing, promoting, and implementing changes within the OMOP CDM. Our full lifecycle support covers proposing design changes, implementing improvements, and carrying research through to completion using the resulting enhancements. With our expertise in the OMOP CDM, the medical domain, and data science, we are equipped to guide and support you throughout the entire process of transforming raw data into completed research.

Development Journey

  • Domain expertise in healthcare (general practice, internal medicine, pathology, neurology, pediatrics, psychiatry, anesthesiology and intensive care)
  • Extensive experience in standardized medical terminologies (SNOMED, LOINC, ICDs, HCPCS, CPT4, CVX, RxNorm, ATC, NDC) and medical data standards (OMOP CDM, SSSOM, FHIR, UMLS)
  • Data Science and Data Analytics
  • Natural Language Processing (NLP)
  • Relational Database Management (SQL-based)
  • OHDSI Software Tools (ATLAS, Athena, WhiteRabbit and Rabbit-in-a-Hat, Usagi, etc.)
  • Programming languages: Python, Java, SQL, R

Impact

Our team's expertise in defining and building concept sets and cohorts in ATLAS has enabled us to create well-defined cohorts for our clients. This has facilitated successful observational studies within the OMOP CDM ecosystem, leading to evidence-based insights. With our full lifecycle support, from proposing design changes to completing the research, our clients achieve their research goals with confidence and efficiency.
