Predictive Modeling of Cancer Progression

Solution for lung cancer and lymphoma patients

Published: April 2, 2024
# Healthcare
# Data Science
Health data is becoming highly digitized, and data science and predictive modeling are stepping in to facilitate precision oncology. One of our clients decided to turn large volumes of unstructured medical data into a predictive modeling solution for cancer progression. Because this kind of work depends on seasoned professionals in medical data engineering, data analysis, and machine learning (ML), we took on the challenge. Read on to discover how we built a cancer progression prediction solution for lung cancer and lymphoma patients.

Challenge

There are two principal challenges in applying ML to cancer progression modeling: having enough phenotypically rich data to train models, and professional validation of the generated insights. With both in place, we had all the ingredients to tackle the scientific problem and achieve the objective.

Solution

  • In most cases, medical data is highly disorganized. The first step was therefore to convert scattered medical definitions into a strict machine-readable form, drawing on extensive healthcare domain expertise, SQL, and Python-based NLP tools.
  • Next, we applied deep learning (CNN-based models) to make predictions.
  • Finally, we presented the resulting models in a human-readable format using R and a Shiny application.
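As a rough illustration of the first step, the sketch below normalizes free-text diagnosis strings and maps them to a single standard label. It is a minimal example under stated assumptions, not the pipeline used in the project: the `SYNONYMS` table, its labels, and the `normalize` and `map_term` helpers are all hypothetical, and a real project would map to a controlled medical vocabulary instead of a hand-written dictionary.

```python
import re
from typing import Optional

# Hypothetical lookup table: normalized free-text variants -> one standard label.
# The labels are placeholders, not codes from any real vocabulary.
SYNONYMS = {
    "nsclc": "non_small_cell_lung_cancer",
    "non small cell lung cancer": "non_small_cell_lung_cancer",
    "hodgkin lymphoma": "hodgkin_lymphoma",
    "hodgkins disease": "hodgkin_lymphoma",
}


def normalize(text: str) -> str:
    """Lowercase, drop apostrophes, turn other punctuation into spaces,
    and collapse repeated whitespace."""
    text = text.lower()
    text = re.sub(r"['\u2019]", "", text)       # "Hodgkin's" -> "hodgkins"
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # "non-small" -> "non small"
    return re.sub(r"\s+", " ", text).strip()


def map_term(raw: str) -> Optional[str]:
    """Return the standard label for a raw term, or None if it is unknown."""
    return SYNONYMS.get(normalize(raw))


if __name__ == "__main__":
    for raw in ["Non-Small Cell Lung Cancer", "  NSCLC ", "Hodgkin's disease", "fracture"]:
        print(repr(raw), "->", map_term(raw))
```

Returning `None` for unmapped terms keeps unknown strings visible, so they can be routed to a domain expert for review instead of being silently dropped.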

Development Journey

  1. Domain Expertise in Healthcare
  2. Natural Language Processing (NLP)
  3. Relational Database Management (SQL-based)
  4. Data Science and Data Analytics
  5. Programming languages: R, Python, SQL
  6. Machine Learning
  7. Predictive Analysis

Impact

We created a predictive model to forecast the risk of disease progression (e.g., relapse, protracted clinical course) for patients with lung cancer and lymphoma. After transforming raw data to the OMOP CDM data standard, we defined the required cohorts. Based on these cohorts, we applied CNN-based deep learning models to the data collected for patients during the "time at risk" window to predict cancer outcomes.
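To make the "time at risk" idea concrete, here is a minimal sketch of how a binary outcome label could be derived for one patient. It assumes one common reading of the term: the window is defined relative to the patient's cohort entry (index) date, and the label records whether an outcome event falls inside it. The `label_outcome` function and the default offsets of 1 and 365 days are illustrative assumptions, not values from the project.

```python
from datetime import date, timedelta
from typing import Iterable


def label_outcome(
    index_date: date,
    outcome_dates: Iterable[date],
    start_offset_days: int = 1,    # assumed: window opens the day after cohort entry
    end_offset_days: int = 365,    # assumed: window closes one year after cohort entry
) -> int:
    """Return 1 if any outcome date falls inside the time-at-risk window
    (both ends inclusive), otherwise 0."""
    window_start = index_date + timedelta(days=start_offset_days)
    window_end = index_date + timedelta(days=end_offset_days)
    return int(any(window_start <= d <= window_end for d in outcome_dates))


if __name__ == "__main__":
    entry = date(2020, 1, 1)
    print(label_outcome(entry, [date(2020, 6, 1)]))   # event inside the window
    print(label_outcome(entry, [date(2021, 6, 1)]))   # event after the window
    print(label_outcome(entry, []))                   # no recorded events
```

Events before the index date or after the window closes do not count toward the label, which keeps the prediction target tied to a fixed, comparable follow-up period for every patient in the cohort.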

As a result, we contributed to the early diagnosis of lung cancer and lymphoma with an ML-driven cancer progression predictive model.
