Predictive Modeling of Cancer Progression

Solution for lung cancer and lymphoma patients
Data Analytics
Predictive Analytics
Data Science
Machine Learning

Health data is becoming increasingly digitized, and data science and predictive modeling are entering the picture to enable precision oncology. One of our clients decided to turn large volumes of unstructured medical data into a predictive modeling solution for cancer progression. Because a project like this depends on seasoned professionals in medical data engineering, data analysis, and machine learning (ML), we took on the challenge. Read on to discover how we built a cancer progression prediction solution for lung cancer and lymphoma patients.


There are two principal challenges in applying ML to cancer progression modeling: having enough phenotypically rich data to train models, and having the generated insights validated by professionals. With both in place, we had all the ingredients to tackle the scientific problem and achieve the objective.

  • In most cases, medical data is highly disorganized, so the first step was to convert scattered medical definitions into a strict machine-readable form, drawing on extensive healthcare domain expertise, SQL, and Python-based NLP tools.
  • Next, we implemented deep learning models (CNN-based) to make predictions.
  • Finally, we visualized the resulting models in a human-readable format using R and a Shiny application.
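To illustrate the first step, here is a minimal Python sketch of mapping free-text diagnosis mentions to standardized concept codes. It is not the project's actual NLP pipeline: the synonym table, the `extract_concepts` helper, and the concept IDs are hypothetical placeholders chosen for the example, and a real system would also need to handle negation, abbreviations, and misspellings.

```python
import re

# Hypothetical synonym table. The numeric IDs are placeholders,
# not real standardized-vocabulary concept identifiers.
SYNONYMS = {
    "lung cancer": 1001,
    "carcinoma of lung": 1001,
    "nsclc": 1001,
    "lymphoma": 2001,
    "hodgkin lymphoma": 2002,
}

def normalize(text):
    """Lowercase the text, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def extract_concepts(note):
    """Return the sorted placeholder concept IDs mentioned in a clinical note."""
    cleaned = f" {normalize(note)} "
    found = set()
    # Match longer terms first so "hodgkin lymphoma" is not also
    # counted as the more generic "lymphoma".
    for term in sorted(SYNONYMS, key=len, reverse=True):
        padded = f" {term} "
        if padded in cleaned:
            found.add(SYNONYMS[term])
            cleaned = cleaned.replace(padded, " ")  # mask the matched span
    return sorted(found)

concepts = extract_concepts("Pt. with Hodgkin lymphoma; prior NSCLC.")
```

Matching whole, space-delimited terms keeps the lookup simple and predictable. The trade-off is that it ignores context, so a phrase such as "no lung cancer" would still match.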

We created a predictive model to forecast the risk of disease progression (e.g., relapse, protracted clinical course) for patients with lung cancer and lymphoma. After transforming the raw data to the OMOP CDM data standard, we defined the required cohorts. We then applied CNN-based deep learning models to the data collected for those patients to predict cancer outcomes during the "time at risk" window.
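The "time at risk" idea can be sketched in a few lines of Python: each patient in a cohort has an index date, and the label is positive if an outcome is recorded inside a window that follows it. The window bounds (1 to 365 days), the function names, and the toy data below are assumptions made for illustration, not the settings used in the project.

```python
from datetime import date, timedelta

# Assumed window: 1 to 365 days after the cohort index date (illustrative only).
RISK_START_DAYS = 1
RISK_END_DAYS = 365

def label_outcome(index_date, outcome_dates,
                  start_days=RISK_START_DAYS, end_days=RISK_END_DAYS):
    """Return 1 if any outcome date falls inside the time-at-risk window, else 0."""
    window_start = index_date + timedelta(days=start_days)
    window_end = index_date + timedelta(days=end_days)
    return int(any(window_start <= d <= window_end for d in outcome_dates))

def build_labels(cohort, outcomes):
    """Map each patient in the cohort to a binary progression label."""
    return {pid: label_outcome(idx, outcomes.get(pid, []))
            for pid, idx in cohort.items()}

# Hypothetical toy data: patient ID -> index date, patient ID -> outcome dates.
cohort = {"p1": date(2020, 1, 1), "p2": date(2020, 3, 15), "p3": date(2020, 6, 1)}
outcomes = {"p1": [date(2020, 7, 4)], "p2": [date(2021, 9, 1)]}

labels = build_labels(cohort, outcomes)
```

In this toy data, only the first patient has an outcome inside the window. Labels built this way would serve as the training targets for a model such as the CNN described above.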

Tech Stack
  1. Domain Expertise in Healthcare
  2. Natural Language Processing (NLP)
  3. Relational Database Management (SQL-based)
  4. Data Science and Data Analytics
  5. Programming languages: R, Python, SQL
  6. Machine Learning
  7. Predictive Analysis

As a result, we contributed to the early diagnosis of lung cancer and lymphoma with an ML-driven predictive model of cancer progression.
