bsc-life/Explainable_Synthetic_Data_Generation_Medulloblastoma

BSC IPC

The goal of the iPC project is to collect, standardize and harmonize existing clinical knowledge and medical data and, with the help of artificial intelligence, create treatment models for patients.

BSC (Barcelona Supercomputing Center) is the largest research center in Spain and hosts one of the largest supercomputers in Europe. The mission of its Life Sciences department is to understand living organisms by means of theoretical and computational methods (molecular modeling, genomics, proteomics).

Abstract

Synthetic data generation is emerging as a dominant solution for personalized medicine because it addresses two critical challenges: producing the data volumes needed to deliver accurate results, and complying with increasingly restrictive privacy regulations. Both are pressing in paediatric cancer research. Here we introduce an explainable VAE (variational autoencoder) for synthetic data generation for medulloblastoma, a childhood brain tumor. Our model can be used to augment and interpolate available data with synthetic instances, which are automatically annotated with confidence scores to assess the reliability of augmented data points and interpolated paths. The model is transparent in that it matches the learned latent variables with distinct gene expression patterns. We leverage both the synthetic data generation ability and the explainability features of our model to study the unknown relationship between the G3 and G4 subgroups of medulloblastoma and to identify an intermediate subgroup with a specific gene signature.
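To make the idea of an "interpolated path" concrete, here is a minimal sketch of straight-line interpolation between two latent codes. It is an illustration only, not the code in this repository: the function name `interpolate_latent`, the 2-D latent vectors, and the choice of linear interpolation are all assumptions. In a full pipeline each point on the path would be passed through the trained VAE decoder to obtain a synthetic expression profile; that step is omitted here.

```python
from typing import List, Sequence


def interpolate_latent(
    z_start: Sequence[float], z_end: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` evenly spaced points on the line from z_start to z_end.

    Both endpoints are included. This helper is hypothetical and only
    illustrates the general technique of latent-space interpolation.
    """
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")

    path = []
    for i in range(steps):
        t = i / (steps - 1)  # interpolation weight, from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Made-up 2-D latent codes standing in for one G3 and one G4 sample.
z_g3 = [0.0, 2.0]
z_g4 = [4.0, 0.0]

path = interpolate_latent(z_g3, z_g4, steps=5)
for point in path:
    print(point)
```

The middle point of the path is the average of the two endpoints, which is where an intermediate profile between the two subgroups would be decoded under these assumptions.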

Setup

To reproduce the results reported in the paper, set up a conda environment from the provided environment.yaml, then run the experiments using the provided makefile. First create and activate the environment, replacing ENV_NAME with the environment name defined in environment.yaml:

conda env create --file environment.yaml
conda activate ENV_NAME

About

Explainable synthetic data generation for paediatric cancer research
