Skip to content

A classifier for tumor microenvironment subtype based on ensemble machine learning models

License

Notifications You must be signed in to change notification settings

LiaoWJLab/TMEclassifier

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TMEclassifier

TMEclassifier is an R package to perform tumor microenvironment classification based on TME characteristics of gastric cancer.

1.Introduction

  • 1.TMEclassifier was designed to classify the tumor microenvironment (TME) of gastric cancer and even other cancers.
  • 2.This package consists an ensemble classification model integrating 6 machine learning algorithms:Support Vector Machine (SVM), Random Forest (RF), Neural Networks (NNET), k-Nearest Neighbor (KNN), Decision Tree (DecTree), eXtreme Gradient Boosting (XGBoost).
  • 3.TMEclassifier identifies three TME-clusters based on expression profiles of 134 TME-related genes and ensemble models.
  • 4.In addition, TMEclassifier provides functions for multi-scale visualization of TMEcluster and some functions depend on IOBR, which was developed by our team previously. The research about IOBR can be reached by this link.

Graphical abstract for construction and clinical application of TMEclassifier

TMEclassifier logo

2.Installation

It is essential that you have R 3.6.3 or above already installed on your computer or server. Before installing TMEclassifier, please install all dependencies by executing the following command in R console:

The dependencies includes caret, e1071, crayon, ggplot2, scales, tibble, IOBR, ggplot2 and ggpubr.

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")

depens<-c("crayon", "ggplot2", "scales", "tibble", "caret", "e1071", "randomForest", "xgboost", "ggpp", "kernlab", "ComplexHeatmap", "survminer", "ggpubr")
for(i in 1:length(depens)){
  depen<-depens[i]
  if (!requireNamespace(depen, quietly = TRUE))
    BiocManager::install(depen,update = FALSE)
}

The package is not yet on CRAN or Bioconductor. You can install it from Github:

if (!requireNamespace("TMEclassifier", quietly = TRUE))
  devtools::install_github("LiaoWJLab/TMEclassifier")
#> Warning: 程辑包'tidyHeatmap'是用R版本4.2.3 来建造的

Library R packages

library(TMEclassifier) 

3.Usage

data("eset_example1")
res<-tme_classifier(eset = t(eset_example1), method = "ensemble", scale = T)
#> Step-1: Expression data preprocessing...
#> >>> There are no missing values
#> 
#> Step-2: TME deconvolution...
#> >>> This step was skipped, user can set parameter `tme_deconvolution` to TRUE or provide TME data to realize prediction.
#> 
#> Step-3: Predicting TME phenotypes...
#> >>>-- Scaling data...
#> >>>--- Ensemble Model was used to predict TME phenotypes...
#> [17:26:58] WARNING: amalgamation/../src/learner.cc:1040: 
#>   If you are loading a serialized model (like pickle in Python, RDS in R) generated by
#>   older XGBoost, please export the model by calling `Booster.save_model` from that version
#>   first, then load it back in current version. See:
#> 
#>     https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html
#> 
#>   for more details about differences between saving model and serializing.
#> 
#> [17:26:58] WARNING: amalgamation/../src/learner.cc:749: Found JSON model saved before XGBoost 1.6, please save the model using current version again. The support for old JSON model will be discontinued in XGBoost 2.3.
#> 
#> >>>--- DONE!
head(res)
#>           ID         IE         IS         IA TMEcluster
#> 1 GSM1523727 0.14378383 0.09860092 0.75761525         IA
#> 2 GSM1523728 0.00769956 0.10048409 0.89181635         IA
#> 3 GSM1523729 0.86870675 0.10316410 0.02812915         IE
#> 4 GSM1523744 0.04996065 0.06216853 0.88787082         IA
#> 5 GSM1523745 0.05508487 0.81613011 0.12878502         IS
#> 6 GSM1523746 0.56659350 0.38249992 0.05090658         IE
table(res$TMEcluster)
#> 
#>  IA  IE  IS 
#> 108  93  99

4.Visualization

Functions applied to visualization were depends on the IOBR R package. Users can install it from Github:

if (!requireNamespace("IOBR", quietly = TRUE))
  devtools::install_github("IOBR/IOBR")
library(IOBR)

Combining TMEcluster data and phenotype data.

data("pdata_example")
input<-inner_join(res, pdata_example, by = "ID")
input[1:5, 1:8]
#>           ID         IE         IS         IA TMEcluster ProjectID  Technology
#> 1 GSM1523727 0.14378383 0.09860092 0.75761525         IA  GSE62254 Affymetrix 
#> 2 GSM1523728 0.00769956 0.10048409 0.89181635         IA  GSE62254 Affymetrix 
#> 3 GSM1523729 0.86870675 0.10316410 0.02812915         IE  GSE62254 Affymetrix 
#> 4 GSM1523744 0.04996065 0.06216853 0.88787082         IA  GSE62254 Affymetrix 
#> 5 GSM1523745 0.05508487 0.81613011 0.12878502         IS  GSE62254 Affymetrix 
#>         platform
#> 1 HG-U133_Plus_2
#> 2 HG-U133_Plus_2
#> 3 HG-U133_Plus_2
#> 4 HG-U133_Plus_2
#> 5 HG-U133_Plus_2

Box plot

cols<- c('#fc0d3a','#ffbe0b','#2692a4')
p1<-sig_box(data = input, signature = "TMEscore", variable = "TMEcluster", cols = cols, hjust = 0.5)
#> # A tibble: 3 × 8
#>   .y.       group1 group2        p    p.adj p.format p.signif method  
#>   <chr>     <chr>  <chr>     <dbl>    <dbl> <chr>    <chr>    <chr>   
#> 1 signature IA     IE     3.66e-24 1.10e-23 < 2e-16  ****     Wilcoxon
#> 2 signature IA     IS     1.18e-17 2.40e-17 < 2e-16  ****     Wilcoxon
#> 3 signature IE     IS     9.44e- 6 9.40e- 6 9.4e-06  ****     Wilcoxon
#> Warning: Removed 1 rows containing non-finite values (`stat_boxplot()`).
#> Warning: Removed 1 rows containing non-finite values (`stat_signif()`).
#> Warning: Removed 1 rows containing non-finite values (`stat_compare_means()`).
p2<-sig_box(data = input, signature = "T.cells.CD8", variable = "TMEcluster", cols = cols, hjust = 0.5)
#> # A tibble: 3 × 8
#>   .y.       group1 group2        p    p.adj p.format p.signif method  
#>   <chr>     <chr>  <chr>     <dbl>    <dbl> <chr>    <chr>    <chr>   
#> 1 signature IA     IE     1.06e- 9 2.10e- 9 1.1e-09  ****     Wilcoxon
#> 2 signature IA     IS     7.54e-18 2.3 e-17 < 2e-16  ****     Wilcoxon
#> 3 signature IE     IS     1.09e- 3 1.1 e- 3 0.0011   **       Wilcoxon

KM-plot

library(survminer)
p3<-surv_cluster(input_pdata     = input,
                 target_group    = "TMEcluster",
                 time            = "OS_time",
                 status          = "OS_status",
                 project         = "ACRG",
                 cols            = c('#fc0d3a','#ffbe0b','#2692a4'),
                 save_path       = paste0("./man/figures"))
#>  IA  IE  IS 
#> 108  93  99

Distribution of TMEcluster and molecular subtypes

# install.package("remotes")   #In case you have not installed it.
if (!requireNamespace("ggpie", quietly = TRUE)) 
  remotes::install_github("showteeth/ggpie")
library(ggpie)
p4<-ggdonut(data = input, group_key = "TMEcluster", count_type = "full",
        label_info = "ratio", label_type = "circle", label_split = NULL,
        label_size = 5, label_pos = "in",  donut.label.size = 4)+
  scale_fill_manual(values = cols)

p5<-ggdonut(data = input, group_key = "Subtype", count_type = "full",
            label_info = "ratio", label_type = "circle", label_split = NULL,
            label_size = 5, label_pos = "in",  donut.label.size = 4)+
  scale_fill_manual(values = palettes(palette = "nrc", show_col = FALSE, show_message = FALSE))
#########################################
p6<-ggnestedpie(data                  = input, 
                group_key             = c("TMEcluster", "Subtype"),
                count_type            = "full",
                inner_label_info      = "ratio",
                inner_label_split     = NULL,
                inner_labal_threshold = 5, 
                inner_label_size      = 3, 
                outer_label_type      = "circle",
                outer_label_size      = 5,
                outer_label_pos       = "in", 
                outer_label_info      = "count")+
  scale_fill_manual(values = cols)

Combination of plots

if (!requireNamespace("patchwork", quietly = TRUE)) 
  install.packages("patchwork")
library(patchwork)
p<-(p1|p2|p3)/(p4|p5|p6)
p + plot_annotation(tag_levels = 'A')
#> Warning: Removed 1 rows containing non-finite values (`stat_boxplot()`).
#> Warning: Removed 1 rows containing non-finite values (`stat_signif()`).
#> Warning: Removed 1 rows containing non-finite values (`stat_compare_means()`).

5.Phenotype of TMEclusters defined by gastric cancer.

TMEcluster logo

References

1.Zeng D, Yu Y, Qiu W, Mao Q, …, Zhang K, Liao W; Tumor microenvironment immunotyping heterogeneity reveals distinct molecular mechanisms to clinical immunotherapy applications in gastric cancer. (2023) Under Review.

2.Zeng D, Li M, …, Liao W; Tumor microenvironment characterization in gastric cancer identifies prognostic and imunotherapeutically relevant gene signatures. Cancer Immunology Research, 2019, 7(5), 737-750. DOI: 10.1158/2326-6066.CIR-18-0436, PMID: 30842092

3.Zeng D, Ye Z, Sheng R, Yu G, Xiong Y, …, Liao W; IOBR: Multi-omics Immuno-Oncology Biological Research to decode tumor microenvironment and signatures. Frontiers in Immunology, 2021, 12:687975. DOI: 10.3389/fimmu.2021.687975, PMID: 34276676

Reporting bugs

Please report bugs to the Github issues page

E-mail any questions to [email protected]

About

A classifier for tumor microenvironment subtype based on ensemble machine learning models

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages