Skip to content

Benchmark of existing cell type lable transfer methods and analysis on how the careful selection of the reference data can improve the predictions.

Notifications You must be signed in to change notification settings

HaghverdiLab/CelltypeLabelTransfer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Adjustments to the reference dataset design improve cell type label transfer

This is the GitHub repository holding the scripts and result files for the paper "Adjustments to the reference dataset design improve cell type label transfer" by Moelbert et al. The preprint can be found on Frontiers (ttps://doi.org/10.3389/fbinf.2023.1150099).

The transfer of cell type labels from prior annotated (reference) to newly collected data is an important task in single-cell data analysis. As the number of publicly available annotated datasets which can be used as a reference, as well as the number of computational methods for cell type label transfer are constantly growing, rationals to understand and decide which reference design and which method to use for a particular query dataset is needed. Here, we benchmark a set of five popular cell type annotation methods, study the performance on different cell types and highlight the importance of the design of the reference data (number of cell samples for each cell type, inclusion of multiple datasets in one reference, gene set selection, etc.) for more reliable predictions.

We introduce a weighted bootstrapping-based approach that allows to make use of the entrie reference dataset while still keeping the benefits from not under-representing any of the existing cell types.

bootstrapping_algorithm

The approach can be applied in combination with any lable transfer method and is especially helpful when working with model-based approaches.

About

Benchmark of existing cell type lable transfer methods and analysis on how the careful selection of the reference data can improve the predictions.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages