I have been testing the BANKSY pipeline through both the original (SpatialExperiment) pipeline vignettes and this wrapper pipeline, and I believe I have found an error in the wrapper, at least in the implementation currently suggested in the vignette. In the original BANKSY code, computeBanksy stores the BANKSY matrix for either the mean neighborhood (H0 / M0) or the AGF (H1 / M1) in a new SpatialExperiment assay (equivalent to a Seurat layer), where H0 is kept separately from H1.
In the Seurat wrapper, RunBanksy creates data (and scale.data) layers that each contain a combination of data, and it is unclear whether this is the intended behavior. Each gene present in the original data has a matching row in the new matrix, plus additional matching rows with ".m0" or ".m1" appended for the mean or AGF computations respectively, all stored within the same matrix (e.g. Egr2, Egr2.m0 and Egr2.m1 all in one large matrix).
This gives the BANKSY matrix 2 or 3 times as many "features" as were originally present. When Seurat's RunPCA is then called, all features are selected. This does not match the original BANKSY code, where runBanksyPCA calculates the PCA values only on the H0 or H1 BANKSY matrix, depending on whether use_agf is true or false.
I am unsure whether folks on the Seurat side would be the best to pose this issue to or if @jleechung with the original BANKSY code is the correct person for this issue, but hopefully it is something straightforward to correct.
Thank you
Hi @LPotter21, thanks for your interest in BANKSY!
The intended behaviour is what you see in the SeuratWrappers implementation: the original gene-cell matrix is concatenated with the neighborhood feature matrices (H0 if use_agf=FALSE, H0 and H1 if use_agf=TRUE) to form the BANKSY matrix. The BANKSY matrix, with either 2 or 3 times the number of features, as you note, is used for downstream dimensionality reduction.
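To make the concatenation concrete, here is a minimal, language-agnostic sketch in plain Python. It is not the BANKSY or SeuratWrappers code: the function name stack_banksy_features, the gene names other than Egr2, and all numeric values are hypothetical, the neighborhood values are supplied as inputs rather than computed, and any weighting of the blocks (for example by lambda) is deliberately left out. It only shows how stacking the own-expression rows with the ".m0" (and optionally ".m1") rows yields 2 or 3 times the original number of features.

```python
# Illustrative sketch only: hypothetical names and values, not the BANKSY API.
# Neighborhood rows are given as inputs; block weighting is intentionally omitted.

def stack_banksy_features(own, m0, m1=None):
    """Stack own-expression rows with neighborhood rows into one table.

    own, m0, m1: dicts mapping gene name -> list of per-cell values.
    Returns (feature_names, rows), one row per feature.
    """
    blocks = [("", own), (".m0", m0)]
    if m1 is not None:  # plays the role of use_agf = TRUE in this sketch
        blocks.append((".m1", m1))

    names, rows = [], []
    for suffix, block in blocks:
        for gene in own:  # same gene order in every block
            names.append(gene + suffix)
            rows.append(list(block[gene]))
    return names, rows


# Two genes measured in three cells (made-up numbers).
own = {"Egr2": [1.0, 0.0, 2.0], "Gfap": [0.0, 3.0, 1.0]}
m0 = {"Egr2": [0.5, 1.0, 1.5], "Gfap": [2.0, 1.0, 0.5]}
m1 = {"Egr2": [0.1, 0.2, 0.3], "Gfap": [0.4, 0.5, 0.6]}

names_no_agf, rows_no_agf = stack_banksy_features(own, m0)
names_agf, rows_agf = stack_banksy_features(own, m0, m1)

print(names_no_agf)  # own rows followed by .m0 rows
print(names_agf)     # own rows, then .m0 rows, then .m1 rows
```

In this sketch, running a PCA on all rows of the stacked table corresponds to the "all features are selected" behaviour described in the question.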
The original BANKSY code implements this as well: while the neighborhood feature matrices are stored in different slots in SpatialExperiment, runBanksyPCA constructs the concatenated matrix by calling getBanksyMatrix.