Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing cell types in the results #14

Closed
jracle85 opened this issue Jul 18, 2019 · 12 comments
Closed

Missing cell types in the results #14

jracle85 opened this issue Jul 18, 2019 · 12 comments

Comments

@jracle85
Copy link

Hello,
First, congrats for your paper and thank you for including some additional discussion in it, as I had suggested. It's great to see that EPIC was top-performing in this independent benchmark :-)

It's also cool that you wrote this package so that people can easily use the one or other method, even though this might hide some options that the original packages are proposing to their users, like the use of their own cell reference profiles.

After testing your package, I've seen an issue: some cell types predicted by a given method are missing from the results. In particular, for EPIC, when doing the deconvolution for non-tumor tissue, we also predict the neutrophils but they aren't returned. And both for the tumor and non-tumor cases, EPIC also adds the predictions of the "otherCells". This is a novel and very important feature that has been introduced by EPIC, representing mostly the cancer cells or other cell types without reference profiles that are present in the bulk sample. We feel that these cell types should thus be added in your cell_type_mapping file to include all cell types predicted by EPIC.

Similarly, for xCell, I understand that you didn't want to output all cell subtypes due to the potential spillover effects affecting this method. But this also hides the fact that xCell would also return predictions for other cell types. So, it might be good to implement an option allowing the user to select doing the deconvolution for his cell types of interest (the "expected_cell_type" option seem to only work to further restrain the set of cell types, but not to add other cell types absent from the cell_type_mapping table).

Thanks! Cheers,

Julien

@mlist
Copy link
Collaborator

mlist commented Jul 18, 2019 via email

@jracle85
Copy link
Author

Dear Markus,

Thank you for the message. Unfortunately I won't be able to join ISMB next week. But I'll be at BC2 in September, also in Basel. Maybe you'll be there as well?

And sorry, but I see that I posted this message in the wrong repository... I had both repo opened but this issue should of course relate to the package, not to the benchmark where the cell types are fixed...

I'll thus put a link to this also in the package in case someone else is interested by this question (but if you prefer we can also delete it from here and recopy in the other).

Thank. Cheers,

Julien

@grst grst transferred this issue from icbi-lab/immune_deconvolution_benchmark Jul 18, 2019
@grst
Copy link
Collaborator

grst commented Jul 18, 2019

Moved the issue ;)
Will reply in detail later...

@jracle85
Copy link
Author

Great, thanks for moving it; I'm still quite new to GitHub and don't master all of it ;-)

@grst
Copy link
Collaborator

grst commented Jul 18, 2019

Hi Julien,

thanks for your feedback, this is really helpful!

It's also cool that you wrote this package so that people can easily use the one or other method, even though this might hide some options that the original packages are proposing to their users, like the use of their own cell reference profiles.

I believe it would be ideal to access all the methods' options through a consistent interface, however this is pretty challenging due to the conceptual differences of the methods. One example would be to specify custom signatures. This is something that has been proposed to me earlier via email. I now created #15 to discuss how to implement this.

some cell types predicted by a given method are missing from the results. In particular, for EPIC, when doing the deconvolution for non-tumor tissue, we also predict the neutrophils but they aren't returned. And both for the tumor and non-tumor cases, EPIC also adds the predictions of the "otherCells".

Well the reason why they are not included is that first I created this package as helper functions for the benchmark pipeline. Since the scRNA-seq dataset does not include Neutrophils (and all the different xCell types), there was no point in adding them.

However I agree that they should be included now!

As you may have figured out it's all in this excel sheet, and it should be easy to add the additional cell types:
https://github.com/grst/immunedeconv/blob/master/inst/extdata/cell_type_mapping.xlsx

I am hoping to do next week. Maybe you could review the final mapping? What do you think?

Cheers,
Gregor

@jracle85
Copy link
Author

Hi Gregor,

Thanks for the reply.

I am hoping to do next week. Maybe you could review the final mapping? What do you think?

Yes sure, I can check the updated mapping once it's done (but I won't have much time next week anyway). I can check for EPIC and have a look at the other methods as well but I don't know all the details from all the methods.

I believe it would be ideal to access all the methods' options through a consistent interface, however this is pretty challenging due to the conceptual differences of the methods.

Another simple option would be that you add help pages for all the methods, quoting back the full help page of the given function to indicate the various options available to it. You already did kind of that (a simplified help page though) for quanTIseq and Timer where there are separate help documents. The options from each method are kind of already available through your interface through the .... There are nevertheless still two issues that need to be considered:

  • Some of the options of a method are defined based on your other parameters (e.g. for EPIC the referenceoption is depending on your tumor option), so if both options are given, it should be the option specific to the original method that should be used.

  • If a method gets updated with new parameters, you'd then have to also update manually the corresponding help document. Possibly bigger changes in the functions might also need other adaptations to your parameter mapping between the methods.

    • Maybe instead of rewriting the full help page, you could also just give a link to the help document of the function found directly in it's package of origin (when such a help pages exists). Maybe in such a case, you should indicate somewhere the mapping that you did between the parameters such that one knows which parameters of the function of origin are already set by your method.

Cheers,

Julien

@grst
Copy link
Collaborator

grst commented Jul 27, 2019

Hi Julien,

I'm sorry, I didn't manage to update the package during the conference and I'm off to a bikepacking trip for the next four weeks...

But I'll take care of this as soon as I'm back!

grst added a commit that referenced this issue Sep 2, 2019
@grst
Copy link
Collaborator

grst commented Sep 3, 2019

Hi @jracle85,

I added the missing cell types and am working on improving the documentation in #19.

  • I would rather not include EPIC's "otherCells" as an own cell type, because technically it isn't and doesn't really fit in the cell-type tree. The fraction is already available as 1-sum(other_cells) and that way it's consistent with quanTIseq. Instead I would dedicate a new section in the documentation to "absolute scores" explaining that. Is that ok with you?

  • It would be great if you could have a quick look at the new cell type mapping. I now added the Neutrophils and also the remaining immune cell types of xCell. Here is the mapping as a google doc, where you could add comments directly.

And here is the new cell-type tree:
ct-tree

@grst
Copy link
Collaborator

grst commented Sep 3, 2019

As a good deal of the re-mapping deals with xCell signatures, maybe @dviraran could have a quick look at the mapping as well?

  • The mapping is in this google doc
  • Sheet1 defines the cell type tree above
  • Sheet2 defines how the signatures of the different methods are mapped to nodes in the cell-type tree.

Your help would be greatly appreciated!

Cheers,
Gregor

@jracle85
Copy link
Author

jracle85 commented Sep 6, 2019

Hi @grst,

Thanks for the update.

It would be great if you could have a quick look at the new cell type mapping. I now added the Neutrophils and also the remaining immune cell types of xCell. Here is the mapping as a google doc, where you could add comments directly.

The mapping for EPIC cell types seems fine. I didn't check for xCell.

I would rather not include EPIC's "otherCells" as an own cell type, because technically it isn't and doesn't really fit in the cell-type tree. The fraction is already available as 1-sum(other_cells) and that way it's consistent with quanTIseq. Instead I would dedicate a new section in the documentation to "absolute scores" explaining that. Is that ok with you?

I agree that you could find it back by 1 - sum(other_cells). The only drawback of not including a value directly for these in the results, is that I fear that people with less experience working with EPIC or that didn't read the documentation in detail, would miss the point that we are also returning this value, which is a meaningful value (often corresponding directly to the fraction of cancer cells). It is not straightforward that we can obtain the remaining cells by 1 - sum(other_cells): some of the other methods return values for various cell types but without this equation to be meaningful. So, if the value appears in the results directly, then the user will see there is this cell type and if he doesn't understand what it means he can simply check the documentation. But if it is not visible, he will not necessarily think that he could do such a transformation to get the cell fractions from an additional cell type.

If you nevertheless still rather don't include this cell type in the results, please make sure that the documentation clearly states that we can obtain also the fraction of the remaining cells with above's equation.

Thanks. Best,

Julien

@grst
Copy link
Collaborator

grst commented Sep 9, 2019

Hi Julien,

I re-thought this and think you are right. Other cells are now included for both EPIC and quanTIseq in the main result. Wasn't too complicated anyway.

#19 will be merged today, and then it's available.

Cheers and thanks for your input,
Gregor

grst added a commit that referenced this issue Sep 9, 2019
@grst
Copy link
Collaborator

grst commented Sep 9, 2019

Close #14 per #19.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants