-
Notifications
You must be signed in to change notification settings - Fork 608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hvg flavors seurat and cellranger with batch: bug in subset #3042
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3042 +/- ##
=======================================
Coverage 76.31% 76.31%
=======================================
Files 109 109
Lines 12513 12515 +2
=======================================
+ Hits 9549 9551 +2
Misses 2964 2964
|
Looks simple enough! Please deduplicate the tests though, they have too many identical lines. |
To do so did across-setting tests with for loop on top of former test... Would you prefer one separate test with that for loop for the across-settings check? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry for missing this. I’m always for pytest.parametrize
instead of for loops. That way it’s immediately visible which variants of the test fail.
To test in the future that this bug doesnt happen, the 4 combinations of inplace=True/False and subset=True/False need to be compared for their selected Could
# check that the results are consistent for subset True/False: inplace True
adata_post_subset = adatas["subset_False_inplace_True"][
:, adatas["subset_False_inplace_True"].var["highly_variable"]
]
assert adata_post_subset.var_names.equals(
adatas["subset_True_inplace_True"].var_names
)
# check that the results are consistent for subset True/False: inplace False
df_post_subset = dfs["subset_False_inplace_False"][
dfs["subset_False_inplace_False"]["highly_variable"]
]
assert df_post_subset.index.equals(dfs["subset_True_inplace_False"].index)
# check that the results are consistent for inplace True/False: subset True
assert adatas["subset_True_inplace_True"].var_names.equals(
dfs["subset_True_inplace_False"].index
)
what's your preference here? |
Ah, silly me, this makes sense. Since some entries of the dicts are never used, I just removed them and replaced the string with a bool (subset=True/False). This makes it all quite a bit more compact. Also please remember |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a relnote
Timeout and Scrublet failing in Python3.12? |
Ah on main too I see |
coverage decreased, I think not detected because some pytest.parametrize were removed? |
I think codecov just hadn’t updated its comment yet when you saw that. What you say can’t be, it doesn’t matter how a line was hit: If a line is run, it’ll be reported as hit, if your changes would have caused it to no longer be it, it would have been reported as a miss. |
…h: bug in subset
…in subset (#3128) Co-authored-by: Eljas Roellin <[email protected]>
This PR fixes the bug reported in the linked issue.
A new test which spots the erroneous computations has been added.
I would use this chance to refactor the
_highly_variable_genes.py
, rather than using the 2-lines fix suggested in the first commit:Doing the multi-batch hvg flagging differently for seurat_v3 and seurat/cell_ranger is what made this bug hard to spot in the first place I think.