sourmash 3.x protein ksizes must be divisible by 3 #1019
Codecov Report
@@            Coverage Diff             @@
##           master    #1019      +/-   ##
==========================================
- Coverage   92.31%   91.94%   -0.38%
==========================================
  Files          72       72
  Lines        5413     5412       -1
==========================================
- Hits         4997     4976      -21
- Misses        416      436      +20
hi @bluegenes could you explain the rationale for this a little bit more in the main PR text?
@ctb done!
hi @bluegenes as described I merged #1013 after switching ksizes over to 57. I then merged that master back into this PR, and wanted to double check with you that this all looks right. I actually just double checked myself, and it looks like commit 3ec04b5 is now all that's left on this PR, which seems appropriate. Yay! Instead of removing the test that now fails,
@ctb - done! test mirrors the "bad ksize" test for nucleotide input.
great thx! one suggested change.
Co-authored-by: C. Titus Brown <[email protected]>
Before merging, I need to think about whether this violates our command-line interface semantic versioning guarantee :)
I think this may be a specific example of an API-breaking change that is ready to merge but would break 3.x! How exciting :) Briefly, this check was removed several versions ago (December 2019). Removing it didn't violate semantic versioning requirements, since making command-line checks more liberal is fine. Adding it back in could, in theory, break people's workflows. Ugh! So I'll put this in cold storage for a little bit while we settle into our 4.0-targeted groove, viz. #1016.
(i.e. please don't merge :)
merged in #1277.
When I wrote #575 / #576, I thought that protein sigs were calculated at their exact ksize, except when translating (I forgot about this PR until recently). So #576 eliminated the check that the protein ksize is divisible by 3, allowing you to calculate signatures with k=19, which would evaluate to 19/3=6.33, giving a protein ksize of 6.
While this is not terrible, I think disallowing ksizes that are not divisible by 3 is informative about what is happening under the hood, and is thus clearer.
This PR simply reverses what was changed in #576, and removes the unnecessary (and now failing) test. If we change protein ksize calculation in sourmash 4.x, we can add these right back in.
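To make the difference concrete, here is a minimal sketch of the two behaviours described above. The function names `lenient_protein_ksize` and `strict_protein_ksize` are hypothetical and are not part of the sourmash API; the only assumption taken from the description is that, in 3.x, the protein k-mer length is the requested ksize divided by 3.

```python
def lenient_protein_ksize(ksize: int) -> int:
    """Hypothetical: behaviour after #576, silently truncating ksize / 3."""
    return ksize // 3


def strict_protein_ksize(ksize: int) -> int:
    """Hypothetical: behaviour with the check restored, rejecting bad ksizes."""
    if ksize <= 0:
        raise ValueError(f"ksize must be positive, got {ksize}")
    if ksize % 3 != 0:
        # Fail loudly instead of quietly rounding 19 / 3 down to 6.
        raise ValueError(f"protein ksizes must be divisible by 3, got {ksize}")
    return ksize // 3


if __name__ == "__main__":
    print(lenient_protein_ksize(19))  # truncated protein ksize
    print(strict_protein_ksize(57))   # divisible by 3, accepted
    try:
        strict_protein_ksize(19)
    except ValueError as err:
        print(f"rejected: {err}")
```

With the strict variant, a user who asks for k=19 gets an error rather than a signature built at a protein ksize they did not ask for, which is the clarity argument made above.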
make test: Did it pass the tests?
make coverage: Is the new code covered?
without a major version increment. Changing file formats also requires a major version number increment.
changes were made?