Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Olgabot/use x if codon not found #529

Merged
merged 7 commits into from
Sep 17, 2018
Merged

Olgabot/use x if codon not found #529

merged 7 commits into from
Sep 17, 2018

Conversation

olgabot
Copy link
Collaborator

@olgabot olgabot commented Aug 9, 2018

This adjusts the _dna_to_aa code to fallback on the "X" if the codon was not found. Hopefully addresses #502 because std::map::operator[] inserts a new element and returns it, in this case the empty string, and then the protein kmers built on DNA seuqences with "N"s are an incorrect length because they are too short.

Can't get sourmash to install via conda on my mac laptop right now so I haven't been able to run the tests.

  • Is it mergeable?
  • make test Did it pass the tests?
  • make coverage Is the new code covered?
  • Did it change the command-line interface? Only additions are allowed
    without a major version increment. Changing file formats also requires a
    major version number increment.
  • Was a spellchecker run on the source code and documentation after
    changes were made?

@codecov-io
Copy link

codecov-io commented Aug 9, 2018

Codecov Report

Merging #529 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #529      +/-   ##
==========================================
+ Coverage   90.76%   90.76%   +<.01%     
==========================================
  Files          33       33              
  Lines        5011     5013       +2     
  Branches       36       37       +1     
==========================================
+ Hits         4548     4550       +2     
  Misses        463      463
Impacted Files Coverage Δ
sourmash/kmer_min_hash.hh 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12aa76f...78d4693. Read the comment docs.

@olgabot
Copy link
Collaborator Author

olgabot commented Aug 31, 2018

Any updates on this merge?

@@ -174,7 +174,14 @@ public:
unsigned int dna_size = (dna.size() / 3) * 3; // floor it
for (unsigned int j = 0; j < dna_size; j += 3) {
std::string codon = dna.substr(j, 3);
aa += (_codon_table)[codon];
std::map<std::string,std::string>::iterator translated = _codon_table.find(codon);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick comment: you can replace std::map<std::string,std::string>::iterator with auto and avoid having to figure out the proper type (yay C++11)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! Just made the change :)

@luizirber
Copy link
Member

Hi! Why did you choose X to be the replacement? Is that the equivalent to N in amino acid code? I usually try to follow this table for conversions, but I never saw X before...
(and if it is not clear, this is me confessing my ignorance, so any pointers to other refs would be really helpful =] )

@olgabot
Copy link
Collaborator Author

olgabot commented Sep 7, 2018

Yep, X is the equivalent of N! Here's an official-looking (and Web 0.1-looking) reference: https://www.ncbi.nlm.nih.gov/Class/MLACourse/Modules/MolBioReview/iupac_aa_abbreviations.html

@luizirber
Copy link
Member

I think this sounds reasonable, any comments @ctb ?

@luizirber luizirber merged commit 2534989 into sourmash-bio:master Sep 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants