--minFraction too stringent for fragmented genomes? #70
Check this line in the code, it should be clear from there.
Essentially, it checks the ratio of the shared genome length to the length of the smaller of the two genomes being compared. In older versions, FastANI applied an absolute cutoff on shared genome length before trusting the ANI value, which was not ideal because the cutoff should depend on the genome lengths.
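To make the description above concrete, here is a minimal sketch of a relative cutoff of this kind. It is not FastANI's actual code; the function name `passes_min_fraction` and its parameters are hypothetical and chosen only for illustration.

```python
def passes_min_fraction(shared_len, query_len, ref_len, min_fraction):
    """Return True if the shared length is a large enough fraction of the
    smaller genome. All names here are hypothetical, not FastANI's API."""
    smaller = min(query_len, ref_len)
    if smaller <= 0:
        raise ValueError("genome lengths must be positive")
    return shared_len / smaller >= min_fraction


# Illustrative numbers only: 400 kb shared between a 500 kb and a 5 Mb genome.
print(passes_min_fraction(400_000, 500_000, 5_000_000, min_fraction=0.5))
```

Because the denominator is the smaller genome, a small genome that is mostly contained in a much larger one can still pass, which an absolute cutoff on shared length would not guarantee.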
So it is shared length over the length of the smallest genome, rather than shared length over the length of the smallest genome's hashed fragments, is that right? If so, when a significant portion of the smallest genome is in small contigs, that portion cannot count in the numerator but is still included in the denominator. Does that make sense? Should it be changed somehow?
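The concern can be illustrated with a small sketch. It assumes, purely for illustration, that each contig is cut into non-overlapping fragments of a fixed length (3,000 bp here) and that contigs or contig tails shorter than one fragment are never hashed. The contig lengths and the helper names `fragmentable_length` and `shared_fractions` are made up for this example.

```python
FRAG_LEN = 3000  # assumed fragment length for this illustration


def fragmentable_length(contig_lengths, frag_len=FRAG_LEN):
    """Total length covered by whole fragments; short contigs contribute 0."""
    return sum((n // frag_len) * frag_len for n in contig_lengths)


def shared_fractions(contig_lengths, matched_fragments, frag_len=FRAG_LEN):
    """Return (shared / total genome length, shared / fragmentable length)."""
    shared = matched_fragments * frag_len
    total = sum(contig_lengths)
    usable = fragmentable_length(contig_lengths, frag_len)
    if usable == 0:
        raise ValueError("no contig is long enough to yield a fragment")
    return shared / total, shared / usable


# Hypothetical fragmented genome: one 10 kb contig plus several short ones.
contigs = [10_000, 2_500, 2_500, 2_000, 1_500]
over_total, over_usable = shared_fractions(contigs, matched_fragments=3)
print(f"shared / total length:        {over_total:.3f}")
print(f"shared / fragmentable length: {over_usable:.3f}")
```

In this made-up case every fragment matches, yet the ratio against total genome length falls just under 0.5 because the short contigs only appear in the denominator.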
I think I understand your point. Are you able to attach the fna files here? I'll take a look.
I've sent the MAGs to you over email just now.
The 'test_fraglen' test is now commented out because it is no longer valid. See also: ParBLiSS/FastANI#70 Reported by: @apcamargo
Apologies for the delay in my response. I've revised the master branch to fix this. It is now working on this example.
Thanks. You can use the genomes as public test cases if that is helpful, as galah has done.
On Thursday, August 20, 2020 at 3:22 PM, Chirag Jain wrote:
Apologies for the delay in my response. I've revised the master branch to fix this. It is now working on this example.
$EXE -q MAG52.fna -r MAG189.fna -o /dev/stdout --minFraction 0.5
BE_RX_R2_MAG52.fna BE_RX_R3_MAG189.fna 97.4762 228 629
$EXE -r MAG52.fna -q 189.fna -o /dev/stdout --minFraction 0.5
BE_RX_R3_MAG189.fna BE_RX_R2_MAG52.fna 98.351 232 255
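As a quick check on the numbers above, the sketch below computes the matched-fragment ratio for each direction. It relies on FastANI's documented output columns (query, reference, ANI, count of bidirectional fragment mappings, total query fragments); the helper name `matched_fraction` is hypothetical.

```python
def matched_fraction(line):
    """Ratio of mapped fragments to total query fragments for one output row."""
    query, ref, ani, matched, total = line.split()
    return int(matched) / int(total)


lines = [
    "BE_RX_R2_MAG52.fna BE_RX_R3_MAG189.fna 97.4762 228 629",
    "BE_RX_R3_MAG189.fna BE_RX_R2_MAG52.fna 98.351 232 255",
]

for line in lines:
    query = line.split()[0]
    print(f"{query}: {matched_fraction(line):.3f}")
```

The ratios come out at roughly 0.36 and 0.91. That the first row is still reported with --minFraction 0.5 is consistent with the earlier explanation that the check is made against the smaller genome rather than against the query's own fragment count, though the thread does not spell out the exact quantities used after the fix.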
Hi, would it be possible to add this fix to a release?
Done.
Hi, thanks to a report by @apcamargo at wwood/galah#7, I came across an issue with --minFraction on these fragmented genomes. They seem to align well:
But when --minFraction is used, the hit goes away. This is even though 232/255 > 0.5:
Filtering out this alignment by the minFraction seems incorrect to me. I wonder what the definition of the minFraction actually is. Is it the fraction of the total genome length, or the fraction of the genome that is long enough to be included as a fragment, or something along those lines?
Thanks, ben