How to detect presence or absence of a sequence carrying a deletion #184

Closed
bioprojects opened this issue Jul 17, 2017 · 2 comments

@bioprojects

bioprojects commented Jul 17, 2017

Dear Martin,

Thank you so much again for explaining ARIBA
during the ABPHM2017 conference at Sanger in May.

I do think ARIBA is a really nice, well-designed,
generally useful tool.

I'm trying to detect the presence or absence of a promoter
sequence carrying a deletion associated with AMR.

When I used the sequence carrying the deletion as the reference,
the ARIBA report for a sample that does not carry the deletion is
http://yahara.hustle.ne.jp/projects/ariba/report.tsv
in which column T (ref_ctg_effect) correctly shows "INS".

However, the ARIBA summary for this report
http://yahara.hustle.ne.jp/projects/ariba/out.summary.csv
says "mtrR_Adel_promoter.match" is yes.

But this sample does not have the deletion, so I would like the
summary to report "mtrR_Adel_promoter.match" as no.

Of course, I have read the definition of "match" used by the summary command
https://github.com/sanger-pathogens/ariba/wiki/Task:-summary

Is ARIBA not designed to detect this kind of deletion?
I would appreciate your comments.
Thanks a lot in advance.

Best wishes,

Koji


Koji Yahara
Senior Research Fellow
Antimicrobial Resistance Research Center
National Institute of Infectious Diseases
4-2-1 Aobacho, Higashimurayama, Tokyo
189-0002 Japan
Tel: +81-42-561-0771 (Ex. 3539)

@martinghunt
Collaborator

ARIBA does detect it, in the sense that it is reported in report.tsv, which is where you need to look to be sure exactly what happened. The summary can't catch every case: while ARIBA was being developed, different users wanted different, incompatible behaviours for what summary counts as a "match". This is the reason for the extra columns. You could use the pct_id or novel_var columns from summary to identify samples that do not match perfectly. Ultimately, every use case is different, and I'm sorry, but it's impossible to get ariba summary to behave in a way that works for everyone.
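
As a rough illustration of checking report.tsv directly, here is a minimal Python sketch, not part of ARIBA itself. It assumes the report is tab-separated with a header row, that `ref_ctg_effect` is the column mentioned above, and that the reference name lives in a column called `ref_name` (an assumption; adjust `name_column` to match your file). The helper `carries_deletion` is hypothetical. Its logic: when the reference is the deletion allele, an "INS" in the sample relative to that reference suggests the sample does not carry the deletion.

```python
import csv
import io


def carries_deletion(report_text, ref_prefix,
                     name_column="ref_name",          # assumed column name
                     effect_column="ref_ctg_effect"):
    """Classify one sample against a deletion-carrying reference.

    Returns None if the reference has no rows in the report,
    False if any of its rows reports an insertion ("INS") relative to
    the reference, and True otherwise.
    """
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    # Keep only the rows that belong to the reference of interest.
    rows = [row for row in reader
            if (row.get(name_column) or "").startswith(ref_prefix)]
    if not rows:
        return None  # reference not reported for this sample
    # An insertion relative to the deletion allele means the deletion is absent.
    return not any(row.get(effect_column) == "INS" for row in rows)


# Tiny made-up reports, only to show the call; real reports have many more columns.
with_insertion = "ref_name\tref_ctg_effect\nmtrR_Adel_promoter\tINS\n"
without_insertion = "ref_name\tref_ctg_effect\nmtrR_Adel_promoter\t.\n"

print(carries_deletion(with_insertion, "mtrR_Adel_promoter"))
print(carries_deletion(without_insertion, "mtrR_Adel_promoter"))
```

For a real run, read the file contents first (for example `open("report.tsv").read()`) and pass the text in. Whether "no INS row" is enough to call the deletion present depends on your data, so treat the result as a starting point and inspect the report rows themselves.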

@bioprojects
Author

Thank you very much for your answer. OK, I understand now!
