get_ref card is attempting to download RGI #301

Closed
rpetit3 opened this issue Aug 12, 2020 · 5 comments

rpetit3 commented Aug 12, 2020

Hello!

Looks like there have been some changes to the CARD download URL, and they are causing RGI v5.1.0 to be downloaded instead of CARD.

 ariba getref card card
Getting available CARD versions
Downloading "https://card.mcmaster.ca/download" and saving as "download.html" ... done
Found versions:
1.0.0   https://card.mcmaster.ca/download/0/broadstreet-v1.0.0.tar.bz2
1.0.1   https://card.mcmaster.ca/download/0/broadstreet-v1.0.1.tar.bz2
1.0.2   https://card.mcmaster.ca/download/0/broadstreet-v1.0.2.tar.bz2
1.0.3   https://card.mcmaster.ca/download/0/broadstreet-v1.0.3.tar.bz2
1.0.4   https://card.mcmaster.ca/download/0/broadstreet-v1.0.4.tar.bz2
1.0.5   https://card.mcmaster.ca/download/0/broadstreet-v1.0.5.tar.bz2
1.0.6   https://card.mcmaster.ca/download/0/broadstreet-v1.0.6.tar.bz2
1.0.7   https://card.mcmaster.ca/download/0/broadstreet-v1.0.7.tar.bz2
1.0.8   https://card.mcmaster.ca/download/0/broadstreet-v1.0.8.tar.bz2
1.0.9   https://card.mcmaster.ca/download/0/broadstreet-v1.0.9.tar.bz2
1.1.0   https://card.mcmaster.ca/download/0/broadstreet-v1.1.0.tar.bz2
1.1.1   https://card.mcmaster.ca/download/0/broadstreet-v1.1.1.tar.bz2
1.1.2   https://card.mcmaster.ca/download/0/broadstreet-v1.1.2.tar.bz2
1.1.3   https://card.mcmaster.ca/download/0/broadstreet-v1.1.3.tar.bz2
1.1.4   https://card.mcmaster.ca/download/0/broadstreet-v1.1.4.tar.bz2
1.1.5   https://card.mcmaster.ca/download/0/broadstreet-v1.1.5.tar.bz2
1.1.6   https://card.mcmaster.ca/download/0/broadstreet-v1.1.6.tar.bz2
1.1.7   https://card.mcmaster.ca/download/0/broadstreet-v1.1.7.tar.bz2
1.1.8   https://card.mcmaster.ca/download/0/broadstreet-v1.1.8.tar.bz2
1.1.9   https://card.mcmaster.ca/download/0/broadstreet-v1.1.9.tar.bz2
1.2.0   https://card.mcmaster.ca/download/0/broadstreet-v1.2.0.tar.bz2
1.2.1   https://card.mcmaster.ca/download/0/broadstreet-v1.2.1.tar.bz2
2.0.0   https://card.mcmaster.ca/download/0/broadstreet-v2.0.0.tar.gz
2.0.1   https://card.mcmaster.ca/download/0/broadstreet-v2.0.1.tar.gz
2.0.2   https://card.mcmaster.ca/download/0/broadstreet-v2.0.2.tar.gz
2.0.3   https://card.mcmaster.ca/download/0/broadstreet-v2.0.3.tar.gz
3.0.0   https://card.mcmaster.ca/download/0/broadstreet-v3.0.0.tar.gz
3.0.1   https://card.mcmaster.ca/download/0/broadstreet-v3.0.1.tar.gz
3.0.2   https://card.mcmaster.ca/download/0/broadstreet-v3.0.2.tar.gz
3.0.3   https://card.mcmaster.ca/download/0/broadstreet-v3.0.3.tar.gz
3.0.4   https://card.mcmaster.ca/download/0/broadstreet-v3.0.4.tar.gz
3.0.5   https://card.mcmaster.ca/download/0/broadstreet-v3.0.5.tar.gz
3.0.6   https://card.mcmaster.ca/download/0/broadstreet-v3.0.6.tar.gz
3.0.7   https://card.mcmaster.ca/download/0/broadstreet-v3.0.7.tar.gz
3.0.8   https://card.mcmaster.ca/download/0/broadstreet-v3.0.8.tar.bz2
3.0.9   https://card.mcmaster.ca/download/0/broadstreet-v3.0.9.tar.bz2
3.1.0   https://card.mcmaster.ca/download/0/broadstreet-v3.1.0.tar.bz2
5.1.0   https://card.mcmaster.ca/download/1/software-v5.1.1.tar.bz2">DOWNLOAD</a></td></tr><div id="other-software" class="more collapse"><tr class="more-software collapse"><td>RGI Software</td><td class="hidden-xs">RGI version 5.1.0 - added beta version for K-mer taxonomic classifiers (rgi kmer_build and rgi kmer_query), updated the code to accept a broader range of nucleotide redundancy codes, and set biopython to version 1.72 due to bug on biopython version 1.74</td><td class="hidden-xs">5.1.0</td><td class="hidden-xs">TAR</td><td class="hidden-xs">2019-08-22 09:43:12.5152</td><td><a href="/download/1/software-v5.1.0.tar.gz
Getting version 5.1.0
Working in temporary directory /home/rpetit/test-grounds/training_set/card.download
Downloading data from card: https://card.mcmaster.ca/download/1/software-v5.1.1.tar.bz2">DOWNLOAD</a></td></tr><div id="other-software" class="more collapse"><tr class="more-software collapse"><td>RGI Software</td><td class="hidden-xs">RGI version 5.1.0 - added beta version for K-mer taxonomic classifiers (rgi kmer_build and rgi kmer_query), updated the code to accept a broader range of nucleotide redundancy codes, and set biopython to version 1.72 due to bug on biopython version 1.74</td><td class="hidden-xs">5.1.0</td><td class="hidden-xs">TAR</td><td class="hidden-xs">2019-08-22 09:43:12.5152</td><td><a href="/download/1/software-v5.1.0.tar.gz
syscall: wget -O card.tar.bz2 https://card.mcmaster.ca/download/1/software-v5.1.1.tar.bz2">DOWNLOAD</a></td></tr><div id="other-software" class="more collapse"><tr class="more-software collapse"><td>RGI Software</td><td class="hidden-xs">RGI version 5.1.0 - added beta version for K-mer taxonomic classifiers (rgi kmer_build and rgi kmer_query), updated the code to accept a broader range of nucleotide redundancy codes, and set biopython to version 1.72 due to bug on biopython version 1.74</td><td class="hidden-xs">5.1.0</td><td class="hidden-xs">TAR</td><td class="hidden-xs">2019-08-22 09:43:12.5152</td><td><a href="/download/1/software-v5.1.0.tar.gz
The following command failed with exit code 4
wget -O card.tar.bz2 https://card.mcmaster.ca/download/1/software-v5.1.1.tar.bz2">DOWNLOAD</a></td></tr><div id="other-software" class="more collapse"><tr class="more-software collapse"><td>RGI Software</td><td class="hidden-xs">RGI version 5.1.0 - added beta version for K-mer taxonomic classifiers (rgi kmer_build and rgi kmer_query), updated the code to accept a broader range of nucleotide redundancy codes, and set biopython to version 1.72 due to bug on biopython version 1.74</td><td class="hidden-xs">5.1.0</td><td class="hidden-xs">TAR</td><td class="hidden-xs">2019-08-22 09:43:12.5152</td><td><a href="/download/1/software-v5.1.0.tar.gz

The output was:

--2020-08-12 13:28:20--  https://card.mcmaster.ca/download/1/software-v5.1.1.tar.bz2%3EDOWNLOAD%3C/a%3E%3C/td%3E%3C/tr%3E%3Cdiv%20id=other-software%20class=more
Resolving card.mcmaster.ca (card.mcmaster.ca)... 130.113.77.126
Connecting to card.mcmaster.ca (card.mcmaster.ca)|130.113.77.126|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2020-08-12 13:28:20 ERROR 404: Not Found.

--2020-08-12 13:28:20--  http://collapse%3E%3Ctr%20class=more-software/
Resolving collapse><tr class=more-software (collapse><tr class=more-software)... failed: Name or service not known.
wget: unable to resolve host address ‘collapse><tr class=more-software’
--2020-08-12 13:28:20--  http://collapse%3E%3Ctd%3Ergi%20software%3C/td%3E%3Ctd%20class=hidden-xs%3ERGI%20version%205.1.0%20-%20added%20beta%20version%20for%20K-mer%20taxonomic%20classifiers%20(rgi%20kmer_build%C2%A0and%C2%A0rgi%20kmer_query),%20updated%20the%20code%20to%20accept%20a%20broader%20range%20of%20nucleotide%20redundancy%20codes,%20and%20set%20biopython%20to%20version%201.72%20due%20to%20bug%20on%20biopython%20version%201.74%3C/td%3E%3Ctd%20class=hidden-xs%3E5.1.0%3C/td%3E%3Ctd%20class=hidden-xs%3ETAR%3C/td%3E%3Ctd%20class=hidden-xs%3E2019-08-22%2009:43:12.5152%3C/td%3E%3Ctd%3E%3Ca%20href=/download/1/software-v5.1.0.tar.gz
Resolving collapse><td>rgi software< (collapse><td>rgi software<)... failed: Name or service not known.
wget: unable to resolve host address ‘collapse><td>rgi software<’

I'll work on a PR to get this fixed.

Thanks!
Robert

rpetit3 commented Aug 12, 2020

Dug into this further.

The regex r'''href="(/download/.*?broad.*?v([0-9]+\.[0-9]+\.[0-9]+)\.tar\.(gz|bz2))"''' is matching this:

<a href="/download/1/software-v5.1.1.tar.bz2">DOWNLOAD</a></td></tr><div id="other-software" class="more collapse"><tr class="more-software collapse"><td>RGI Software</td><td class="hidden-xs">RGI version 5.1.0 - added beta version for K-mer taxonomic classifiers (rgi kmer_build and rgi kmer_query), updated the code to accept a broader range of nucleotide redundancy codes, and set biopython to version 1.72 due to bug on biopython version 1.74</td><td class="hidden-xs">5.1.0</td><td class="hidden-xs">TAR</td><td class="hidden-xs">2019-08-22 09:43:12.5152</td><td><a href="/download/1/software-v5.1.0.tar.gz">

because "broad" in the regex matches the word "broader" in the RGI description: "updated the code to accept a broader range of nucleotide redundancy codes"

One solution is to restrict the path in the regex: r'''href="(/download/0/.*?broad.*?v([0-9]+\.[0-9]+\.[0-9]+)\.tar\.(gz|bz2))"'''
Another is to require the full name: r'''href="(/download/.*?broadstreet.*?v([0-9]+\.[0-9]+\.[0-9]+)\.tar\.(gz|bz2))"'''
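
To make the mismatch easier to reproduce, here is a minimal sketch that runs the original regex and the two proposed ones against the HTML fragment quoted above. The broadstreet link placed in front of the fragment, the dictionary keys, and the "strict" pattern are assumptions added for illustration; they are not taken from the ARIBA code, and the sketch does not claim to be the fix that was merged.

```python
import re

# Shared tail of every pattern in this thread: version number plus extension.
VERSION = r'v([0-9]+\.[0-9]+\.[0-9]+)\.tar\.(gz|bz2)'

patterns = {
    # Regex quoted in this issue.
    "original": r'href="(/download/.*?broad.*?' + VERSION + r')"',
    # Proposed fix 1: restrict the path to /download/0/.
    "fix_path": r'href="(/download/0/.*?broad.*?' + VERSION + r')"',
    # Proposed fix 2: require the full word "broadstreet".
    "fix_name": r'href="(/download/.*?broadstreet.*?' + VERSION + r')"',
    # Hypothetical stricter variant (not from the issue): [^"]* cannot
    # run past the closing quote of the href attribute.
    "strict": r'href="(/download/[^"]*broadstreet[^"]*' + VERSION + r')"',
}

# One genuine data link (assumed to come first, for comparison) followed by
# the RGI software fragment copied from this issue.
page = (
    '<a href="/download/0/broadstreet-v3.1.0.tar.bz2">DOWNLOAD</a></td></tr>'
    '<a href="/download/1/software-v5.1.1.tar.bz2">DOWNLOAD</a></td></tr>'
    '<div id="other-software" class="more collapse">'
    '<tr class="more-software collapse"><td>RGI Software</td>'
    '<td class="hidden-xs">RGI version 5.1.0 - added beta version for K-mer '
    'taxonomic classifiers (rgi kmer_build and rgi kmer_query), updated the '
    'code to accept a broader range of nucleotide redundancy codes, and set '
    'biopython to version 1.72 due to bug on biopython version 1.74</td>'
    '<td class="hidden-xs">5.1.0</td><td class="hidden-xs">TAR</td>'
    '<td class="hidden-xs">2019-08-22 09:43:12.5152</td>'
    '<td><a href="/download/1/software-v5.1.0.tar.gz">'
)

# Each match is a (path, version, extension) tuple.
results = {name: re.findall(rx, page) for name, rx in patterns.items()}

for name, matches in results.items():
    print(name, [(version, len(path)) for path, version, _ext in matches])
```

With this input, the original regex reports an extra "5.1.0" entry whose captured path runs from the first software link through the word "broader" to the second software link, which is the garbage URL that wget was given. Both proposed fixes still use a lazy `.*?` that can cross tag boundaries, so they depend on the page layout; the `[^"]*` variant is one way to keep a match inside a single href value.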

rpetit3 closed this as completed Aug 12, 2020
rpetit3 reopened this Aug 12, 2020

rpetit3 commented Aug 12, 2020

Related to #302 and #303

@raphenya

@rpetit3 CARD changed the filename extension of both the software and data downloads from .gz to .bz2 around March 2020, which is why the regex failed. I think either of your two solutions should work.

rpetit3 commented Aug 20, 2020

This is fixed on the Bioconda side. You will need to update Ariba in your Conda environment.

conda update -c conda-forge -c bioconda ariba

You should see something like:

The following packages will be UPDATED:

  ariba                               2.14.5-py36hf0b53f7_1 --> 2.14.5-py36hf0b53f7_2

After update:

ariba getref card card
Getting available CARD versions
Downloading "https://card.mcmaster.ca/download" and saving as "download.html" ... done
Found versions:
1.0.0   https://card.mcmaster.ca/download/0/broadstreet-v1.0.0.tar.bz2
1.0.1   https://card.mcmaster.ca/download/0/broadstreet-v1.0.1.tar.bz2
1.0.2   https://card.mcmaster.ca/download/0/broadstreet-v1.0.2.tar.bz2
1.0.3   https://card.mcmaster.ca/download/0/broadstreet-v1.0.3.tar.bz2
1.0.4   https://card.mcmaster.ca/download/0/broadstreet-v1.0.4.tar.bz2
1.0.5   https://card.mcmaster.ca/download/0/broadstreet-v1.0.5.tar.bz2
1.0.6   https://card.mcmaster.ca/download/0/broadstreet-v1.0.6.tar.bz2
1.0.7   https://card.mcmaster.ca/download/0/broadstreet-v1.0.7.tar.bz2
1.0.8   https://card.mcmaster.ca/download/0/broadstreet-v1.0.8.tar.bz2
1.0.9   https://card.mcmaster.ca/download/0/broadstreet-v1.0.9.tar.bz2
1.1.0   https://card.mcmaster.ca/download/0/broadstreet-v1.1.0.tar.bz2
1.1.1   https://card.mcmaster.ca/download/0/broadstreet-v1.1.1.tar.bz2
1.1.2   https://card.mcmaster.ca/download/0/broadstreet-v1.1.2.tar.bz2
1.1.3   https://card.mcmaster.ca/download/0/broadstreet-v1.1.3.tar.bz2
1.1.4   https://card.mcmaster.ca/download/0/broadstreet-v1.1.4.tar.bz2
1.1.5   https://card.mcmaster.ca/download/0/broadstreet-v1.1.5.tar.bz2
1.1.6   https://card.mcmaster.ca/download/0/broadstreet-v1.1.6.tar.bz2
1.1.7   https://card.mcmaster.ca/download/0/broadstreet-v1.1.7.tar.bz2
1.1.8   https://card.mcmaster.ca/download/0/broadstreet-v1.1.8.tar.bz2
1.1.9   https://card.mcmaster.ca/download/0/broadstreet-v1.1.9.tar.bz2
1.2.0   https://card.mcmaster.ca/download/0/broadstreet-v1.2.0.tar.bz2
1.2.1   https://card.mcmaster.ca/download/0/broadstreet-v1.2.1.tar.bz2
2.0.0   https://card.mcmaster.ca/download/0/broadstreet-v2.0.0.tar.gz
2.0.1   https://card.mcmaster.ca/download/0/broadstreet-v2.0.1.tar.gz
2.0.2   https://card.mcmaster.ca/download/0/broadstreet-v2.0.2.tar.gz
2.0.3   https://card.mcmaster.ca/download/0/broadstreet-v2.0.3.tar.gz
3.0.0   https://card.mcmaster.ca/download/0/broadstreet-v3.0.0.tar.gz
3.0.1   https://card.mcmaster.ca/download/0/broadstreet-v3.0.1.tar.gz
3.0.2   https://card.mcmaster.ca/download/0/broadstreet-v3.0.2.tar.gz
3.0.3   https://card.mcmaster.ca/download/0/broadstreet-v3.0.3.tar.gz
3.0.4   https://card.mcmaster.ca/download/0/broadstreet-v3.0.4.tar.gz
3.0.5   https://card.mcmaster.ca/download/0/broadstreet-v3.0.5.tar.gz
3.0.6   https://card.mcmaster.ca/download/0/broadstreet-v3.0.6.tar.gz
3.0.7   https://card.mcmaster.ca/download/0/broadstreet-v3.0.7.tar.gz
3.0.8   https://card.mcmaster.ca/download/0/broadstreet-v3.0.8.tar.bz2
3.0.9   https://card.mcmaster.ca/download/0/broadstreet-v3.0.9.tar.bz2
3.1.0   https://card.mcmaster.ca/download/0/broadstreet-v3.1.0.tar.bz2
Getting version 3.1.0
Working in temporary directory /local/home/rpetit/temp_blast/card.download
Downloading data from card: https://card.mcmaster.ca/download/0/broadstreet-v3.1.0.tar.bz2
syscall: wget -O card.tar.bz2 https://card.mcmaster.ca/download/0/broadstreet-v3.1.0.tar.bz2
...finished downloading
Extracted json data file ./card.json. Reading its contents...
Found 3022 records in the json file. Analysing...
Extracted data and written ARIBA input files

Finished. Final files are:
        /local/home/rpetit/temp_blast/card.fa
        /local/home/rpetit/temp_blast/card.tsv

You can use them with ARIBA like this:
ariba prepareref -f /local/home/rpetit/temp_blast/card.fa -m /local/home/rpetit/temp_blast/card.tsv output_directory

If you use this downloaded data, please cite:
"The Comprehensive Antibiotic Resistance Database", McArthur et al 2013, PMID: 23650175
and in your methods say that version 3.1.0 of the database was used

puethe commented Sep 7, 2020

Fixed with #302

puethe closed this as completed Sep 7, 2020