Error in DefenseFinder #1

Open
mmolari opened this issue Jan 23, 2024 · 3 comments

mmolari commented Jan 23, 2024

Hi!

First of all, thank you for putting together such a nice pipeline! It's very convenient to have all of these tools in one place.

I installed BEAV with conda, following the instructions in the README. I downloaded the light version of the database and then ran BEAV with the command:

beav \
    --input ~/ownCloud/neherlab/code/pangenome-evo/data/fa/NZ_CP124487.1.fa \
    --output test \
    --threads 4 \
    --skip_tiger \
    --skip_gapmind \
    --skip_dbscan-swa \
    --skip_antismash \
    --bakta_arguments '--db ~/miniconda3/envs/beav/db/db-light'

The first issue I encountered is with DefenseFinder. From the BEAV log file:

Identifying defense systems (DefenseFinder)

Error: error occurred while running DefenseFinder. Please see defensefinder.log
Elapsed: 0hrs 0min 1sec
cut: ./NZ_CP124487.1.fa_defense_finder_genes.tsv: No such file or directory
Here is the DefenseFinder.log:
2024-01-23 11:00:41 | INFO     | Received file ./bakta/NZ_CP124487.1.fa.faa
2024-01-23 11:00:41 | WARNING  | Out directory /home/marco/ownCloud/neherlab/code/pangenome-evo/exploration/2401c_beav/test/NZ_CP124487.1.fa already exists. Existing DefenseFinder output will be overwritten
2024-01-23 11:00:41 | INFO     | Running DefenseFinder
Traceback (most recent call last):
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/profile.py", line 70, in get_profile
    path = model_location.get_profile(gene.name)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/registries.py", line 344, in get_profile
    return self._profiles[name]
KeyError: 'Rst_Hydrolase-Tm__Hydrolase-Tm'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/marco/miniconda3/envs/beav/bin/defense-finder", line 10, in <module>
    sys.exit(cli())
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/defense_finder_cli/main.py", line 143, in run
    defense_finder.run(protein_file_name, dbtype, workers, coverage, tmp_dir, models_dir, no_cut_ga, loglevel)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/defense_finder/__init__.py", line 29, in run
    macsyfinder.main(args=msf_cmd)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/scripts/macsyfinder.py", line 1193, in main
    all_systems, rejected_candidates = search_systems(config, model_registry, models_def_to_detect, logger)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/scripts/macsyfinder.py", line 529, in search_systems
    parser.parse(models_def_to_detect)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/definition_parser.py", line 85, in parse
    self._fill_gene_bank(model_node, model_location, def_loc)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/definition_parser.py", line 287, in _fill_gene_bank
    self.gene_bank.add_new_gene(model_location, gene_name, self.profile_factory)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/gene.py", line 102, in add_new_gene
    gene = CoreGene(model_location, name, profile_factory)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/gene.py", line 114, in __init__
    self._profile = profile_factory.get_profile(self, model_location)
  File "/home/marco/miniconda3/envs/beav/lib/python3.9/site-packages/macsypy/profile.py", line 72, in get_profile
    raise MacsypyError(f"'{model_location.name}/{gene.name}': No such profile")
macsypy.error.MacsypyError: 'defense-finder-models/Rst_Hydrolase-Tm__Hydrolase-Tm': No such profile

I believe this is a known issue and was already raised here. I just wanted to bring it up here as well so that once it is fixed you can update the version as well, or have a temporary fix in the meantime.
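
To confirm that a failed run hit this particular error rather than some other DefenseFinder problem, the log can be searched for the MacSyFinder message. The following is only a minimal sketch: the helper names `has_missing_profile_error` and `missing_profiles` are made up for illustration, and the log path in the usage comment is a placeholder, since the exact location of defensefinder.log depends on the output directory.

```shell
# Hypothetical helpers (not part of BEAV or DefenseFinder).

# Succeeds (exit status 0) if the log contains the "No such profile" error.
has_missing_profile_error() {
    grep -q "MacsypyError: .*No such profile" "$1"
}

# Prints each profile reported as missing, one per line.
missing_profiles() {
    sed -n "s/^macsypy\.error\.MacsypyError: '\(.*\)': No such profile.*/\1/p" "$1"
}

# Usage (placeholder path; point it at your own defensefinder.log):
#   if has_missing_profile_error path/to/defensefinder.log; then
#       missing_profiles path/to/defensefinder.log
#   fi
```

If the first helper succeeds, the failure is the missing-profile error from the traceback above; any other traceback would point to a different problem.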

For completeness, here is the full output of the BEAV command:
BEAV version 1.0.0

--input /home/marco/ownCloud/neherlab/code/pangenome-evo/data/fa/NZ_CP124487.1.fa --output test --threads 4 --skip_tiger --skip_gapmind --skip_dbscan-swa --skip_antismash --bakta_arguments --db /home/marco/miniconda3/envs/beav/db/db-light


Checking prerequisites:
----------------------------------------------------------
Bakta: OK
antiSMASH: skipped
MacSyFinder: OK
IntegronFinder: OK
DefenseFinder: OK
TIGER2: skipped
GapMind: skipped
DBSCAN-SWA: skipped
----------------------------------------------------------

Running Bakta

Elapsed: 0hrs 8min 53sec

Done

----------------------------------------------------------

Annotation of other sequence elements

cut: ./borders/NZ_CP124487.1.fa.virbox: No such file or directory
cut: ./borders/NZ_CP124487.1.fa.trabox: No such file or directory
Elapsed: 0hrs 0min 0sec

Done

----------------------------------------------------------

Indentifying oriT

Elapsed: 0hrs 0min 5sec

Done

----------------------------------------------------------

Identifying secretion systems (MacSyFinder)

Elapsed: 0hrs 0min 5sec

Done

----------------------------------------------------------

Identifying integrons (IntegronFinder)

Elapsed: 0hrs 0min 14sec

Done

----------------------------------------------------------

Identifying defense systems (DefenseFinder)

Error: error occurred while running DefenseFinder. Please see defensefinder.log
Elapsed: 0hrs 0min 1sec
cut: ./NZ_CP124487.1.fa_defense_finder_genes.tsv: No such file or directory

Done

----------------------------------------------------------

Identifying biosynthetic gene clusters (antiSMASH)


Skipped

----------------------------------------------------------

Identifying phage (DBSCAN-SWA)


Skipped

----------------------------------------------------------

Characterizing amino acid biosynthesis and small carbon metabolite catabolism (GapMind)


Skipped

----------------------------------------------------------

Identifying integrative conjugative elements [ICEs] (TIGER2)


Skipped

----------------------------------------------------------

Combining annotations and preparing final output files
tee: NZ_CP124487.1.fa/logs/Beav.log: No such file or directory

Elapsed: 0hrs 0min 46sec

Final annotation output: NZ_CP124487.1.fa_final.gbk

----------------------------------------------------------

Creating Circos Map

ls: cannot access 'test/NZ_CP124487.1.fa/*_final.gbk': No such file or directory
cat: 'test/NZ_CP124487.1.fa/*oncogenic_plasmid_final.out.contiglist': No such file or directory
python3 beav_circos.py --input 
usage: beav_circos.py [-h] --input INPUT [--contigs [CONTIGS ...]] [--plasmid PLASMID]
beav_circos.py: error: argument --input/-i: expected one argument
Elapsed: 0hrs 0min 1sec

Done
mv: cannot stat 'NZ_CP124487.1.fa.circos.png': No such file or directory
mv: cannot stat 'NZ_CP124487.1.fa.circos.pdf': No such file or directory
mv: cannot stat 'NZ_CP124487.1.fa.oncogenes.png': No such file or directory
mv: cannot stat 'NZ_CP124487.1.fa.oncogenes.pdf': No such file or directory

----------------------------------------------------------
Summary of annotations

Secretion_Systems      Defense_Systems Phages  Biosynthetic_gene_clusters      ICEs    Integrons
/home/marco/miniconda3/envs/beav/bin/beav: line 1063: N/A: No such file or directory
6   N/A      N/A       N/A    N/A  0

Small carbon catabolism pathways: 

Done

----------------------------------------------------------

The BEAV pipeline automates the use of a number of published software tools.
If you use these results in a publication, please include the following in your methods section and cite:

Jung J, Rahman A, Schiffer A, and Weisberg A. 2023. BEAV: a bacterial genome and mobile element annotation pipeline. https://github.com/weisberglab/beav

grep: test/NZ_CP124487.1.fa/logs/bakta.log: No such file or directory
Bakta version 
Schwengers O, Jelonek L, Dieckmann MA, et al. 2021. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb Genom 7: 000685.

EMBOSS:fuzznuc
EMBOSS: The European Molecular Biology Open Software Suite (2000) Rice,P. Longden,I. and Bleasby,A. Trends in Genetics 16, (6) pp276--277 
head: cannot open 'test/NZ_CP124487.1.fa/MacSyFinder_TXSS/macsyfinder.log' for reading: No such file or directory
MacSyFinder version 
Néron, Bertrand; Denise, Rémi; Coluzzi, Charles; Touchon, Marie; Rocha, Eduardo P.C.; Abby, SophieS 2023. MacSyFinder v2: Improved modelling and search engine to identify molecular systems igenomes. Peer Community Journal, Volume 3, article no. e28. DOI: 10.24072/pcjournal.250.



DefenseFinder
Tesson F., Hervé A. , Touchon M., d’Humières C., Cury J., Bernheim A. Systematic and quantitative view of the antiviral arsenal of prokaryotes bioRx

grep: test/NZ_CP124487.1.fa/Integron_Finder/Results_Integron_Finder_NZ_CP124487.1.fa/integron_finder.out: No such file or directory
IntegronFinder version 
Néron B, Littner E, Haudiquet M, et al. 2022. IntegronFinder 2.0: Identification and Analysis of Integrons across Bacteria, with a Focus on Antibiotic Resistance in Klebsiella. Microorganisms 10: 700.

Thanks again!

Marco

alexweisberg commented

Dear Marco,
Thank you for bringing this to our attention. Yes, unfortunately it is a bug in MacSyFinder that was uncovered by new DefenseFinder models. Until they update MacSyFinder with the fix on conda, the DefenseFinder component of the pipeline won't work.

Until it is fixed, you could run Beav with --skip_defensefinder to skip running DefenseFinder and it should run the rest of the pipeline.

Alternatively, you could download the updated file from the MacSyFinder commit (gem-pasteur/macsyfinder@27ee21c) and copy it over the corresponding file in the macsypy folder of your conda environment. Only the registries.py file needs to be replaced for it to work.

To do so, with your conda environment activated:

Get your Python version: python --version

Mine is Python 3.9, so substitute your own version in the following cp command:

wget https://raw.githubusercontent.com/gem-pasteur/macsyfinder/27ee21ceb8e7100d9183b084356f791487aca4ad/macsypy/registries.py

cp registries.py $CONDA_PREFIX/lib/python3.9/site-packages/macsypy/
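
The same steps can be done without hard-coding the Python version and with a backup of the original file, so the change is easy to undo once the fixed MacSyFinder reaches conda. This is only a sketch: `macsypy_dir` and `install_with_backup` are hypothetical helper names, and it assumes that `python` on the PATH is the interpreter of the activated beav environment. The download in the usage comment uses the raw.githubusercontent.com form of the commit URL, since that form serves the file contents directly.

```shell
# Hypothetical helpers (not part of BEAV or MacSyFinder).

# Print the directory of the installed macsypy package, as seen by the
# python on PATH (assumed to be the one from the activated conda env).
macsypy_dir() {
    python -c 'import os, macsypy; print(os.path.dirname(macsypy.__file__))'
}

# Copy file $1 into directory $2. If a file of the same name already exists
# there and no <name>.bak exists yet, save it as <name>.bak first, so the
# pristine original survives repeated runs.
install_with_backup() {
    iwb_name="$(basename "$1")"
    if [ -f "$2/$iwb_name" ] && [ ! -f "$2/$iwb_name.bak" ]; then
        cp "$2/$iwb_name" "$2/$iwb_name.bak" || return 1
    fi
    cp "$1" "$2/$iwb_name"
}

# Usage, with the conda environment activated:
#   wget https://raw.githubusercontent.com/gem-pasteur/macsyfinder/27ee21ceb8e7100d9183b084356f791487aca4ad/macsypy/registries.py
#   install_with_backup registries.py "$(macsypy_dir)"
# To revert later:
#   mv "$(macsypy_dir)/registries.py.bak" "$(macsypy_dir)/registries.py"
```

Updating or reinstalling the MacSyFinder conda package later should replace the patched file with the packaged version, at which point the backup is no longer needed.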

mmolari commented Jan 23, 2024

Thank you for the quick answer!

alexweisberg commented Jan 23, 2024 via email
