Proposal - Add --traverse-directory to sourmash compare #406

Closed
brooksph opened this issue Feb 19, 2018 · 3 comments

Comments

@brooksph
Contributor

brooksph commented Feb 19, 2018

I'm running sourmash compare with a large number of signatures. I'm able to make the comparison using 'sourmash compare *sig', but the same method does not work with docker. For example, the following command docker run -v ${PWD}:/data quay.io/biocontainers/sourmash:2.0.0a3--py36_0 /data/*sig -k 51 returns

[Errno 2] No such file or directory: '/data/*sig

warning: no signatures loaded at given ksize/molecule type from /data/*sig
loaded 0 signatures total.                                                     
cannot mix scaled signatures with bounded signatures
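
A possible reason the literal '/data/*sig' shows up in the error above, assuming the command is typed in a host shell such as bash with default settings: the shell tries to expand the pattern before docker starts, against the host filesystem. If nothing on the host matches /data/*sig, the pattern is passed through unchanged, and nothing inside the container expands it afterwards. The sketch below only emulates that behaviour in Python; `shell_like_expand` is a hypothetical helper, not part of sourmash or docker.

```python
import glob


def shell_like_expand(pattern):
    """Emulate bash's default globbing (an assumption about the host shell).

    Returns the sorted matches if the pattern matches anything,
    otherwise the pattern itself, unchanged, as a one-element list.
    """
    matches = sorted(glob.glob(pattern))
    return matches if matches else [pattern]


if __name__ == "__main__":
    # On a host without matching files under /data, this prints the
    # literal pattern, which is what the container would then receive.
    print(shell_like_expand("/data/*sig"))
```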

To confirm that this is not an issue with the container, I ran docker run -v ${PWD}:/data quay.io/biocontainers/sourmash:2.0.0a3--py36_0 sourmash compare /data/example1.scaled10k.k51.sig /data/example2.sig -k 51 which yielded

0-HKVWJBCXY170605...	[1. 0.]
1-CAVL1ANXX170419...	[0.712 1.   ]
min similarity in matrix: 0.712
loaded 2 signatures total.                                                     
downsampling to scaled value of 1000

Proposal - I think adding '--traverse-directory' to sourmash compare will resolve this issue, because sourmash would then know to look for signatures in the specified directory without being given a pattern. It would be run in the following way: docker run -v ${PWD}:/data quay.io/biocontainers/sourmash:2.0.0a3--py36_0 /data/ --traverse-directory -k 51

I started modifying the code on branch 'add/compare_traverse_directory' but need some help completing the pull request. Right now, 'sourmash compare *sig --traverse-directory' yields 'sourmash: error: unrecognized arguments: --traverse-directory' despite my changes to commands.py.
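
To illustrate the idea behind the proposal, here is a minimal sketch of collecting signature files from a directory without relying on shell globbing. It is not the code on the branch or sourmash's implementation; the function name `collect_signature_files` and the '.sig' suffix filter are assumptions made for this example.

```python
import os


def collect_signature_files(directory, suffix=".sig"):
    """Walk `directory` recursively and return paths ending in `suffix`.

    Hypothetical helper: the name and the suffix filter are illustrative
    and are not taken from sourmash.
    """
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(suffix):
                found.append(os.path.join(root, name))
    # Sort so the order is stable across runs and platforms.
    return sorted(found)


if __name__ == "__main__":
    for path in collect_signature_files("/data"):
        print(path)
```

Because the directory is walked inside the process, the same call behaves the same way inside a container, where the path refers to the mounted volume.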

@ctb
Contributor

ctb commented Feb 19, 2018

that new option sounds good!

two thoughts --

first, the error you're encountering is this: 'cannot mix scaled signatures with bounded signatures'. So somewhere in that directory you have some signatures that weren't computed with --scaled.

second thought - the code on branch add/compare_traverse_directory does add the --traverse-directory option for me so I suspect you are not running your modified code. How are you running sourmash after modifying it?
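
One way to check which copy of a package Python actually imports is to ask the import system where it would load the module from. This is a general sketch using the standard library, not something taken from the thread; `module_location` is a hypothetical helper name. If the printed path points into site-packages rather than the modified checkout, the modified code is not the one being run.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be loaded from, or None.

    Uses importlib.util.find_spec, which locates the module without
    importing it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Prints None if sourmash is not installed in this environment.
    print(module_location("sourmash"))
```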

@olgabot
Collaborator

olgabot commented Jun 15, 2018

should this issue be closed now that the PR #412 is merged?

@luizirber
Member

thanks @olgabot =]
