Creating sketch database can fail with multiple processors #41
Comments
I have seen this before on some systems but haven't been able to track it down. Can I ask what type of configuration you are using (hardware/OS)?
I am running Linux version 3.2.0-98-generic (buildd@lgw01-13) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)) on a machine with 40 Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz. I am running Mash v1.1.1 and using the provided executable. This is run directly on the server (i.e., not through any scheduling system). It looks to be a race condition since it doesn't always occur.
I never figured out exactly what was going on, but I was able to reproduce it and then make it go away by tweaking how the files are opened. The patch is pushed to master if you are able to build from source to test it out.
I just ran into the same issue using the latest binary.
Fixed in v2.0. Please reopen if issues persist.
I am creating a sketch database from ~1500 genomes. These are all in individual FASTA files. When using multiple processors I often encounter the error:
ERROR: Did not find fasta records in "GCF_000755225.1_genomic.fna"
I have examined this file and it is a valid FASTA file with a single sequence. Note that the file named in the error changes when I use a different number of processors. Moreover, the sketch database builds properly if I use a single processor.
I am running Mash as follows:
mash sketch -p 23 -o ../gtdb *.fna
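To rule out malformed input before suspecting a threading problem, the manual check described above (confirming that each file really contains FASTA records) could be scripted. The following is only a minimal sketch in Python, not part of Mash: the helper names `count_fasta_records` and `files_without_records` are hypothetical, and it assumes uncompressed, plain-text FASTA files where every record starts with a `>` header line.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'.

    Assumes an uncompressed, plain-text FASTA file (an assumption of this sketch).
    """
    count = 0
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


def files_without_records(paths):
    """Return the paths (as strings) of files that contain no FASTA header line."""
    return [str(p) for p in paths if count_fasta_records(p) == 0]


if __name__ == "__main__":
    # Check every .fna file in the current directory, mirroring the *.fna glob above.
    empty = files_without_records(sorted(Path(".").glob("*.fna")))
    if empty:
        print("Files with no FASTA records:")
        for name in empty:
            print("  " + name)
    else:
        print("All .fna files contain at least one FASTA record.")
```

If this reports no empty files while the multi-processor run still fails on a different file each time, the input is unlikely to be the cause, which is consistent with the race condition suspected in this thread.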