This repo contains validation code for sourmash branchwater, which supports massive-scale sequence search of genomes and metagenomes using sourmash and FracMinHash.
Also see:
- pyo3_branchwater repo
- Sourmash Branchwater Enables Lightweight Petabyte-Scale Sequence Search, Irber, Pierce-Ward, Brown, 2022.
- Benchmarking repo.
mkdir metagenomes
mkdir MAGs
cd MAGs/
sourmash sketch dna -p k=31 --name-from-first *.gz
cd ..
for i in $(cat orig-list.txt); do
    cp "/group/ctbrowngrp/irber/data/wort-data/wort-sra/sigs/$i.sig" metagenomes/
done
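The copy loop above fails quietly for any accession whose signature is missing from the source directory, so it can help to check the result before running the workflow. The following is a minimal sketch, not part of this repo: `missing_sigs` is a hypothetical helper name, and it assumes `orig-list.txt` holds one accession per line and that each signature is stored as `<accession>.sig`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repo): print every accession in a
# list file that has no matching <accession>.sig file in a directory.
missing_sigs() {
    local list_file="$1" sig_dir="$2" acc
    # Read one accession per line; the "|| [ -n ... ]" part also handles a
    # final line that lacks a trailing newline.
    while IFS= read -r acc || [ -n "$acc" ]; do
        [ -z "$acc" ] && continue    # skip blank lines
        [ -f "$sig_dir/$acc.sig" ] || printf '%s\n' "$acc"
    done < "$list_file"
}
```

After sourcing the function, `missing_sigs orig-list.txt metagenomes` should print nothing if every signature was copied, and otherwise list the accessions that still need attention.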
srun -p high2 --time=2:00:00 --nodes=1 --cpus-per-task 8 --mem 40GB
snakemake -j 8