Prerequisites
seqkit version
Describe your issue
I have a FASTQ file from which I would like to subsample very specific sequences by ID. Additionally, some sequences should be subsampled multiple times. However, `seqkit grep` with `--pattern-file` extracts each pattern only once:
`seqkit grep -f id_list.txt mock.fq`
[INFO] 3 patterns loaded from file
The file contains 4 patterns: 2 that occur once and 1 that occurs twice, so only 3 distinct patterns are loaded. Would it be possible to add a parameter so that the patterns are not reduced to the unique set?
In contrast, `seqtk` does not extract only unique IDs:
`seqtk subseq mock.fq id_list.txt`
All 4 patterns are used, so the output contains 4 sequences.
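To make the requested behaviour concrete, here is a minimal Python sketch of an ID lookup that keeps duplicates. It is only an illustration of the desired output, not how seqkit or seqtk work internally. The record contents and the IDs `seq1` to `seq3` are made up for the example, and the parser assumes plain 4-line FASTQ records and loads the whole file into memory.

```python
import io


def read_fastq(handle):
    """Yield (id, record_text) pairs, assuming strict 4-line FASTQ records."""
    while True:
        header = handle.readline()
        if not header:            # end of file
            break
        if not header.strip():    # skip stray blank lines between records
            continue
        lines = [header.rstrip("\n")]
        lines += [handle.readline().rstrip("\n") for _ in range(3)]
        # The ID is the first whitespace-delimited token after the leading '@'.
        seq_id = lines[0][1:].split()[0]
        yield seq_id, "\n".join(lines) + "\n"


def subsample_with_duplicates(fastq_handle, ids):
    """Return one record per entry in `ids`, repeating records for repeated IDs."""
    records = dict(read_fastq(fastq_handle))
    # Iterate over the ID list (not the set of IDs), so duplicates are preserved.
    return [records[i] for i in ids if i in records]


# Hypothetical stand-ins for mock.fq and id_list.txt.
mock_fq = (
    "@seq1\nACGT\n+\nIIII\n"
    "@seq2\nGGCC\n+\nIIII\n"
    "@seq3\nTTAA\n+\nIIII\n"
)
id_list = ["seq1", "seq2", "seq2", "seq3"]

picked = subsample_with_duplicates(io.StringIO(mock_fq), id_list)
print("".join(picked), end="")
```

With this approach the output follows the order of the ID list rather than the order of the FASTQ file, and an ID listed twice produces two copies of its record.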
mock.fq
id_list.txt