Runtime of seqkit amplicon in v2.7.0 is much longer than v2.3.0 #439

Closed
fsnibs10 opened this issue Feb 18, 2024 · 4 comments

Comments

@fsnibs10

Hi developers,

Recently, I used seqkit amplicon to extract target sequences from a compressed FASTQ file by specifying primer sequences. I downloaded both the latest version (v2.7.0) and an earlier version (v2.3.0).

The download commands are shown below.
wget https://github.com/shenwei356/seqkit/releases/download/v2.3.0/seqkit_linux_amd64.tar.gz
wget https://github.com/shenwei356/seqkit/releases/download/v2.7.0/seqkit_linux_amd64.tar.gz
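
Both release archives share the same filename, so downloading them into one directory makes it easy to mix them up. A minimal sketch of keeping the two versions side by side follows. The function names and the `seqkit-<version>` directory layout are hypothetical, and it assumes each archive unpacks to a `seqkit` binary.

```shell
# Sketch only: helper names and directory layout are illustrative assumptions.

# Print the release URL for a given version tag (e.g. v2.3.0).
seqkit_url() {
    echo "https://github.com/shenwei356/seqkit/releases/download/$1/seqkit_linux_amd64.tar.gz"
}

# Download and unpack one release into its own directory, seqkit-<version>/.
fetch_seqkit() {
    version="$1"
    dir="seqkit-${version}"
    mkdir -p "$dir"
    wget -q -O "$dir/seqkit_linux_amd64.tar.gz" "$(seqkit_url "$version")"
    tar -xzf "$dir/seqkit_linux_amd64.tar.gz" -C "$dir"
}

# Usage (requires network access):
#   fetch_seqkit v2.3.0
#   fetch_seqkit v2.7.0
```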

I found that the runtime of the amplicon module in the latest version (seqkit v2.7.0) is much longer than in v2.3.0. The sequencing data file is about 550 MB and contains 5671607 reads. With the same command on the same server, seqkit amplicon v2.3.0 runs very fast, taking 20 seconds, while v2.7.0 takes about 7 minutes. I don't know why. My command is shown below.

seqkit amplicon --threads 8 -F AAGAGTGGAG -R GTTCATCC -o sample.read1.fq read1.fq.gz
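
For anyone wanting to reproduce the comparison, a rough sketch of timing the same command against both binaries follows. The helper names and the `./seqkit-<version>/seqkit` paths are hypothetical; adjust them to wherever each release was unpacked. The primers, thread count, and input file are those from the command above.

```shell
# Sketch only: helper names and binary paths are illustrative assumptions.

# Run a command and print its wall-clock time in whole seconds.
# The command's own stdout is sent to stderr so only the number is captured.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >&2
    end=$(date +%s)
    echo $((end - start))
}

# Time the amplicon command from this report with each seqkit version.
compare_versions() {
    for v in v2.3.0 v2.7.0; do
        t=$(elapsed_seconds "./seqkit-$v/seqkit" amplicon --threads 8 \
            -F AAGAGTGGAG -R GTTCATCC -o "sample.read1.$v.fq" read1.fq.gz)
        echo "$v: ${t}s"
    done
}

# Usage: compare_versions
```

Writing each version's output to its own file also makes it possible to check afterwards that both versions extracted the same reads.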

shenwei356 added a commit that referenced this issue Feb 19, 2024

@shenwei356
Owner

Thanks for reporting this. This bug was introduced in v2.7.0. It's fixed now.

seqkit_linux_amd64.tar.gz

Besides, after checking the code, I think I can make it faster.

@shenwei356
Owner

Use this, it's slightly faster.

@fsnibs10
Author

Thanks! I have tested this improved version with the same dataset. It takes about 20 seconds, very fast.

shenwei356 added a commit that referenced this issue Apr 25, 2024
… primers are given, only the last one is used. #457. introduced in fixing #439
@shenwei356
Owner

@fsnibs10 Sorry, the previous changes introduced a bug; see #457. It occurred when more than 2 pairs of primers were given.
