compress/gzip: writer spend too long time #69528

Closed
sunhuaibo opened this issue Sep 19, 2024 · 2 comments

@sunhuaibo

Go version

go version go1.23.1 linux/amd64

Output of go env in your module/workspace:

...

What did you do?

I tried to compress a file (75M) with compress/gzip, but it took far too long.

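package main

import (
	"bufio"
	"compress/gzip"
	"os"
	// gzip "github.com/klauspost/pgzip"
)

func main() {
	// R22015519_R1.fastq: size 76M
	f, err := os.Open("R22015519_R1.fastq")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	fw, err := os.Create("R22015519_R1.fastq.gz")
	if err != nil {
		panic(err)
	}
	defer fw.Close()

	w := gzip.NewWriter(fw)
	// Close flushes pending data and writes the gzip footer,
	// so a separate deferred Flush is not needed.
	defer w.Close()

	// Copy the input line by line, restoring the newline that Scanner strips.
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if _, err := w.Write(sc.Bytes()); err != nil {
			panic(err)
		}
		if _, err := w.Write([]byte("\n")); err != nil {
			panic(err)
		}
	}
	// Scan returns false on both EOF and failure, so check which one it was.
	if err := sc.Err(); err != nil {
		panic(err)
	}
}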

What did you see happen?

With compress/gzip, the program takes roughly 130 times longer in wall-clock time than with pgzip (github.com/klauspost/pgzip), and uses about 18 times the user CPU time.

Using compress/gzip:

real	2m39.768s
user	0m20.957s
sys	0m1.569s

Using github.com/klauspost/pgzip:

real	0m1.222s
user	0m1.161s
sys	0m0.098s

What did you expect to see?

NULL

@seankhliao
Member

As the documentation for pgzip mentions, it starts multiple goroutines to run compression in parallel. This is something we are unlikely to do in the standard compress/gzip.

@seankhliao closed this as not planned on Sep 19, 2024