
gzip: fix memory allocs (buffers not returned to pool) #28

Merged (6 commits from fix-memory-allocs into klauspost:master, Feb 12, 2020)

Conversation

@wojas (Contributor) commented Feb 11, 2020

Fix allocation leaks during gzip writes. Due to incorrect use of the dstPool (unmatched Get and Put), large amounts of memory were temporarily allocated during writes and not put back into the pool.

This also removes some special handling code in compressCurrent that would recursively call itself for too large input buffers. This condition can never occur, because Write ensures that blocks are capped
and there is no other public interface that extends currentBuffer. The recursive call that slices the buffer would have made returning byte slices to the Pool dangerous, as we could have been returning the same
underlying buffer multiple times.

This also adds a test to check allocations per Write to prevent regressions. There is further room for improvement, but this was by far the biggest leak.

Closes #8

Additionally, this adds a go.mod for Go modules support.

(Note that tests broke with recent commit a8ba214).
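
To make the Get/Put pairing concrete, below is a minimal sketch of the discipline the fix enforces: every buffer taken from the pool is put back exactly once, and the data handed to the caller is never a slice of the pooled buffer. The pool, sizes, and helper function are illustrative placeholders, not the library's actual code.

package main

import (
	"bytes"
	"compress/gzip"
	"sync"
)

// dstPool stands in for the destination-buffer pool discussed above.
var dstPool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 1<<20) },
}

// compressBlock takes one buffer from the pool and returns it exactly once.
// The result is copied out, so the pooled buffer is never aliased after Put.
func compressBlock(src []byte) ([]byte, error) {
	dst := dstPool.Get().([]byte)[:0]
	buf := bytes.NewBuffer(dst)
	gw := gzip.NewWriter(buf)
	if _, err := gw.Write(src); err != nil {
		dstPool.Put(dst)
		return nil, err
	}
	if err := gw.Close(); err != nil {
		dstPool.Put(dst)
		return nil, err
	}
	out := append([]byte(nil), buf.Bytes()...)
	dstPool.Put(buf.Bytes()[:0]) // return the (possibly grown) buffer, once
	return out, nil
}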

@klauspost (Owner) commented:

Thanks. I will review tomorrow.

Please take out the go.mod - I have not decided to go that route yet.

@wojas force-pushed the fix-memory-allocs branch from 8d889e8 to c548e59 on February 11, 2020 05:26
@wojas (Contributor, Author) commented Feb 11, 2020

Thanks for the fast response. I took out the go.mod.

@wojas (Contributor, Author) commented Feb 11, 2020

I found that the tests pass without -v, but fail with -v... Investigating.

The allocation test was incorrectly using Alloc (the current number of
allocated bytes) instead of TotalAlloc (the cumulative number of
allocated bytes), which would break the test if garbage collection
happened during the run.

This also raises the allowed allocation threshold to tolerate occasional
background allocations. I once saw 3kB instead of the usual <700B of
allocations, which is still low enough to detect pool usage issues.

In the panics, make clear that these can only happen due to races between
concurrent Write calls, which are unsafe. The alternative would be to
silently corrupt data.

Move the panic to the beginning of compressCurrent so that tests do not
deadlock after a panic.

Use a pool buffer for prevTail instead of slicing the current buffer,
which may have been returned to the pool by the time the tail is
accessed.

Skip the allocation tests when running with the race detector, because
the race detector allocates a lot of memory.

Rename the unreliable tests file.
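
As a rough illustration of the measurement described above, here is a sketch of a per-Write allocation check built on TotalAlloc. The writer, data sizes, iteration count, and 10240-byte threshold are placeholders (the standard compress/gzip writer stands in for the library's own Writer), so treat it as the shape of the test rather than the repository's actual code.

package main

import (
	"bytes"
	"compress/gzip"
	"runtime"
	"testing"
)

func TestWriteAllocations(t *testing.T) {
	var sink bytes.Buffer
	w := gzip.NewWriter(&sink) // stand-in for the concurrent Writer under test
	data := bytes.Repeat([]byte("T"), 100000)
	w.Write(data) // warm-up, so one-time setup allocations are excluded

	// TotalAlloc only ever grows, so a garbage collection during the loop
	// cannot skew the result the way Alloc (current live bytes) did.
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	const n = 50
	for i := 0; i < n; i++ {
		if _, err := w.Write(data); err != nil {
			t.Fatal(err)
		}
	}

	runtime.ReadMemStats(&after)
	perWrite := float64(after.TotalAlloc-before.TotalAlloc) / n
	t.Logf("allocated %.0f bytes per Write", perWrite)
	if perWrite > 10240 {
		t.Errorf("Write allocated too much memory per run (%.0f bytes), Pool used incorrectly?", perWrite)
	}
	w.Close()
}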
@wojas (Contributor, Author) commented Feb 11, 2020

This PR should be complete as-is. The remaining test failures are also present in master, and I cannot reproduce them locally in my branch, which is based on v1.2.1.

I suspect they are caused by a8ba214.

@klauspost (Owner) commented:

Thanks! I'll take a look. 👍

@klauspost (Owner) left a review comment:

Thanks for taking a look at it. It is some quite old code that I haven't touched in a while, so I also have some catching up to do.

gzip_norace_test.go: two review threads (outdated, resolved)
// but it's better than the 175846 bytes allocated before the pool release fix that this tests for.
// TODO: Further reduce allocations
if allocBytes > 10240 {
	t.Errorf("Write allocated too much memory per run (%.0f bytes), Pool used incorrectly?", allocBytes)
}
@klauspost (Owner):

For the failures it reports gzip_test.go:637: Write allocated too much memory per run (18446744073709436 bytes), Pool used incorrectly?

That is 18446.7TB - I think you would be hard pressed to allocate that much, so I suspect there is something wrong.

@wojas (Contributor, Author):

Are you sure this happened with the latest version? I noticed this in earlier test runs and 799eba1 fixes this. Before that commit I was using .Alloc instead of .TotalAlloc.

From the MemStats comments:

TotalAlloc increases as heap objects are allocated, but unlike Alloc and HeapAlloc, it does not decrease when objects are freed.

@klauspost (Owner):

Yeah, seems I mixed up the runs. Anyway, merging master should fix the writer.


// Read the final statistics
runtime.ReadMemStats(&memstats)
allocs := memstats.TotalAlloc - oldTotal
@klauspost (Owner):

I suspect the error is caused by oldTotal ending up larger than memstats.TotalAlloc, so the subtraction underflows and produces the crazy numbers.

Not really an expert on how that can happen.

@wojas (Contributor, Author):

See the previous comment.

w.SetConcurrency(1000, 1)
data := bytes.Repeat([]byte("T"), 100000) // varying block splits

const n = 1000
@klauspost (Owner):

Maybe:

Suggested change:

-const n = 1000
+n := 1000
+if testing.Short() {
+	n = 10
+}

(Or change the i++ to use bigger increments.)

Though you have this guarded nicely by tags.
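
For context, the usual way to keep a test file out of race-detector runs is the race build tag, which the Go toolchain defines automatically under -race. A sketch of such a guard follows; the file and package names are assumptions, not necessarily the repository's actual layout.

// +build !race

// gzip_norace_test.go (illustrative): the toolchain sets the race build
// tag when the race detector is enabled, so this file is compiled out
// under go test -race and its allocation checks never run there.
package pgzip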

@wojas (Contributor, Author):

This test tries to reproduce a race condition. I tried, but I cannot reproduce it with n = 10.

@klauspost (Owner):

Ah, OK, no problem.

@wojas force-pushed the fix-memory-allocs branch from a13a592 to 6b5bd9d on February 12, 2020 05:08
@klauspost merged commit c94cf00 into klauspost:master on Feb 12, 2020
@klauspost (Owner) commented:

Cool. I will give it some testing before the next release. Thanks a bunch!

Successfully merging this pull request may close: dictFlatePool not helping much (lots of new flate compressor allocations) (#8)