Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix data race in BucketedBytes pool #4792

Merged
merged 3 commits into from
Oct 20, 2021

Conversation

colega
Copy link
Contributor

@colega colega commented Oct 20, 2021

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Previous test didn't detect the data race: we copied the bytes header to the bytes.Buffer so when appending to the slice we were not modifying the original one.

However, the usage of this in bucketChunkReader.save() actually modifies the referenced slice, so the test was modified to test that it can be done safely.

The race condition happened because we were reading the referenced slice capacity after putting it back to the pool, when someone else might already retrieved and modified it.

Verification

The test was updated to match the real-life usage of the pool.

Before modifying the implementation, this was the data race reported by the updated test:

==================
WARNING: DATA RACE
Read at 0x00c0000bc900 by goroutine 36:
  github.com/thanos-io/thanos/pkg/pool.(*BucketedBytes).Put()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool.go:124 +0x1f9
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet.func1()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:108 +0xfa
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet·dwrap·3()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x65

Previous write at 0x00c0000bc900 by goroutine 27:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet.func1()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:94 +0x1fa
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet·dwrap·3()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x65

Goroutine 36 (running) created at:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x257
  testing.tRunner()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
  1 Fix data race in BucketedBytes pool
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1306 +0x47

Goroutine 27 (running) created at:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x257
  testing.tRunner()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1306 +0x47
==================

Previous test didn't detect the data race: we copied the bytes header to
the bytes.Buffer so when appending to the slice we were not modifying
the original one.

However, the usage of this in bucketChunkReader.save() actually modifies
the referenced slice, so the test was modified to test that it can be
done safely.

The race condition happened because we were reading the referenced slice
capacity after putting it back to the pool, when someone else might
already retrieved and modified it.

Before modifying the implementation, this was the data race reported:

==================
WARNING: DATA RACE
Read at 0x00c0000bc900 by goroutine 36:
  github.com/thanos-io/thanos/pkg/pool.(*BucketedBytes).Put()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool.go:124 +0x1f9
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet.func1()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:108 +0xfa
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet·dwrap·3()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x65

Previous write at 0x00c0000bc900 by goroutine 27:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet.func1()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:94 +0x1fa
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet·dwrap·3()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x65

Goroutine 36 (running) created at:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x257
  testing.tRunner()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
  1 Fix data race in BucketedBytes pool
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1306 +0x47

Goroutine 27 (running) created at:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x257
  testing.tRunner()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1306 +0x47
==================

Signed-off-by: Oleg Zaytsev <[email protected]>
Signed-off-by: Oleg Zaytsev <[email protected]>
@colega
Copy link
Contributor Author

colega commented Oct 20, 2021

Marking this change as not relevant for the end user because the data race is really minor and can't cause data corruption.

@colega colega marked this pull request as ready for review October 20, 2021 11:51
pracucci
pracucci previously approved these changes Oct 20, 2021
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job spotting and fixing it! 👏

Signed-off-by: Oleg Zaytsev <[email protected]>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one, thank you so much!

@bwplotka bwplotka merged commit d2d53e5 into thanos-io:main Oct 20, 2021
@colega colega deleted the fix-data-race-in-bucketed-bytes-pool branch October 21, 2021 07:15
GiedriusS pushed a commit to GiedriusS/thanos that referenced this pull request Oct 22, 2021
* Fix data race in BucketedBytes pool

Previous test didn't detect the data race: we copied the bytes header to
the bytes.Buffer so when appending to the slice we were not modifying
the original one.

However, the usage of this in bucketChunkReader.save() actually modifies
the referenced slice, so the test was modified to test that it can be
done safely.

The race condition happened because we were reading the referenced slice
capacity after putting it back to the pool, when someone else might
already retrieved and modified it.

Before modifying the implementation, this was the data race reported:

==================
WARNING: DATA RACE
Read at 0x00c0000bc900 by goroutine 36:
  github.com/thanos-io/thanos/pkg/pool.(*BucketedBytes).Put()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool.go:124 +0x1f9
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet.func1()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:108 +0xfa
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet·dwrap·3()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x65

Previous write at 0x00c0000bc900 by goroutine 27:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet.func1()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:94 +0x1fa
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet·dwrap·3()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x65

Goroutine 36 (running) created at:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x257
  testing.tRunner()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
  1 Fix data race in BucketedBytes pool
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1306 +0x47

Goroutine 27 (running) created at:
  github.com/thanos-io/thanos/pkg/pool.TestRacePutGet()
      /Users/oleg/w/github.com/thanos-io/thanos/pkg/pool/pool_test.go:119 +0x257
  testing.tRunner()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1259 +0x22f
  testing.(*T).Run·dwrap·21()
      /usr/local/Cellar/go/1.17/libexec/src/testing/testing.go:1306 +0x47
==================

Signed-off-by: Oleg Zaytsev <[email protected]>

* Update CHANGELOG.md

Signed-off-by: Oleg Zaytsev <[email protected]>

* goimports fix

Signed-off-by: Oleg Zaytsev <[email protected]>
Signed-off-by: Giedrius Statkevičius <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants