Mimir query engine: fix issue where rate() over native histograms could panic or return incorrect results #8850

Merged
charleskorn merged 8 commits into main from charleskorn/mqe-rate-over-nh
Jul 31, 2024

Conversation

Contributor

@charleskorn charleskorn commented Jul 30, 2024

What this PR does

This PR fixes an issue where multiple queries running rate() over native histograms simultaneously could panic or return incorrect results.

This could happen when multiple queries reused the same FloatHistogram instance due to a bug in HPointRingBuffer. HPointRingBuffer could retain a reference to FloatHistograms while also returning a slice containing those FloatHistograms to the pool, so a second query could modify those FloatHistograms concurrently with the original query.

Note to reviewers: I've tried to build up this PR incrementally. I'd suggest reviewing each commit individually.

Which issue(s) this PR fixes or relates to

(none)

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • [n/a] about-versioning.md updated with experimental features.

@charleskorn charleskorn force-pushed the charleskorn/mqe-rate-over-nh branch from 9019dd4 to 1c845c3 Compare July 30, 2024 07:05
@charleskorn charleskorn marked this pull request as ready for review July 30, 2024 07:08
@charleskorn charleskorn requested a review from a team as a code owner July 30, 2024 07:08
Contributor

@aknuds1 aknuds1 left a comment

Just wondering about the practical consequences of removing the Reset calls.

pkg/streamingpromql/engine_concurrency_test.go (outdated, resolved)
@@ -161,7 +161,6 @@ func (b *FPointRingBuffer) Reset() {
 
 // Close releases any resources associated with this buffer.
 func (b *FPointRingBuffer) Close() {
-	b.Reset()
Contributor

Why is the Reset call unnecessary? Which practical consequences does removing it have?

Contributor Author

Calling Reset is unnecessary because an FPointRingBuffer isn't expected to be used after Close is called.
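A rough sketch of that reasoning, under stated assumptions: `ringBuffer`, `slicePool`, `fpoint` and the field names below are hypothetical and simplified, not the actual FPointRingBuffer. The idea is that Reset only rewinds indices so the buffer can be refilled, while Close hands the backing slice back and drops the reference, so rewinding at that point has no observable effect.

```go
package main

import "fmt"

// fpoint is a hypothetical stand-in for a float sample.
type fpoint struct {
	T int64
	F float64
}

// slicePool is a toy free list standing in for a shared slice pool.
type slicePool struct{ free [][]fpoint }

func (p *slicePool) put(s []fpoint) { p.free = append(p.free, s[:0]) }

// ringBuffer is a simplified, hypothetical ring buffer; field names are illustrative.
type ringBuffer struct {
	pool        *slicePool
	points      []fpoint
	first, size int
}

// Reset rewinds the buffer so it can be filled again.
func (b *ringBuffer) Reset() { b.first, b.size = 0, 0 }

// Close hands the backing slice back to the pool and drops the reference to it.
// It does not call Reset: the buffer must not be used after Close, so rewinding
// the indices would have no observable effect.
func (b *ringBuffer) Close() {
	if b.points == nil {
		return
	}
	b.pool.put(b.points)
	b.points = nil
}

func main() {
	p := &slicePool{}
	b := &ringBuffer{pool: p, points: make([]fpoint, 4), size: 2}
	b.Close()
	fmt.Println(b.points == nil, len(p.free)) // true 1
}
```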

@aknuds1 aknuds1 requested a review from a team July 30, 2024 08:49
Contributor

@aknuds1 aknuds1 left a comment

LGTM

@charleskorn charleskorn merged commit f8db025 into main Jul 31, 2024
29 checks passed
@charleskorn charleskorn deleted the charleskorn/mqe-rate-over-nh branch July 31, 2024 00:08