-
Notifications
You must be signed in to change notification settings - Fork 569
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check for errors when iterating through chunks or samples in response to a query #6451
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix looks correct to me, but I'd also like to see tests. Not sure off the top of my head how to implement tests though, I'll try to see if I can figure out how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix. I agree that it's hard to test this. I think the direct usage of *userTSDB
makes it difficult to inject behaviour.
What if we cancelled the context of the Query operation? Would that be susceptible to a race condition? It's a bit of a contrived test, but it might help uncover this bug.
I think that wouldn't guarantee we're exercising the changes in this PR - there are a number of different places that might respond to the context cancellation depending on when it occurs. |
Given the approval from @dimitarvdimitrov (and offline discussion with @aknuds1), I'm going to set this to merge once CI is passing - but if anyone has any post-merge ideas on how to test this (or any other feedback), I'd love to hear them. |
my approach would be to have a |
https://github.com/grafana/mimir/compare/charleskorn/test-for-6451 is the most reliable test I've been able to find. I don't love it: it's tied very closely to the current structure of the |
Reverted in #6576. |
What this PR does
This PR fixes an issue where ingesters could ignore errors encountered while iterating through chunks or samples in response to a query request.
Note to reviewers: I haven't added a test for this case as introducing an error does not seem straightforward with how we test the ingester query methods. I'm open to feedback or suggestions on how to do this though.
Which issue(s) this PR fixes or relates to
(none)
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]