Mimir query engine: correctly handle functions that produce multiple series with the same labels #9019

charleskorn · 2024-08-16T07:04:09Z

What this PR does

This PR fixes MQE's handling of queries that produce results with multiple series with the same labels.

This is currently only possible in MQE in places where we drop the metric name from the input series: in all other cases it's not possible to return multiple series with the same labels. (Functions like label_replace and unary negation will need similar checks when we implement those features.)

See #9019 (comment) for more explanation of the logic in this PR.

Which issue(s) this PR fixes or relates to

(none)

Checklist

Tests updated.
[n/a] Documentation added.
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
[n/a] about-versioning.md updated with experimental features.

jhesketh

lgtm, just one question inline

jhesketh · 2024-08-16T07:19:02Z

pkg/streamingpromql/functions/common_test.go

@@ -30,6 +30,16 @@ func TestDropSeriesName(t *testing.T) {
 	require.Equal(t, expected, modifiedMetadata)
 }

+func TestDropSeriesName_DuplicateSeries(t *testing.T) {


not SadPath? :-(

pkg/streamingpromql/types/data.go

jhesketh · 2024-08-16T07:32:04Z

pkg/streamingpromql/functions/common.go

@@ -22,6 +24,11 @@ func DropSeriesName(seriesMetadata []types.SeriesMetadata, _ *limiting.MemoryCon
 		seriesMetadata[i].Labels = seriesMetadata[i].Labels.DropMetricName()
 	}

+	if hasDuplicate, example := types.HasDuplicateSeries(seriesMetadata); hasDuplicate {


Should this check not be done in query.go when we get all of the series names?
It would help catch other operators incorrectly handling series labels such as aggregations. Although that's more likely a bug and unnecessary overhead. Or perhaps we do it in query.go in convertFunctionCallToOperator so that we apply it to all functions that can manipulate labels. Otherwise if we manipulate labels differently in a future function we may forget to apply this check.

Should this check not be done in query.go when we get all of the series names? It would help catch other operators incorrectly handling series labels such as aggregations. Although that's more likely a bug and unnecessary overhead.

This wouldn't be a thorough-enough check - duplicate series are not permitted in any intermediate stage, not just the end result. And yep, I think it would be unnecessary overhead to do this and a bug if something was not correctly handling series labels.

Or perhaps we do it in query.go in convertFunctionCallToOperator so that we apply it to all functions that can manipulate labels. Otherwise if we manipulate labels differently in a future function we may forget to apply this check.

I'm thinking something more along these lines makes sense, especially given this realisation over the weekend.

Given we need similar checking and merging logic in many places¹, I'm wondering if this should be done in an operator that wraps operators that manipulate labels. This wrapping operator would be responsible for:

merging the results from different input series that map to the same output series

rejecting queries where results from different input series conflict at a single timestamp

We can then apply this operator whenever the expression manipulates labels.

The other option would be to add this logic to FunctionOverInstantVector and FunctionOverRangeVector, and possibly other places in the future, but I think that will make them more complex to reason about and unnecessarily duplicate some logic that can live in one place.

What do you think?

Footnotes

all the functions over instant vectors and range vectors, label_replace, label_join, unary negation, possibly others? ↩

I think a wrapper operator makes the most sense. I think the only reason not to would be if we have an opportunity in a downstream operator to detect such a conflict. However, we can likely do both and if we have an opportunity for such an optimisation we can percolate the error back up.

charleskorn · 2024-08-18T22:11:52Z

Thinking about this more, I realised the solution isn't quite as simple as this, so I'm going to put this PR back to a draft until I can implement this properly:

It's not enough to check for duplicate series, what we need to do is check for conflicting samples for the same output series, and merge the results from input series that map to the same output series.

eg. imagine we're running the query ceil({__name__=~"some_metric|some_other_metric"}).

If the input data is:

some_metric        1 2 _ _
some_other_metric  _ _ 3 4

then we should not produce an error and should merge the result to a single series:

{} 1 2 3 4

However, the implementation here will cause the query to return an error.

And, without this PR, we'll currently return this result, which is also invalid:

{} 1 2 _ _
{} _ _ 3 4

The only time we should return an error is if the input series map to the same output series and have a conflicting sample, for example:

some_metric       1 2 _ _
some_other_metric 1 _ _ _

…abels.

…ge vectors.

…es returned by `HasDuplicateSeries`

… need it

# Conflicts: # CHANGELOG.md # pkg/streamingpromql/operators/binary_operation_test.go

charleskorn · 2024-08-29T07:00:07Z

pkg/streamingpromql/functions.go

@@ -37,7 +37,8 @@ type ScalarFunctionOperatorFactory func(
 //   - name: The name of the function
 //   - metadataFunc: The function for handling metadata
 //   - seriesDataFunc: The function to handle series data
-func SingleInputVectorFunctionOperatorFactory(name string, metadataFunc functions.SeriesMetadataFunction, seriesDataFunc functions.InstantVectorFunction) InstantVectorFunctionOperatorFactory {
+//   - needsSeriesDeduplication: Set to true if metadataFunc may produce multiple series with the same labels and therefore deduplication is required


Note to reviewers: we've previously talked about extracting a type like FunctionOverRangeVector for functions over instant vectors, and I think the addition of this argument is another reason to do this, but I don't want to do this in this PR in the interests of keeping it to a manageable size.

jhesketh

Nice work 🎉, it's a great improvement on what we had. LGTM sans nits :-)

pkg/streamingpromql/operators/series_merging_test.go

pkg/streamingpromql/operators/deduplicate_and_merge.go

charleskorn · 2024-08-30T00:23:25Z

Given the approval, I'm going to merge this, but happy to address any post-merge feedback in a follow-up PR.

charleskorn marked this pull request as ready for review August 16, 2024 07:07

charleskorn requested a review from a team as a code owner August 16, 2024 07:07

jhesketh reviewed Aug 16, 2024

View reviewed changes

charleskorn marked this pull request as draft August 18, 2024 22:11

charleskorn added 2 commits August 20, 2024 14:23

Return an error when a query produces multiple series with the same l…

233b4db

…abels.

Add changelog entry

5220a74

charleskorn force-pushed the charleskorn/duplicate-series branch from 47fdcdf to 5220a74 Compare August 20, 2024 04:24

charleskorn added 13 commits August 20, 2024 14:29

Make SeriesMetadataFunc optional for functions over instant and ran…

74aa407

…ge vectors.

Restore previous behaviour of DropSeriesName and remove sample seri…

aaf1f4d

…es returned by `HasDuplicateSeries`

Rename binaryOperationSeriesBuffer

b3b5389

Move InstantVectorOperatorBuffer to its own file.

c653412

Start to generalise series merging implementation in BinaryOperation

b3930ed

Continue generalising series merging implementation in BinaryOperation

e850dfa

Finish generalising series merging implementation in BinaryOperation

364feba

Extract mergeSeries to its own file

5a5b420

Fix linting

d1ad47f

Add more documentation

e41f648

Add tests for MergeSeries based on binary operation merging test cases

0129eda

Remove unnecessary test cases

290b4b9

Add deduplication and merging operator and apply it to functions that…

3c04836

… need it

charleskorn changed the title ~~Mimir query engine: return an error when a query produces multiple series with the same labels~~ Mimir query engine: correctly handle functions that produce multiple series with the same labels Aug 29, 2024

Merge branch 'main' into charleskorn/duplicate-series

9b93ddc

# Conflicts: # CHANGELOG.md # pkg/streamingpromql/operators/binary_operation_test.go

charleskorn marked this pull request as ready for review August 29, 2024 06:57

charleskorn commented Aug 29, 2024

View reviewed changes

Address PR feedback: move comment

0cc5310

jhesketh approved these changes Aug 29, 2024

View reviewed changes

pkg/streamingpromql/operators/series_merging_test.go Show resolved Hide resolved

pkg/streamingpromql/operators/deduplicate_and_merge.go Show resolved Hide resolved

charleskorn merged commit d341851 into main Aug 30, 2024
29 checks passed

charleskorn deleted the charleskorn/duplicate-series branch August 30, 2024 00:23

charleskorn mentioned this pull request Dec 4, 2024

Mimir Query Engine #10067

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mimir query engine: correctly handle functions that produce multiple series with the same labels #9019

Mimir query engine: correctly handle functions that produce multiple series with the same labels #9019

charleskorn commented Aug 16, 2024 •

edited

Loading

jhesketh left a comment

jhesketh Aug 16, 2024

jhesketh Aug 16, 2024

charleskorn Aug 18, 2024 •

edited

Loading

jhesketh Aug 19, 2024

charleskorn commented Aug 18, 2024

charleskorn Aug 29, 2024

jhesketh left a comment

charleskorn commented Aug 30, 2024

Mimir query engine: correctly handle functions that produce multiple series with the same labels #9019

Mimir query engine: correctly handle functions that produce multiple series with the same labels #9019

Conversation

charleskorn commented Aug 16, 2024 • edited Loading

What this PR does

Which issue(s) this PR fixes or relates to

Checklist

jhesketh left a comment

Choose a reason for hiding this comment

jhesketh Aug 16, 2024

Choose a reason for hiding this comment

jhesketh Aug 16, 2024

Choose a reason for hiding this comment

charleskorn Aug 18, 2024 • edited Loading

Choose a reason for hiding this comment

Footnotes

jhesketh Aug 19, 2024

Choose a reason for hiding this comment

charleskorn commented Aug 18, 2024

charleskorn Aug 29, 2024

Choose a reason for hiding this comment

jhesketh left a comment

Choose a reason for hiding this comment

charleskorn commented Aug 30, 2024

charleskorn commented Aug 16, 2024 •

edited

Loading

charleskorn Aug 18, 2024 •

edited

Loading