-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(logql): Make LabelSampleExtractor
work as expected when label doesn't exist in logline
#6766
Conversation
pkg/logql/log/label_filter.go
Outdated
@@ -342,7 +342,10 @@ func (s *StringLabelFilter) Process(_ int64, line []byte, lbs *LabelsBuilder) ([ | |||
if s.Name == logqlmodel.ErrorLabel { | |||
return line, s.Matches(lbs.GetErr()) | |||
} | |||
v, _ := lbs.Get(s.Name) | |||
v, ok := lbs.Get(s.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the real fix.
Consider Log lines ``` msg=hello subqueries=5 msg=hello subqueries=0 msg=hello ``` If you run following logql ``` {foo="bar"}|logfmt|subqueries!=0 ``` it returns ``` msg=hello subqueries=5 ``` But if you run following logql ``` {foo="bar"}|logfmt|subqueries!="0" # NOTE: "0" as string. ``` it returns ``` msg=hello subqueries=5 msg=hello ``` NOTE: it also returns log lines that doesn't contain `subqueries` label in the first place. This cause subtle error if used in metric queries using that `label` in unwrap. e.g: following logql on above log lines returns ``` sum_over_time({foo="bar"}|logfmt|subqueries!="0"|unwrap subqueries[1m]) ``` Returns `pipeline error: 'SampleExtractionErr'`. Because, the lines without `subqueries` labels are not skipped during `ProcessLine` and during metric extraction, our extractor sets this `SampleExtractionerror`. Fixes: #6713
7b0d82a
to
d0fbf3c
Compare
Signed-off-by: Kaviraj <[email protected]>
Signed-off-by: Kaviraj <[email protected]>
70531ec
to
8120941
Compare
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused here because this seems related to the value type that we use in the query, i.e. either string != "0"
or integer != 0
.
If a log line does not contain a label, e.g.
msg=hello
I would always assume the value of the label subqueries
would be the empty string ""
, and, therefore, it would be different than both string "0"
and integer 0
.
With the changes from this PR it seems this assumption is wrong, so it should be clearly documented
Signed-off-by: Kaviraj <[email protected]>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
- loki -0.6% |
Signed-off-by: Kaviraj <[email protected]>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks 🙏
Do you think it would also make sense to update the documentation with some examples? (can be part of a separate PR)
pkg/logql/log/label_filter.go
Outdated
// 4.msg="hello" subqueries="" | ||
//``` | ||
|
||
// Now when quering like `{...}|logfmt|subqueries=""` should consider lines 3 and 4. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this, it doesn't really make it easier to understand for the end user as this make it more obscure:
- when using equality we consider non existent label empty
- but for inequality we don't.
Honestly in #6713 I think the expectations were wrong.
sum(sum_over_time({job="loki-ops/query-frontend"}
|= "metrics.go"
| logfmt
| subqueries!="0"
| unwrap subqueries
[5m]
))
When subqueries doesn't exist then subqueries!="0" is true.
What is weird was that empty labels does transform into a 0 integer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When subqueries doesn't exist then subqueries!="0" is true.
So you saying subqueries!=0
and subqueries!="0"
ok to return different log lines?. But that's unexpected behaviour to end-user IMHO. Both should return exactly same results IMO.
What is weird was that empty labels does transform into a 0 integer.
That's because, we don't mutate the log line if label is not present. So treat it as if label="" during label filter
, But still log line doesn't contain actual label, which breaks when doing unwrap
on same label during metrics extraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is weird was that empty labels does transform into a 0 integer.
I agree with @cyriltovena. This is the part I think we need to address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be we discard empty values ? If that is possible.
Yes that's possible. We can skip empty values during metrics extraction when using unwrap without changing any of the existing label filter behavior.
Also the more I think about it, subqueries!="0" and subqueries!=0 are two different label filters and should be treated as such. And former can indeed match log lines that doesn't have subqueries label at all (we are assuming it has default empty string substring=" when doesn't exist).
I have change the PR like it's mentioned here.
Signed-off-by: Kaviraj <[email protected]>
StringLabelfilter
work as expectedLabelSampleExtractor
work as expected when label doesn't exist in logline
Signed-off-by: Kaviraj <[email protected]>
Signed-off-by: Kaviraj <[email protected]>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I think the changelog needs a better documentation of the change.
Signed-off-by: Kaviraj <[email protected]>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0.1%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
…oesn't exist in logline (grafana#6766) * fix(logql): Make `StringLabelfilter` work as expected Consider Log lines ``` msg=hello subqueries=5 msg=hello subqueries=0 msg=hello ``` If you run following logql ``` {foo="bar"}|logfmt|subqueries!=0 ``` it returns ``` msg=hello subqueries=5 ``` But if you run following logql ``` {foo="bar"}|logfmt|subqueries!="0" # NOTE: "0" as string. ``` it returns ``` msg=hello subqueries=5 msg=hello ``` NOTE: it also returns log lines that doesn't contain `subqueries` label in the first place. This cause subtle error if used in metric queries using that `label` in unwrap. e.g: following logql on above log lines returns ``` sum_over_time({foo="bar"}|logfmt|subqueries!="0"|unwrap subqueries[1m]) ``` Returns `pipeline error: 'SampleExtractionErr'`. Because, the lines without `subqueries` labels are not skipped during `ProcessLine` and during metric extraction, our extractor sets this `SampleExtractionerror`. Fixes: grafana#6713 * Add it on changelog Signed-off-by: Kaviraj <[email protected]> * Fix linter Signed-off-by: Kaviraj <[email protected]> * Add more tests and fix the edge case with equality to empty strings Signed-off-by: Kaviraj <[email protected]> * typo on the comment Signed-off-by: Kaviraj <[email protected]> * handle this edge case during metric extraction. Not in label filter Signed-off-by: Kaviraj <[email protected]> * Fix linter Signed-off-by: Kaviraj <[email protected]> * Remove commented code Signed-off-by: Kaviraj <[email protected]> * Update changelog Signed-off-by: Kaviraj <[email protected]>
What this PR does / why we need it:
Consider Log lines
If you run following logql
it returns
But if you run following logql
it returns
NOTE: it also returns log lines that doesn't contain
subqueries
label in the firstplace.
This cause subtle error if used in metric queries using that
label
in unwrap.e.g: following logql on above log lines returns
Returns
pipeline error: 'SampleExtractionErr'
. Because, the lines withoutsubqueries
labels are not skipped during
ProcessLine
and during metric extraction, our extractorsets this
SampleExtractionerror
.EDIT: After some discussions with @cyriltovena and @owen-d. We have decided fix only the sample extractor without changing any behavior of label filter.
Which issue(s) this PR fixes:
Fixes # #6713
Special notes for your reviewer:
Checklist
CHANGELOG.md
.docs/sources/upgrading/_index.md