LogQL: Simple JSON expressions #3280

Merged
128 changes: 94 additions & 34 deletions docs/sources/logql/_index.md
Original file line number Diff line number Diff line change
@@ -145,40 +145,100 @@ If an extracted label key name already exists in the original log stream, the ex

We currently support json, logfmt and regexp parsers.

The **json** parser takes no parameters and can be added using the expression `| json` in your pipeline. It extracts all json properties as labels if the log line is a valid json document. Nested properties are flattened into label keys using the `_` separator. **Arrays are skipped**.

For example, the json parser will extract from the following document:

```json
{
  "protocol": "HTTP/2.0",
  "servers": ["129.0.1.1","10.2.1.3"],
  "request": {
    "time": "6.032",
    "method": "GET",
    "host": "foo.grafana.net",
    "size": "55"
  },
  "response": {
    "status": 401,
    "size": "228",
    "latency_seconds": "6.031"
  }
}
```

The following list of labels:

```kv
"protocol" => "HTTP/2.0"
"request_time" => "6.032"
"request_method" => "GET"
"request_host" => "foo.grafana.net"
"request_size" => "55"
"response_status" => "401"
"response_size" => "228"
"response_latency_seconds" => "6.031"
```
The **json** parser operates in two modes:

1. **without** parameters:

Adding `| json` to your pipeline will extract all json properties as labels if the log line is a valid json document.
Nested properties are flattened into label keys using the `_` separator.

Note: **Arrays are skipped**.

For example, the json parser will extract from the following document:

```json
{
  "protocol": "HTTP/2.0",
  "servers": ["129.0.1.1","10.2.1.3"],
  "request": {
    "time": "6.032",
    "method": "GET",
    "host": "foo.grafana.net",
    "size": "55",
    "headers": {
      "Accept": "*/*",
      "User-Agent": "curl/7.68.0"
    }
  },
  "response": {
    "status": 401,
    "size": "228",
    "latency_seconds": "6.031"
  }
}
```

The following list of labels:

```kv
"protocol" => "HTTP/2.0"
"request_time" => "6.032"
"request_method" => "GET"
"request_host" => "foo.grafana.net"
"request_size" => "55"
"response_status" => "401"
"response_size" => "228"
"response_latency_seconds" => "6.031"
```

2. **with** parameters:

Using `| json label="expression", another="expression"` in your pipeline will extract only the
specified json fields to labels. You can specify one or more expressions in this way, the same
as [`label_format`](#labels-format-expression); all expressions must be quoted.

Currently, we only support field access (`my.field`, `my["field"]`) and array access (`list[0]`), and any combination
of these at any level of nesting (`my.list[0]["field"]`).

For example, `| json first_server="servers[0]", ua="request.headers[\"User-Agent\"]"` will extract from the following document:

```json
{
  "protocol": "HTTP/2.0",
  "servers": ["129.0.1.1","10.2.1.3"],
  "request": {
    "time": "6.032",
    "method": "GET",
    "host": "foo.grafana.net",
    "size": "55",
    "headers": {
      "Accept": "*/*",
      "User-Agent": "curl/7.68.0"
    }
  },
  "response": {
    "status": 401,
    "size": "228",
    "latency_seconds": "6.031"
  }
}
```

The following list of labels:

```kv
"first_server" => "129.0.1.1"
"ua" => "curl/7.68.0"
```

If an expression returns an array or an object, the value is assigned to the label in json format.

For example, `| json server_list="servers", headers="request.headers"` will extract:

```kv
"server_list" => `["129.0.1.1","10.2.1.3"]`
"headers" => `{"Accept": "*/*", "User-Agent": "curl/7.68.0"}`
```

The **logfmt** parser can be added using `| logfmt` and will extract all keys and values from the [logfmt](https://brandur.org/logfmt) formatted log line.

33 changes: 33 additions & 0 deletions pkg/logql/ast.go
@@ -416,6 +416,39 @@ func (e *labelFmtExpr) String() string {
return sb.String()
}

type jsonExpressionParser struct {
	expressions []log.JSONExpression

	implicit
}

func newJSONExpressionParser(expressions []log.JSONExpression) *jsonExpressionParser {
	return &jsonExpressionParser{
		expressions: expressions,
	}
}

func (j *jsonExpressionParser) Shardable() bool { return true }

func (j *jsonExpressionParser) Stage() (log.Stage, error) {
	return log.NewJSONExpressionParser(j.expressions)
}

func (j *jsonExpressionParser) String() string {
	var sb strings.Builder
	sb.WriteString(fmt.Sprintf("%s %s ", OpPipe, OpParserTypeJSON))
	for i, exp := range j.expressions {
		sb.WriteString(exp.Identifier)
		sb.WriteString("=")
		sb.WriteString(strconv.Quote(exp.Expression))

		if i+1 != len(j.expressions) {
			sb.WriteString(",")
		}
	}
	return sb.String()
}
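
As a standalone illustration of what `String()` produces, the following sketch re-creates the rendering loop outside the package. The `jsonExpression` type and `render` function are hypothetical stand-ins, and the literals `|` and `json` are assumed values for `OpPipe` and `OpParserTypeJSON`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// jsonExpression is a hypothetical stand-in for log.JSONExpression; only the
// two fields read by String() in the diff are modeled.
type jsonExpression struct {
	Identifier string
	Expression string
}

// render mirrors the String() loop above, with "|" and "json" hard-coded
// (assumed values of OpPipe and OpParserTypeJSON).
func render(exprs []jsonExpression) string {
	var sb strings.Builder
	sb.WriteString("| json ")
	for i, e := range exprs {
		sb.WriteString(e.Identifier)
		sb.WriteString("=")
		// strconv.Quote wraps the expression in double quotes and escapes inner quotes.
		sb.WriteString(strconv.Quote(e.Expression))
		if i+1 != len(exprs) {
			sb.WriteString(",")
		}
	}
	return sb.String()
}

func main() {
	fmt.Println(render([]jsonExpression{
		{"first_server", `servers[0]`},
		{"ua", `request.headers["User-Agent"]`},
	}))
	// prints: | json first_server="servers[0]",ua="request.headers[\"User-Agent\"]"
}
```

Using `strconv.Quote` means the rendered pipeline stage contains escaped inner quotes, which matches the requirement in the docs that all expressions be quoted.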

func mustNewMatcher(t labels.MatchType, n, v string) *labels.Matcher {
m, err := labels.NewMatcher(t, n, v)
if err != nil {
1 change: 1 addition & 0 deletions pkg/logql/ast_test.go
@@ -127,6 +127,7 @@ func Test_SampleExpr_String(t *testing.T) {
`,
`10 / (5/2)`,
`10 / (count_over_time({job="postgres"}[5m])/2)`,
`{app="foo"} | json response_status="response.status.code", first_param="request.params[0]"`,
} {
t.Run(tc, func(t *testing.T) {
expr, err := ParseExpr(tc)
18 changes: 18 additions & 0 deletions pkg/logql/expr.y
@@ -46,6 +46,9 @@ import (
LabelFormatExpr *labelFmtExpr
LabelFormat log.LabelFmt
LabelsFormat []log.LabelFmt
JSONExpressionParser *jsonExpressionParser
JSONExpression log.JSONExpression
JSONExpressionList []log.JSONExpression
UnwrapExpr *unwrapExpr
}

@@ -82,6 +85,9 @@ import (
%type <LabelFormatExpr> labelFormatExpr
%type <LabelFormat> labelFormat
%type <LabelsFormat> labelsFormat
%type <JSONExpressionParser> jsonExpressionParser
%type <JSONExpression> jsonExpression
%type <JSONExpressionList> jsonExpressionList
%type <UnwrapExpr> unwrapExpr
%type <UnitFilter> unitFilter

@@ -211,6 +217,7 @@ pipelineExpr:
pipelineStage:
lineFilters { $$ = $1 }
| PIPE labelParser { $$ = $2 }
| PIPE jsonExpressionParser { $$ = $2 }
| PIPE labelFilter { $$ = &labelFilterExpr{LabelFilterer: $2 }}
| PIPE lineFormatExpr { $$ = $2 }
| PIPE labelFormatExpr { $$ = $2 }
@@ -226,6 +233,9 @@ labelParser:
| REGEXP STRING { $$ = newLabelParserExpr(OpParserTypeRegexp, $2) }
;

jsonExpressionParser:
JSON jsonExpressionList { $$ = newJSONExpressionParser($2) }

lineFormatExpr: LINE_FMT STRING { $$ = newLineFmtExpr($2) };

labelFormat:
@@ -252,6 +262,14 @@ labelFilter:
| labelFilter OR labelFilter { $$ = log.NewOrLabelFilter($1, $3 ) }
;

jsonExpression:
IDENTIFIER EQ STRING { $$ = log.NewJSONExpr($1, $3) }

jsonExpressionList:
jsonExpression { $$ = []log.JSONExpression{$1} }
| jsonExpressionList COMMA jsonExpression { $$ = append($1, $3) }
;

unitFilter:
durationFilter { $$ = $1 }
| bytesFilter { $$ = $1 }