Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

json_v2 parser and NULL values #10670

Closed
kgrimsby opened this issue Feb 17, 2022 · 19 comments · Fixed by #10799
Closed

json_v2 parser and NULL values #10670

kgrimsby opened this issue Feb 17, 2022 · 19 comments · Fixed by #10799
Labels
area/json json and json_v2 parser/serialiser related bug unexpected problem or unintended behavior

Comments

@kgrimsby
Copy link
Contributor

Relevant telegraf.conf

None

Logs from Telegraf

None

System info

Telegraf 1.21.4

Docker

No response

Steps to reproduce

None

Expected behavior

None

Actual behavior

None

Additional info

The new release includes #10468 , which I find a little bit inconsistent.

            ## WARNING: Setting optional to true will suppress errors if the configured Path doesn't match the JSON
            ## This should be used with caution because it removes the safety net of verifying the provided path
            ## This was introduced to support situations when parsing multiple incoming JSON payloads with wildcards
            ## More context: https://github.com/influxdata/telegraf/issues/10072
            optional = false

This PR also include throwing exceptins on values being NULL, not just paths.

The setting is also introduced into "objects", but does indeed apply to non-objects as well.

Intended behaviour should be clarified:

  • Are values allowed to be null?
  • Are paths allowed to be null?
  • Are all paths optional when param "optional" is set? If so, should be moved to main config object
@kgrimsby kgrimsby added the bug unexpected problem or unintended behavior label Feb 17, 2022
@telegraf-tiger telegraf-tiger bot added the area/json json and json_v2 parser/serialiser related label Feb 17, 2022
@kgrimsby
Copy link
Contributor Author

@sspaink and @srebhan Any thoughts?

@srebhan
Copy link
Member

srebhan commented Feb 17, 2022

I'll leave this to @sspaink... :-)

@sspaink
Copy link
Contributor

sspaink commented Feb 17, 2022

  • Values are allowed to be null, but should be ignored because they aren't valid in line protocol (this behavior was introduced here: Update json_v2 parser to handle null types #9368)
  • Not sure what you mean by paths being allowed to be null, can you provide an example?
  • The way it was implemented, only any paths configured in a [[inputs.__.json_v2.object]] are ignored. Are you facing a scenario you need to ignore other paths?

So in the PR to introduce optional I am using the logic result.Type == gjson.Null to detect if the path found anything, I didn't consider that this could cause issues if the provided path just goes directly to a null value. Is this causing problems for you? If it is can you provide a sample JSON and config?

@kgrimsby
Copy link
Contributor Author

kgrimsby commented Feb 18, 2022

From parser.go

func (p *Parser) processMetric(input []byte, data []DataSet, tag bool, timestamp time.Time) ([]telegraf.Metric, error) {
	if len(data) == 0 {
		return nil, nil
	}

	p.iterateObjects = false
	var metrics [][]telegraf.Metric

	for _, c := range data {
		if c.Path == "" {
			return nil, fmt.Errorf("GJSON path is required")
		}
		result := gjson.GetBytes(input, c.Path)
		if result.Type == gjson.Null {
			return nil, fmt.Errorf(GJSONPathNUllErrorMSG)
		}

This is in processMetric, which is used by non-objects. Here it returns and error if the value provided is NULL. Subsequently the error get returned form the parser method aswell, which results in the metric not being added at all.

An example is a basic JSON:

{
  "weight_ROWKEY": "123",
  "weight_serialNumber": "AX00",
  "weight_createdAt": 1644708158939,
  "weight_weight": 289.799,
  "sensor_imei": "123",
  "sensor_distributor_name": null,
  "sensor_customer_name": "Customer",
  "sensor_dist_name": null
}

Given the config

      [[inputs.kafka_consumer.json_v2]]
        measurement_name = "weight"
        timestamp_format = "unix_ms"
        timestamp_path = "weight_createdAt"
        [[inputs.kafka_consumer.json_v2.field]]
          path = "weight_weight"
          rename = "weight"
          type = "float"
        [[inputs.kafka_consumer.json_v2.tag]]
          path = "weight_serialNumber"
          rename = "serial_number"
        [[inputs.kafka_consumer.json_v2.tag]]
          path = "weight_ROWKEY"
          rename = "imei"
        [[inputs.kafka_consumer.json_v2.tag]]
          path = "sensor_customer_name"
          rename = "customer_name"
        [[inputs.kafka_consumer.json_v2.tag]]
          path = "sensor_distributor_name"
          rename = "distributor_name"
        [[inputs.kafka_consumer.json_v2.tag]]
          path = "sensor_dist_name"

This will throw an error. And every time NULL is represented in either tag value or field value it will dismiss the metric resulting in error.

As far as I can see from the GJSON library, there are no way to distinguish if a path do not exists or the value is NULL, since both vill return NULL.

Does this make sense?

@kitlaan
Copy link
Contributor

kitlaan commented Feb 20, 2022

I'm thinking the early returns from #10468 are the issue when checking against c.Path (in processMetric and processObjects).

For example in processObjects()

result := gjson.GetBytes(input, c.Path)
if result.Type == gjson.Null {
if c.Optional {
// If path is marked as optional don't error if path doesn't return a result
p.Log.Debugf(GJSONPathNUllErrorMSG)
return nil, nil
}
return nil, fmt.Errorf(GJSONPathNUllErrorMSG)
}

1.21.3 would proceed to the next object if the path doesn't match, but 1.21.4 will bail out on the first non-match. Since processObjects allows returning multiple output metrics (from a single input metric), this early basically neuters the flow?

Should the code instead be:

result := gjson.GetBytes(input, c.Path)
if result.Type == gjson.Null {
    if c.Optional {
        // If path is marked as optional don't error if path doesn't return a result
        p.Log.Debugf(GJSONPathNUllErrorMSG)
        continue      // <<<<<<<<
    }

    return nil, fmt.Errorf(GJSONPathNUllErrorMSG)
}

Similarly in processMetric,

    result := gjson.GetBytes(input, c.Path)
    if result.Type == gjson.Null {
        continue      // <<<<<<<<
        // there's no c.Optional to check against here (yet)...
        // return nil, fmt.Errorf(GJSONPathNUllErrorMSG)
    }

I guess the alternative is to change to having multiple copies of [[inputs.mqtt_consumer.json_v2]], one for each respective unique set of properly-matching fields/tags/objects?

@kitlaan
Copy link
Contributor

kitlaan commented Feb 20, 2022

Note: I changed my file from

[[inputs.mqtt_consumer]]
    [[inputs.mqtt_consumer.json_v2]]
        [[inputs.mqtt_consumer.json_v2.object]]
        [[inputs.mqtt_consumer.json_v2.object]]

into each object having its own:

[[inputs.mqtt_consumer]]
    [[inputs.mqtt_consumer.json_v2]]
        [[inputs.mqtt_consumer.json_v2.object]]
    [[inputs.mqtt_consumer.json_v2]]
        [[inputs.mqtt_consumer.json_v2.object]]

This seems to be working, albeit the debug logs are now full of D! [parsers.json_v2::mqtt_consumer] GJSON Path returned null, either couldn't find value or path has null value. (Aside -- is the debug message even useful? If I were debugging, I probably wouldn't know what path was triggering.)

@kgrimsby
Copy link
Contributor Author

@kitlaan , I think processObjects works as intended, but in the same commit there has been added additional code which affects non-objects, hence the changes in processMetric.

It's important to stress that a metric tag (which is processed in processMetric) should not be excluded because it contains NULL.

I think the problem here is, we have no way of knowing if the path is doesn't exist or the value of the path is actually NULL. This is concerning, since the value very well could be NULL, which is a valid value for a tag, as far as I know.

So, can we distinguish between non existing path and NULL-values?

Also, GJSONPathNUllErrorMSG should not be thrown in processMetricsince it hasn't the optional flag. So either it should implement the optional-flag or remove the error at least.

But I would hope theres a way to differentiate non existing path and NULL.

Example of valid json with NULL

{ "valid": null, "ts": 1234, "value": 82 }

The same json buth non existing path:

{ "ts": 1234, "value": 82 }

So configured with valid as a tag, the first one shouldn't fail no matter what, but the second should fail. As of now both will fail.

@kitlaan
Copy link
Contributor

kitlaan commented Feb 20, 2022

My use case is slightly different than yours, but the end result is the same -- brokenness

Paging through the gjson docs+code... I think this should solve both our cases (this seem right, @sspaink ?)
Any place that #10468 added a check for result.Type == gjson.Null, it really should be !result.Exists().

I still think that the new code shouldn't have early-returned and instead continue the loop.

@kgrimsby
Copy link
Contributor Author

kgrimsby commented Feb 20, 2022 via email

@kitlaan
Copy link
Contributor

kitlaan commented Feb 20, 2022

(Sorry for the random train of thought. This plus random MQTT disconnects has been plaguing me this morning.)

On second thought, some of the places should still be Null checks, and others Exists().

if c.TimestampPath != "" {
result := gjson.GetBytes(input, c.TimestampPath)
if result.Type == gjson.Null {
p.Log.Debugf("Message: %s", input)
return nil, fmt.Errorf(GJSONPathNUllErrorMSG)

This is proper; a null TimestampPath is a problem.

if c.Path == "" {
return nil, fmt.Errorf("GJSON path is required")
}
result := gjson.GetBytes(input, c.Path)
if result.Type == gjson.Null {
return nil, fmt.Errorf(GJSONPathNUllErrorMSG)
}

If !Exists, continue the for loop. If you reach the end of processMetric() and len(metrics) == 0, then error?

result := gjson.GetBytes(input, c.Path)
if result.Type == gjson.Null {
if c.Optional {
// If path is marked as optional don't error if path doesn't return a result
p.Log.Debugf(GJSONPathNUllErrorMSG)
return nil, nil
}
return nil, fmt.Errorf(GJSONPathNUllErrorMSG)
}

If !Exists, continue the for loop. If you reach the end of processObjects() and len(t) == 0, then error?

r.result = gjson.GetBytes(scopedJSON, f.Path)
if r.result.Type == gjson.Null {
return nil, fmt.Errorf(GJSONPathNUllErrorMSG)
}

I'm assuming that Telegraf semantics are "if someone explicitly codes a field/tag", then they're expecting data. In which case, this is proper; can't do anything with a null.

r.result = gjson.GetBytes(scopedJSON, f.Path)
if r.result.Type == gjson.Null {
return nil, fmt.Errorf(GJSONPathNUllErrorMSG)

Ditto for this one.

Side thoughts

Alternatively for the len(...) == 0 checks, could the optional be placed at the Config level? e.g.

fields, err := p.processMetric(input, c.Fields, false, timestamp)
if err != nil {
return nil, err
}
tags, err := p.processMetric(input, c.Tags, true, timestamp)
if err != nil {
return nil, err
}
objects, err := p.processObjects(input, c.JSONObjects, timestamp)
if err != nil {
return nil, err
}

Instead of just checking err != nil, could in addition len(fields) == 0 && !c.Optional trigger the error?

Or maybe a check at the very end...

return metrics, nil

If len(metrics) == 0 && !p.Optional.

Honestly, I'm not sure if the default behavior for "not matching any paths" is to ignore errors (aka only just debug print), or to be explicit and make sure some output metric happens...

@kgrimsby
Copy link
Contributor Author

I ran a quick test just to check if Exists returns expected out put and it does.

 1package main
 2
 3import "fmt"
 4import "github.com/tidwall/gjson"
 5
 6var json []byte = []byte(`{ "var1": null, "var2": "test" }`)
 7
 8func main() {
 9        fmt.Println("Fetching json.")
10
11        fmt.Println(gjson.GetBytes(json, "var1").Exists())
12        fmt.Println(gjson.GetBytes(json, "var2").Exists())
13        fmt.Println(gjson.GetBytes(json, "var3").Exists())
14
15        fmt.Println(gjson.GetBytes(json, "var1").Type)
16        fmt.Println(gjson.GetBytes(json, "var2").Type)
17        fmt.Println(gjson.GetBytes(json, "var3").Type)
18}

Gives this output

Fetching json.
true
true
false
Null
String
Null

I can start a PR with the changes. As you're saying, timestamp path should not allow NULL, only metrics as I can see.

@shrx
Copy link

shrx commented Feb 26, 2022

This update completely broke my setup, I am now constantly getting GJSON Path returned null errors.

@kgrimsby
Copy link
Contributor Author

@shrx I can imagine. I’m working on a fix, but been down with covid the last week. Hope to resume the work on monday.

@sspaink
Copy link
Contributor

sspaink commented Mar 8, 2022

Thank you everyone for diving into this problem, I really appreciate all the effort to provide so much detail. It does seem using Exists() would have been the right solution! @kgrimsby I hope you are feeling better, have you already started working on the fix? If not I'd be happy to put up a PR as well for everyone to try. The next release is coming up (March 16th) and including a fix for this would be great.

@kgrimsby
Copy link
Contributor Author

kgrimsby commented Mar 9, 2022

@sspaink I haven't got around to a PR so you might as well put one up. I only played around with the code to confirm that Exists() does work as intended. And as you're saying it would solve all of the above problems.

Also, you have to add/move the optional config from object to the root config, since the code in root->tags also throws errors the same way as root->object does. Or alternatively add optional to each tag, but seems redundant instead of having it in root.

@sspaink
Copy link
Contributor

sspaink commented Mar 9, 2022

@kgrimsby @kitlaan @shrx I've created a PR to address this issue: #10799
If you have time, could you grab the artifacts posted by the Telegraf Tiger and see if this is working for you? Thank you!

@shrx
Copy link

shrx commented Mar 12, 2022

@kgrimsby @kitlaan @shrx I've created a PR to address this issue: #10799 If you have time, could you grab the artifacts posted by the Telegraf Tiger and see if this is working for you? Thank you!

Hi, I'm running arm64 build of telegraf, and looks like the artifact is not available for this arch. I'll wait for the released build, until then I reverted to 1.21.3.
Edit: nevermind, I had to expand the dropdown list to see all available builds. I'll try it now.

@shrx
Copy link

shrx commented Mar 12, 2022

@kgrimsby @kitlaan @shrx I've created a PR to address this issue: #10799 If you have time, could you grab the artifacts posted by the Telegraf Tiger and see if this is working for you? Thank you!

I tested the new version and while I don't get the "GJSON Path returned null" error anymore, I get a new ones instead:

Mar 12 11:10:56 raspberrypi telegraf[53452]: 2022-03-12T10:10:56Z E! [inputs.mqtt_consumer] Error in plugin: the path battery doesn't exist
Mar 12 11:10:58 raspberrypi telegraf[53452]: 2022-03-12T10:10:58Z E! [inputs.mqtt_consumer] Error in plugin: the path battery doesn't exist
Mar 12 11:12:30 raspberrypi telegraf[53452]: 2022-03-12T10:12:30Z E! [inputs.mqtt_consumer] Error in plugin: the path color_temp doesn't exist

The first two come from a tradfri lightbulb, which indeed doesn't have a battery field, and the last one is from a sonoff temperature sensor, which obviously doesn't have the color_temp field.

This is my relevant config:

[[inputs.mqtt_consumer]]
    servers = ["tcp://127.0.0.1:1883"]
    topics = [
      "zigbee2mqtt/lightbulb",
      "zigbee2mqtt/lightbulbRemote",
      "zigbee2mqtt/sonoffOutside",
      "zigbee2mqtt/sonoffInside"
    ]
    topic_tag = ""
    data_format = "json_v2"
    [[inputs.mqtt_consumer.json_v2]]
      measurement_name = "zigbee"
    [[inputs.mqtt_consumer.json_v2.object]]
      path = "@this"
      excluded_keys = ["action", "color_mode", "update", "update_available"]
    [[inputs.mqtt_consumer.json_v2.field]]
      path = "linkquality"
      type = "int"
    [[inputs.mqtt_consumer.json_v2.field]]
      path = "color_temp"
      type = "int"
    [[inputs.mqtt_consumer.json_v2.field]]
      path = "brightness"
      type = "int"
    [[inputs.mqtt_consumer.json_v2.field]]
      path = "battery"
      type = "int"
    [[inputs.mqtt_consumer.json_v2.field]]
      path = "temperature"
      type = "float"
    [[inputs.mqtt_consumer.json_v2.field]]
      path = "humidity"
      type = "float"
    [[inputs.mqtt_consumer.json_v2.field]]
      path = "voltage"
      type = "int"
    [[inputs.mqtt_consumer.topic_parsing]]
      topic = "+/+"
      tags = "_/device"

@shrx
Copy link

shrx commented Mar 12, 2022

Appears to be fixed when I add optional = true to all fields that are not common between all devices. But why introduce a backwards-incompatible default configuration? This would be unnecessary if you set all fields to be optional by default, which was the case before this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/json json and json_v2 parser/serialiser related bug unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants