Kapacitor record query does not return same points as line protocol query #1294
Comments
@hraftery Thanks for the detailed write up. I think the issue you are running into is the difference between fields and tags being returned from the query. Running Example:
I have confirmed that the above returns the fields and tags as expected using the simple
Output from the log line above:
{"name":"m","tmax":"2017-04-03T18:31:35.471555487Z","group":"t=x","tags":{"t":"x"},"points":[{"time":"2017-04-03T18:31:24.352794932Z","fields":{"a":1},"tags":{"t":"x"}},{"time":"2017-04-03T18:31:35.471555487Z","fields":{"a":3},"tags":{"t":"x"}}]}
@nathanielc thanks for the response. After quite some time battling with Kapacitor I was able to reproduce your results, which in particular do not contain the points with "b" fields. Do you agree that your results show that only half the data is returned? My steps to reproduce:
Then have a look in
Took me forever to figure out the TICKscript format for batch tasks. Getting the
So in the end I think we've just confirmed that the original list of issues applies to batch tasks as well as stream tasks?
Hmm, I can reproduce the issue using your script. I'll take a look into what is going on. Thanks!
How can I show all "batch points" in an alert handler, such as email?
Initial problem description is here, but that didn't elicit any response.
I've spent a couple of days on this issue now and have more to report. This issue is a big deal for me, because I'm using kapacitor to run a calculation algorithm on live data. I routinely make improvements to the algorithm and wish to run it on historical data before committing to the live stream. The bug described here makes that infuriatingly complex.
As described in the original post, kapacitor's record query does not return the same points as influx's queries. This makes constructing the queries an endless game of trial and error. Here's what I've discovered:
- `SELECT *` in influx returns all tags and fields, including time. In kapacitor it only returns the first field by alphabetical order!
- `SELECT field1, tag1` does not return the tag in kapacitor.
- `SELECT field1 ... GROUP BY *` does return the tags, but ordered by tag values, not by time. This makes replaying useless.
- `SELECT field1, field2` does not return field2 unless the point also has a field1 set.

Putting all this together, I've come up with a horrendous workaround for replaying historical data.
First consider this example measurement where a and b are fields and t is a tag:
I could find no way to replay this data to a kapacitor stream script. Instead, I found this:
That creates a new measurement where t is now a field, like so:
Then, in my kapacitor script I call a UDF which restores the tags:
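The original UDF is not reproduced here. As a rough illustration of the idea only, the tag-restoring step might look like the Python sketch below. It works on plain dictionaries shaped like the points in the log output earlier in this thread and does not use the Kapacitor UDF agent API; the function name `restore_tags` and its `tag_keys` argument are hypothetical.

```python
def restore_tags(point, tag_keys):
    """Return a copy of `point` with the named fields moved back into tags.

    `point` is assumed to be a dict with "fields" and "tags" keys, like the
    points in the Kapacitor log output. `tag_keys` lists the field names
    that were originally tags (hypothetical parameter).
    """
    fields = dict(point.get("fields", {}))
    tags = dict(point.get("tags", {}))
    for key in tag_keys:
        if key in fields:
            # Tag values are always strings in InfluxDB.
            tags[key] = str(fields.pop(key))
    return {**point, "fields": fields, "tags": tags}


# A point from the flattened measurement, where t was written as a field.
flat = {
    "time": "2017-04-03T18:31:24.352794932Z",
    "fields": {"a": 1, "t": "x"},
    "tags": {},
}
restored = restore_tags(flat, ["t"])
```

In this sketch `restored` carries `t` as a tag again and leaves `a` as the only field, which is the shape a stream task expecting the original measurement would need.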
And finally, I can record and replay:
I have to repeat this for every two-day period because the API kapacitor uses is limited to 10000 rows. Surely this is not intended functionality?
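For anyone stuck repeating this by hand, splitting a longer range into two-day chunks can be scripted. The sketch below is only an assumption about how one might do it: it generates one query string per window, and the measurement name `"mydb"."autogen"."m_flat"` and the dates are made up for illustration. Whether two days stays under the row limit depends on your data rate.

```python
from datetime import datetime, timedelta, timezone


def time_windows(start, end, step=timedelta(days=2)):
    """Yield consecutive half-open (lo, hi) windows covering start..end."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def window_query(base_query, lo, hi):
    """Append a time-range WHERE clause (RFC3339 timestamps) to base_query."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (
        f"{base_query} WHERE time >= '{lo.strftime(fmt)}' "
        f"AND time < '{hi.strftime(fmt)}'"
    )


# Hypothetical measurement name and date range.
base = 'SELECT * FROM "mydb"."autogen"."m_flat"'
start = datetime(2017, 4, 1, tzinfo=timezone.utc)
end = datetime(2017, 4, 6, tzinfo=timezone.utc)

queries = [window_query(base, lo, hi) for lo, hi in time_windows(start, end)]
for q in queries:
    print(q)
```

Each generated query can then be passed to a separate record step and the recordings replayed in order. The last window is shortened so it never runs past the end of the range.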