Kapacitor record query does not return same points as line protocol query #1294

Closed
hraftery opened this issue Apr 2, 2017 · 5 comments · Fixed by #1320

@hraftery

hraftery commented Apr 2, 2017

Initial problem description is here, but that didn't elicit any response.

I've spent a couple of days on this issue now and have more to report. This issue is a big deal for me, because I'm using kapacitor to run a calculation algorithm on live data. I routinely make improvements to the algorithm and wish to run it on historical data before committing to the live stream. The bug described here makes that infuriatingly complex.

As described in the original post, kapacitor's record query does not return the same points as influx's queries. This makes constructing the queries an endless game of trial and error. Here's what I've discovered:

  • SELECT * in influx returns all tags and fields, including time. In kapacitor it only returns the first field by alphabetical order!
  • SELECT field1, tag1 does not return the tag in kapacitor.
  • SELECT field1 ... GROUP BY * does return the tags, but ordered by tag values, not by time. This makes replaying useless.
  • SELECT field1, field2 does not return field2 unless the point also has a field1 set.

Putting all this together, I've come up with a horrendous workaround for replaying historical data.

First consider this example measurement where a and b are fields and t is a tag:

name: m

time a b t
---- - - -
1    1   x
2      2 y
3    3   x
4      4 y

I could find no way to replay this data to a kapacitor stream script. Instead, I found this:

influx
> SELECT a,t INTO "m_flat" FROM m WHERE t='x'
> SELECT b AS a, t  INTO "m_flat" FROM m WHERE t='y'

That creates a new measurement where t is now a field, like so:

name: m_flat

time a t
---- - -
1    1 x
2    2 y
3    3 x
4    4 y

Then, in my kapacitor script I call a UDF which restores the tags:

  if point.name == "m_flat": #then unflatten
    point.tags['t'] = point.fieldsString['t']
    point.fieldsDouble['b'] = point.fieldsDouble['a']
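
For readers who want to try the unflatten step outside a running Kapacitor UDF, here is a minimal, self-contained sketch. The `Point` class is a hypothetical stand-in that only mimics the `name`, `tags`, `fieldsString` and `fieldsDouble` attributes used in the fragment above; it is not the real UDF message type. The `unflatten` function name is also an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Point:
    """Hypothetical stand-in for the point object seen by the UDF."""
    name: str
    tags: Dict[str, str] = field(default_factory=dict)
    fieldsString: Dict[str, str] = field(default_factory=dict)
    fieldsDouble: Dict[str, float] = field(default_factory=dict)


def unflatten(point: Point) -> Point:
    """Mirror the fragment above: copy field 't' back into the tag set
    and copy field 'a' into field 'b' for points from m_flat."""
    if point.name == "m_flat":
        point.tags["t"] = point.fieldsString["t"]
        point.fieldsDouble["b"] = point.fieldsDouble["a"]
    return point


if __name__ == "__main__":
    p = unflatten(Point("m_flat", fieldsString={"t": "y"}, fieldsDouble={"a": 2.0}))
    print(p.tags, p.fieldsDouble)
```

Like the original fragment, this leaves field `a` in place and sets `b` on every point, so the downstream calculation would still need to pick the right field based on the restored tag.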

And finally, I can record and replay:

RID=$(kapacitor record query -query $'SELECT a,t FROM "db"."autogen"."m_flat" WHERE time > \'2017-02-27T16:30:00Z\' AND time < \'2017-03-01T16:30:00Z\' ' -type stream)
kapacitor replay -task calc_history -recording $RID -rec-time

I have to repeat this for every two-day time period because the API kapacitor uses is limited to 10000 rows. Surely this is not intended functionality?
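
The per-window repetition could be scripted rather than done by hand. The sketch below only generates the query strings for consecutive two-day windows; the function names `two_day_windows` and `record_query` are hypothetical, and the database, retention policy and measurement names are taken from the command above. It deliberately uses `time >=` on the lower bound (the original command uses `time >`) so that a point falling exactly on a window boundary is not dropped; that is a choice made here, not something confirmed in this thread.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def two_day_windows(start: datetime, end: datetime,
                    step: timedelta = timedelta(days=2)
                    ) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive [lo, hi) windows covering start..end."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def record_query(lo: datetime, hi: datetime) -> str:
    """Build the InfluxQL text for one window (half-open interval)."""
    return ('SELECT a,t FROM "db"."autogen"."m_flat" '
            f"WHERE time >= '{lo.strftime(TIME_FORMAT)}' "
            f"AND time < '{hi.strftime(TIME_FORMAT)}'")


if __name__ == "__main__":
    start = datetime(2017, 2, 27, 16, 30, tzinfo=timezone.utc)
    end = datetime(2017, 3, 4, 16, 30, tzinfo=timezone.utc)
    for lo, hi in two_day_windows(start, end):
        print(record_query(lo, hi))
```

Each printed query could then be passed to `kapacitor record query -query ... -type stream`, followed by a `kapacitor replay` of the resulting recording, as in the two commands above.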

nhaugo added the question label Apr 3, 2017
@nathanielc
Contributor

@hraftery Thanks for the detailed write up.

I think the issue you are running into is the difference between fields and tags being returned from the query. Running SELECT * returns all the fields, running SELECT field1, tag1 returns the field and tag as fields. Basically anything in the SELECT [list of things here] clause of the query is returned as a field. To get all the tags of a query one must use the GROUP BY * syntax.

Example:

batch
   // This returns all fields and all tags correctly.
    |query('SELECT * FROM ...')
        .groupBy(*)
   |log()

I have confirmed that the above returns the fields and tags as expected using the simple a,b,t data set above.

Output from the log line above:

{"name":"m","tmax":"2017-04-03T18:31:35.471555487Z","group":"t=x","tags":{"t":"x"},"points":[{"time":"2017-04-03T18:31:24.352794932Z","fields":{"a":1},"tags":{"t":"x"}},{"time":"2017-04-03T18:31:35.471555487Z","fields":{"a":3},"tags":{"t":"x"}}]}

@hraftery
Author

hraftery commented Apr 9, 2017

@nathanielc thanks for the response. After quite some time battling with Kapacitor I was able to reproduce your results, which in particular do not contain the points with "b" fields. Do you agree that your results show that only half the data is returned?

My steps to reproduce:

$ influx -execute "create database d"
$ influx -execute "INSERT m,t=x a=1" -database d
$ influx -execute "INSERT m,t=y b=2" -database d
$ influx -execute "INSERT m,t=x a=3" -database d
$ influx -execute "INSERT m,t=y b=4" -database d
$ cat batch_all.tick 
batch
    |query('SELECT * FROM "d"."autogen".m')
        .groupBy(*)
        .period(1s)
        .every(1s)
    |log()

$ kapacitor define batch_all -type batch -tick batch_all.tick -dbrp d.autogen
$ kapacitor replay-live batch -task batch_all -rec-time -past 30m

Then have a look in /var/log/kapacitor/kapacitor.log to find:

[task_master:0de2417b-2fd9-468f-9749-b7b05357a820] 2017/04/09 13:33:48 I! opened
[task_master:0de2417b-2fd9-468f-9749-b7b05357a820] 2017/04/09 13:33:48 I! Started task: batch_all
[httpd] ::1 - - [09/Apr/2017:13:33:52 +1000] "GET /kapacitor/v1/replays/0de2417b-2fd9-468f-9749-b7b05357a820 HTTP/1.1" 202 238 "-" "KapacitorClient" 59ef7366-1cd5-11e7-8733-000000000000 324
<SNIP "GET" line repeated heaps of times>
[batch_all:log2] 2017/04/09 13:34:01 I!  {"name":"m","tmax":"2017-04-08T23:48:54.020542218Z","group":"t=x","tags":{"t":"x"},"points":[{"time":"2017-04-08T23:48:53.449898839Z","fields":{"a":1},"tags":{"t":"x"}}]}
[batch_all:log2] 2017/04/09 13:34:01 I!  {"name":"m","tmax":"2017-04-08T23:48:59.020542218Z","group":"t=y","tags":{"t":"y"}}
[batch_all:log2] 2017/04/09 13:34:01 I!  {"name":"m","tmax":"2017-04-08T23:49:04.020542218Z","group":"t=x","tags":{"t":"x"},"points":[{"time":"2017-04-08T23:49:03.658124222Z","fields":{"a":3},"tags":{"t":"x"}}]}
[batch_all:log2] 2017/04/09 13:34:01 I!  {"name":"m","tmax":"2017-04-08T23:49:09.020542218Z","group":"t=y","tags":{"t":"y"}}
[httpd] ::1 - - [09/Apr/2017:13:34:01 +1000] "GET /kapacitor/v1/replays/0de2417b-2fd9-468f-9749-b7b05357a820 HTTP/1.1" 202 238 "-" "KapacitorClient" 5fa2c74f-1cd5-11e7-8747-000000000000 401
<SNIP "GET" line repeated heaps of times>
[task_master:0de2417b-2fd9-468f-9749-b7b05357a820] 2017/04/09 13:34:24 I! Stopped task: batch_all
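
To check mechanically which fields each logged batch carries, the JSON payload of the `log2` lines can be parsed and tallied. This is a rough sketch: `field_counts` is a hypothetical helper, and it assumes each log line contains exactly one JSON object starting at the first `{`, which holds for the lines shown above. The two sample lines are copied from that log.

```python
import json
from collections import Counter
from typing import Dict, Iterable

LOG_LINES = [
    '[batch_all:log2] 2017/04/09 13:34:01 I!  {"name":"m","tmax":"2017-04-08T23:48:54.020542218Z","group":"t=x","tags":{"t":"x"},"points":[{"time":"2017-04-08T23:48:53.449898839Z","fields":{"a":1},"tags":{"t":"x"}}]}',
    '[batch_all:log2] 2017/04/09 13:34:01 I!  {"name":"m","tmax":"2017-04-08T23:48:59.020542218Z","group":"t=y","tags":{"t":"y"}}',
]


def field_counts(lines: Iterable[str]) -> Dict[str, Counter]:
    """Count, per group, how many points carry each field name."""
    counts: Dict[str, Counter] = {}
    for line in lines:
        start = line.find("{")
        if start < 0:
            continue  # not a batch log line (e.g. an httpd access line)
        batch = json.loads(line[start:])
        per_group = counts.setdefault(batch["group"], Counter())
        # A batch with no points has no "points" key at all.
        for point in batch.get("points", []):
            per_group.update(point["fields"].keys())
    return counts


if __name__ == "__main__":
    for group, fields in field_counts(LOG_LINES).items():
        print(group, dict(fields))
```

On these two lines the `t=x` group reports one point with field `a`, while the `t=y` group reports no points at all, which matches the missing `b` values described in this comment.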

Took me forever to figure out the tick script format for batch tasks. Getting the SELECT slightly wrong or not having the period and every statements right will cause it to fail in mysterious ways.

So in the end, I think we've just confirmed that the original list of issues applies to batch tasks as well as stream tasks?

@nathanielc
Contributor

Hmm, I can reproduce the issue using your script. I'll take a look into what is going on. Thanks!

@nathanielc
Contributor

nathanielc commented Apr 12, 2017

@hraftery Well that was a simpler fix than I expected for such an oddly behaving bug. See #1320.

Please let me know if other weird behavior still exists.

@crazy-canux

How can I show all of the "batch points" in an alert handler, such as email?
