
Other ways to work with TICK Scripts? #259

Closed
gargooza opened this issue Feb 25, 2016 · 13 comments

@gargooza

Is there another way to represent TICK Scripts? A JSON format? Some kind of TOML file?

Some context…
I’m building a web app for people at my company to do two basic things:

  1. Create custom inserters for InfluxDB. We generate data on the fly and users choose what they store in an Influx database.
  2. Generate alerts based on criteria for the data they store in InfluxDB.

Part 2 uses Kapacitor. The problem is that users would need to write their own TICK Scripts. That's an insurmountable hurdle: TICK Script is a DSL, debugging it can be challenging, and I can't expect my non-programmer colleagues to do this on their own.

I’m creating a tool that will guide users through the process dynamically. I’ve got a working prototype and the online tool generates some JSON. I feed that JSON into a json2tick converter I wrote. Using the (familiar) CPU tutorial example, consider an alert that writes to a log when the CPU idle usage falls below a threshold of 90. I can represent that as a JSON file:

{
  "sources": [
    {
      "stream": {
        "db": "davedb",
        "rp": "default",
        "measurement": "cpu",
        "aggregation": "median",
        "field": "usage_idle",
        "period": "20s",
        "every": "10s"
      }
    },
    {
      "constant": "90"
    }
  ],
  "conditions": {
    "crit": {
      "LHS": 0,
      "comparison": "<",
      "RHS": 1
    },
    "logfile": "/Users/djohnson/kapacitor/high_cpu.log"
  }
}

My json2tick converter parses this and spits out this TICK Script:

var stream1 = stream.from()
        .database('davedb')
        .retentionPolicy('default')
        .measurement('cpu')
        .window()
            .period(20s)
            .every(10s)
        .mapReduce(influxql.median('usage_idle'))
        .alert()
            .crit(lambda: "median" < 90)
            .log('/Users/djohnson/kapacitor/high_cpu.log')

My json2tick utility has a lot of work to do to support all of the features of TICK Script. But I'll get there… unless… you plan on supporting something like this?
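For concreteness, the core of such a converter can be sketched in a few lines of Python. This is a hypothetical reimplementation hard-wired to the JSON shape above, not the actual converter described here:

```python
# Hypothetical sketch of a json2tick converter for the JSON schema shown
# above; the real converter described in this thread is not public.
def json2tick(doc):
    s = doc["sources"][0]["stream"]
    crit = doc["conditions"]["crit"]
    # The crit RHS is an index into "sources"; here it points at the constant.
    threshold = doc["sources"][crit["RHS"]]["constant"]
    return (
        "var stream1 = stream.from()\n"
        f"        .database('{s['db']}')\n"
        f"        .retentionPolicy('{s['rp']}')\n"
        f"        .measurement('{s['measurement']}')\n"
        "        .window()\n"
        f"            .period({s['period']})\n"
        f"            .every({s['every']})\n"
        f"        .mapReduce(influxql.{s['aggregation']}('{s['field']}'))\n"
        "        .alert()\n"
        f"            .crit(lambda: \"{s['aggregation']}\" {crit['comparison']} {threshold})\n"
        f"            .log('{doc['conditions']['logfile']}')"
    )
```

A real converter would need to dispatch on the source types and validate the LHS/RHS indices rather than assuming this fixed layout.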

JSON -> TICK is one thing, but TICK -> JSON is quite another thing. And since kapacitor show <task_name> gives me the TICK Script for the task as Kapacitor sees it, I'd prefer to work with that and not keep track of my own separate JSON config files.

Have others expressed interest in something like I have described? Are there plans on your end w.r.t. providing other ways to work with TICK Scripts?

@nathanielc
Contributor

At the end of the day, a TICKscript gets parsed and structured into a DAG representing the data pipeline; so it would be possible to have other means of defining tasks beyond just TICKscript. But I think it would not be worth the trouble of maintaining different languages. TICKscript and the Kapacitor workflow can be made to fit your needs without having to create a new language definition. Let me explain...

When exposing some of Kapacitor's high-level features, like alerting, there is commonly an already understood structure in place, i.e. the data schema is known, as is the kind of transformation to be performed on the data. As such, I feel it is possible to define a template TICKscript for a given higher-level use case and then just fill in the details.

Using your above example, you can define variables for each component you want to have control over.

var db = 'davedb'
var rp = 'default'
var measurement = 'cpu'
// skipping aggregation since it is the tricky one
var field = 'usage_idle'
var period = 20s
var every = 10s
var threshold = 90
var logfile = '/Users/djohnson/kapacitor/high_cpu.log'

var stream1 = stream.from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .window()
            .period(period)
            .every(every)
        .mapReduce(influxql.median(field))
        .alert()
            .crit(lambda: "median" < threshold)
            .log(logfile)

Just like that we have parameterized almost all of the features without having to do much work. Just change the values of the vars and define the task. This way you do not have to expose the TICKscript to your end user, you can define templates for each use case and fill in the vars as needed.
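The fill-in step can be plain string generation; a minimal Python sketch, with hypothetical helper names, where each value is supplied as a pre-formatted TICKscript literal:

```python
# Minimal sketch of the "fill in the vars" step described above.
# Each value is a pre-formatted TICKscript literal (e.g. "'davedb'",
# "20s", "90"), so no type-specific quoting logic is needed here.
def var_prelude(values):
    return "\n".join(f"var {name} = {lit}" for name, lit in values.items())

def render_task(template_body, values):
    # Prepend the var assignments to a fixed template TICKscript body.
    return var_prelude(values) + "\n\n" + template_body
```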

There are two shortcomings in this example:

  1. The aggregation method is not dynamic.
  2. The lambda expression for the alert is always a simple threshold alert.

I have a few ideas for how to solve each one.

First, you could just template your TICKscript and generate a new one for each supported aggregation method. That is the easy solution, but means you have to implement it outside of Kapacitor. To implement it natively within Kapacitor will require more thought. More on that in a bit.

Second, currently vars cannot be lambda expressions, but that would be an easy, intuitive change, i.e.:

var field = "median"
var threshold = 90
var condition = lambda: field > threshold AND field != 0

...
.alert()
   .crit(condition)

Since the expression language is common and familiar, you could expose it directly to an end user, or construct expressions via a UI, since it's a well-defined operator-precedence language. Something like this would work reasonably well.
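For the simple-threshold case, a UI could assemble the expression string from structured inputs instead of exposing raw lambda syntax; a hypothetical sketch:

```python
# Hypothetical sketch: build an alert expression string from structured
# UI inputs rather than asking end users to write lambda syntax.
ALLOWED_OPS = {"<", "<=", ">", ">=", "==", "!="}

def threshold_expr(field, op, value):
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported comparison operator: {op}")
    return f'lambda: "{field}" {op} {value}'
```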

Coming back to the first problem of finding a native solution for dynamic pieces of the task.
One bad idea would be to define vars as partial trees, something like:

var agg = .mapReduce(influxql.median(field))

var stream1 = stream.from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .window()
            .period(period)
            .every(every)
        @agg
        .alert()
          ...

I don't like this idea because it can quickly get really complicated and hard to read. And you would still have to construct the value for the var which really leaves you back at the beginning.

I am open to suggestions here; maybe we implement templating in Kapacitor. That would make defining vars and pieces of the pipeline easy to do. Not quite sure what the right way forward is here; again, ideas welcome.

My reasoning for wanting to keep a pure-TICKscript solution is so that there is just one interface to Kapacitor. Otherwise you are always running into issues with which features work in which interfaces. That being said, we want to make that interface robust enough that it is easy to abstract and extend. One of the reasons we decided to use a DSL in the first place is that it is flexible. We can make the DSL do what we need it to.

@pauldix I would be interested in your thoughts on this.

@nathanielc
Contributor

@gargooza Any thoughts on this?

@gargooza
Author

gargooza commented Mar 1, 2016

Thanks a ton for the super thoughtful response. I think implementing templating makes a lot of sense and I'd be interested in seeing that implemented. I have users who will be creating and later editing alerts with a web form. I need to be able to do a two way conversion:
web form -> TICKscript
and then
TICKscript -> web form

Your suggestion to prepend a list of var = ... assignments to a TICKscript and then fill in the appropriate values helps me in the first direction. It does not, however, help me in the other direction unless I write my own parser to extract those variables from the variable list. And if I'm going to do that, I'm sure others would want to do it too. At some point it seems like Kapacitor might just want to support templating. Using the example from this thread, we could template something like...

var stream1 = stream.from()
        .database('davedb')
        .retentionPolicy('default')
        .measurement('cpu')
        .window()
            .period(20s)
            .every(10s)
        .mapReduce(influxql.median('usage_idle'))
        .alert()
            .crit(lambda: "median" < 90)
            .log('/Users/djohnson/kapacitor/high_cpu.log')

With a mustache/handlebars-style template like this:

var stream1 = stream.from()
        .database({{DB_NAME}})
        .retentionPolicy({{RP_NAME}})
        .measurement({{MEASUREMENT_NAME}})
        .window()
            .period({{PERIOD}})
            .every({{FREQUENCY}})
        .mapReduce(influxql.{{AGGREGATOR}}({{FIELD}}))
        .alert()
            .crit(lambda: {{CRIT_EXPR}})
            .log({{LOG_FILE}})

Which could be combined with a JSON object of key/value pairs like this:

{
  "DB_NAME": "'davedb'",
  "RP_NAME": "'default'",
  "MEASUREMENT_NAME": "'cpu'",
  "PERIOD": "20s",
  "FREQUENCY": "10s",
  "AGGREGATOR": "median",
  "FIELD": "'usage_idle'",
  "CRIT_EXPR": "\"median\" < 90",
  "LOG_FILE": "'/Users/djohnson/kapacitor/high_cpu.log'"
}

At this point it's trivial to use the template and the JSON object to generate the TICKscript. And it's almost trivial (or at least relatively easy?) to use the generated TICKscript and the template to generate the JSON object.
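Both directions are mechanical under this scheme; a sketch, assuming simple non-repeated {{NAME}} placeholders separated by literal text:

```python
import re

# Sketch of both directions: fill a {{NAME}}-style template from a dict,
# and recover the dict from a rendered script by compiling the template
# itself into a regex with named capture groups.
def render(template, values):
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], template)

def extract(template, rendered):
    # Escape the literal text, then turn each escaped placeholder
    # (\{\{NAME\}\}) into a named capture group.
    pattern = re.sub(
        r"\\\{\\\{(\w+)\\\}\\\}",
        lambda m: f"(?P<{m.group(1)}>.+?)",
        re.escape(template),
    )
    match = re.fullmatch(pattern, rendered, re.DOTALL)
    return match.groupdict() if match else None
```

The extraction direction relies on the template being unambiguous; two placeholders with no literal text between them would not round-trip cleanly.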

@nathanielc
Contributor

Help me understand the need for the reverse conversion, tick -> web form.

My thoughts would be to use templating like you say and then have a pattern like

web form -> json -> database
json -> tick
database -> json -> web form

This way you are not trying to parse TICKscript or anything, but you can always repopulate the web form because you saved the intermediate JSON values.

In other words, don't use the TICKscript as the source of authority for the user data.

@nathanielc
Contributor

My proposal is to make Kapacitor the database in the previous example; this way TICKscripts are static but filled in at runtime. Each task could be defined via a TICKscript and an optional set of parameters.

@gargooza
Author

gargooza commented Mar 2, 2016

Exactly -- I would love to see Kapacitor act as the database in the previous example. Would that database also hold the optional set of parameters? If so then I won't have to stand up my own database just to store those parameters.

@nathanielc
Contributor

nathanielc commented May 17, 2016

I am starting work on adding template tasks. Here is my current plan:

Template tasks will have a set of typed template variables defined in a template TICKscript; more on that in a sec. These template tasks will have basic CRUD operations. A normal task can be created from a template task by specifying the set of values for the template vars. The template vars will be typed so it's clear what valid values are and how they can be set.

List of available types for template vars:

  • boolean
  • int
  • float
  • string
  • duration
  • regex
  • lambda expression

To define a template task I think adding a new syntax to the TICKscript will be clear and simple.

// Define a template variable named `threshold` of type int, with a default value 42
tvar threshold = 42

// Define a template variable named `crit` of type lambda expression, with default value lambda: "value" < threshold
tvar crit = lambda: "value" < threshold

// Define a template variable named `message` of type string that has no default, aka required
tvar message string

// Define a normal variable that cannot be set via a template.
var post_url = 'http://myinternal.alert.handler.com/'

stream
   |from()
      .measurement('x')
   |alert()
     .message(message)
     .crit(crit)
     .post(post_url)

Open Questions:

  • Should we support the lambda expression var type? It feels useful and powerful, but since the purpose of template tasks is to expose a simpler interface to an end user, it may only add confusion. The end user doesn't have the necessary context to know how to write a valid lambda expression, i.e. which fields are in scope, etc. The context problem could possibly be solved by exposing the comments for a given tvar.
  • So far TICKscript vars have essentially had only immutable use cases, even though the language does allow them to be changed. I think it would be valuable to make all vars immutable: it simplifies the implementation and reasoning about the scripts, specifically around lambda expressions. Since all referenced vars would be immutable, there is no question about a referenced var's value at declaration time vs. evaluation time.

@yosiat
Contributor

yosiat commented May 18, 2016

@nathanielc Template tasks sounds great!

But, to answer the need for lambda expressions: you must first set the limits of template tasks - how generic do you want them to be?
In my use case, if template tasks support lambda expressions I will have only 2 to 10 templates at most.

By the way, what will be the API for this? A POST with JSON of the template values, for example:

{
  "threshold": 234,
  "crit": "\"value\" < threshold",
  "message": "My message."
}

@nathanielc
Contributor

@yosiat Yes, I think we want the template tasks to be as generic as possible, with the exception of being able to change the DAG pipeline itself.

Yep, that JSON looks about what I had in mind.

Also I think we can forget the var vs tvar distinction and just make it so all vars are template vars, keep it simple.

@thom-nic

Is anyone from the InfluxData team collaborating with the Grafana team? I found this issue because I want to use Kapacitor for user-defined (not dev/ops-defined) alerting on data that I'm displaying from InfluxDB in Grafana. The Grafana team is working on an alerting frontend (see grafana/grafana#2209). I don't know if they plan to have a single backend or if it will be pluggable, but if the goal here (or one of the goals) is the ability to create Kapacitor alerts from a GUI, it would be pretty awesome to use Kapacitor as the alerting backend for Grafana.

@yosiat
Contributor

yosiat commented May 20, 2016

@nathanielc By the way, can I compose lambda expressions? For example, accept something like this:

{
  "a": "\"value\" > 10",
  "b": "\"value\" > 20"
}

and then in the tick script, do something like this: crit(lambda: a OR b)
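Textually, that composition could be done before the task is defined; a hypothetical sketch (not a Kapacitor feature at the time of this thread):

```python
import re

# Hypothetical sketch: splice named sub-expressions into a combining
# expression before handing the result to .crit().
def compose(exprs, combined):
    out = combined
    for name, expr in exprs.items():
        # Parenthesize each sub-expression to preserve operator precedence.
        out = re.sub(rf"\b{name}\b", lambda m, e=expr: f"({e})", out)
    return f"lambda: {out}"
```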

@nathanielc
Contributor

@yosiat Yes, I expect that to work. How would you want the state to work? One shared state for the merged expression, or three separate states for a, b, and the combined expression? I think three separate states is cleaner and easier to reason about.

For example:

{
  "a": "count() > 10",
  "b": "count() > 20"
}

You would be confused if the count functions in a and b affected one another.

@nathanielc
Contributor

This has been implemented in #577
