Resetting window over time #727

Closed
dan505512 opened this issue Jul 20, 2016 · 7 comments

Comments

@dan505512

Hi,
I'm using Kapacitor to alert when the rate of an increasing counter changes over time.
For example: the number of server disconnections in the last 10 minutes.

var windows = stream
    |from()
        .measurement('DisconnectionsCounter')
    |where(lambda: "server_name" == 'TestServer' AND "path" =~ /^(.*)\.DisconnectionsCounter$/)
    |window()
        .period(10m)
        .every(10s)
        .align()

var previous = windows
    |first('value')

var current = windows
    |last('value')

previous
    |join(current)
        .as('previous', 'current')
    |eval(lambda: "current.last" - "previous.first")
        .as('rate')
    |alert()
        .id('DisconnectionsCounterOnTestServer')
        .info(lambda: "rate" == 0 OR "rate" > 0)
        .crit(lambda: "rate" > 0)
        .post('http://MyTestServer:8323')

Now, if I reset the value once a day, the rate becomes negative, because the last value before the reset is higher than the value after the reset (zero).
So for the 10-minute period of the window I have no way to tell whether there were any disconnections.
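
To make the arithmetic concrete, here is a small Python simulation (not TICKscript) of what `last('value') - first('value')` yields when a reset falls inside the window. The sample values and the helper name `first_last_rate` are made up for illustration.

```python
# Hypothetical counter samples from one 10-minute window. The counter is
# reset to zero partway through (illustrative values, not real data).
window = [480, 485, 490, 0, 3, 7]


def first_last_rate(values):
    """Mimic last('value') - first('value') over a single window."""
    return values[-1] - values[0]


print(first_last_rate([480, 485, 490]))  # 10, no reset in the window
print(first_last_rate(window))           # -473, reset inside the window
```

The negative result persists for as long as the window still contains a point from before the reset, which is up to the full 10-minute period.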

The optimal solution for me would be to somehow reset the window when I reset my disconnections counter, or to apply the following logic:

previous
    |join(current, min)
        .as('previous', 'current', 'min_value')
    |eval(lambda:"current.last" > "previous.first" ? windows|min())
        .as('current_value')
    |eval(lambda: "current.last" - "current_value")
        .as('rate')
    |alert()
        .id('DisconnectionsCounterOnTestServer')
        .info(lambda: "rate" == 0 OR "rate" > 0)
        .crit(lambda: "rate" > 0)
        .post('http://MyTestServer:8323')

As you can see, I need some if logic in eval. Is that possible? Or is there a workaround I can use to achieve this behavior?

@nathanielc
Contributor

Use derivative with the nonNegative flag so that it skips negative results. This computes the difference between each pair of consecutive points, skipping the one pair where the rate is negative. You can then sum or average the rate across the whole batch to get back the rate for the whole window. It's not exactly the same, but it should work well.
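
As a rough illustration of that idea, the Python sketch below (a simulation, not Kapacitor code) sums consecutive differences and skips the negative one caused by the reset. It deliberately ignores the time scaling that a real derivative applies, and the sample values and function name are hypothetical.

```python
def non_negative_diff_sum(values):
    """Sum consecutive differences, skipping any negative difference."""
    total = 0
    for prev, cur in zip(values, values[1:]):
        diff = cur - prev
        if diff >= 0:  # a negative diff marks the counter reset; skip it
            total += diff
    return total


# Illustrative samples with a reset to zero in the middle.
window = [480, 485, 490, 0, 3, 7]
print(non_negative_diff_sum(window))  # 17 (5 + 5 + 3 + 4)
```

Any increments that happened between the last sample before the reset and the reset itself are not counted, which is one reason the result is not exactly the same as a true window rate.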

@nathanielc
Contributor

For example:

var windows = stream
    |from()
        .measurement('DisconnectionsCounter')
    |where(lambda: "server_name" == 'TestServer' AND "path" =~ /^(.*)\.DisconnectionsCounter$/)
    |window()
        .period(10m)
        .every(10s)
        .align()
    |derivative('value')
        .as('rate')
        .nonNegative()
    |sum('rate')
        .as('rate')
    |alert()
        .id('DisconnectionsCounterOnTestServer')
        .info(lambda: "rate" >= 0)
        .crit(lambda: "rate" > 0)
        .post('http://MyTestServer:8323')

@dan505512
Author

This might work for disconnection counters, where the value needed is zero or more.
I tried this on my DB insertion values but was unable to tell the difference between 10 insertions and 100 insertions. If I need the number of insertions in 10 seconds to be over 100, and I reset the counter once a day, I still run into the same problem.

@nathanielc
Contributor

The derivative also has a unit property that lets you scale the rate by some unit of time, in your case 10s. Can you share your TICKscript for the DB insertion values? It seems like we should be able to get this working.
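
A minimal Python sketch of how such a unit scales the result, assuming the derivative is computed as the change in value divided by the elapsed time expressed in units (similar to the InfluxQL derivative). The function and parameter names are hypothetical and the numbers are illustrative.

```python
def derivative(prev_value, cur_value, elapsed_s, unit_s=1.0):
    """Change in value per `unit_s` seconds between two points."""
    return (cur_value - prev_value) / (elapsed_s / unit_s)


# 150 insertions between two points that are 10 seconds apart.
per_second = derivative(1000, 1150, elapsed_s=10)
per_ten_seconds = derivative(1000, 1150, elapsed_s=10, unit_s=10)

print(per_second)       # 15.0
print(per_ten_seconds)  # 150.0
```

With a 1-second unit the same traffic reads as 15, so a threshold of 100 is never crossed. With a 10-second unit it reads as 150.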

@dan505512
Author

var windows = stream
    |from()
        .measurement('InsertionsCounter')
    |where(lambda: "server_name" == 'TestServer' AND "path" =~ /^(.*)\.InsertionsCounter$/)
    |window()
        .period(10m)
        .every(10s)
        .align()
    |derivative('value')
        .as('rate')
        .nonNegative()
    |sum('rate')
        .as('rate')
    |alert()
        .id('DisconnectionsCounterOnTestServer')
        .info(lambda: "rate" >= 100)
        .crit(lambda: "rate" > 100)
        .post('http://MyTestServer:8323')

It's pretty close to the disconnection counter. The only difference is that I need it to be more than 100, not just something or nothing.

@nathanielc
Contributor

If you add unit(10s) to the derivative, does that bring it over 100?

var windows = stream
    |from()
        .measurement('InsertionsCounter')
    |where(lambda: "server_name" == 'TestServer' AND "path" =~ /^(.*)\.InsertionsCounter$/)
    |window()
        .period(10m)
        .every(10s)
        .align()
    |derivative('value')
        .as('rate')
        .nonNegative()
        .unit(10s)
    |sum('rate')
        .as('rate')
    |alert()
        .id('DisconnectionsCounterOnTestServer')
        .info(lambda: "rate" >= 100)
        .crit(lambda: "rate" > 100)
        .post('http://MyTestServer:8323')

I am going to test this out locally to make sure I understand it correctly.

@nathanielc
Contributor

nathanielc commented Jul 25, 2016

Playing around with the derivative, I see that it doesn't quite work, since you don't know the elapsed time of the window. There are ways to get the elapsed time, but it gets overly complicated fast.

But I had another, simpler idea: what about always using the min of the window? If no reset has occurred, the min is the first point. If a reset occurred somewhere in the window, the calculation starts from there. You lose the part of the window before the reset, but since you suggested that in your example, it seems that would be OK with you.

var windows = stream
    |from()
        .measurement('DisconnectionsCounter')
    |where(lambda: "server_name" == 'TestServer' AND "path" =~ /^(.*)\.DisconnectionsCounter$/)
    |window()
        .period(10m)
        .every(10s)
        .align()

var previous = windows
    |min('value')

var current = windows
    |last('value')

previous
    |join(current)
        .as('previous', 'current')
    |eval(lambda: "current.last" - "previous.min")
        .as('rate')
    |alert()
        .id('DisconnectionsCounterOnTestServer')
        .info(lambda: "rate" == 0 OR "rate" > 0)
        .crit(lambda: "rate" > 0)
        .post('http://MyTestServer:8323')

Will that work?

Also #745 is adding if logic to evals.
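
For reference, the min-of-window approach described above can be simulated in a few lines of Python (not TICKscript). The sample values and the helper name `last_minus_min` are hypothetical, and the sketch assumes a counter that only ever increases between resets.

```python
def last_minus_min(values):
    """Mimic last('value') - min('value') over a single window."""
    return values[-1] - min(values)


# No reset: the min is the first point, so this equals last - first.
print(last_minus_min([480, 485, 490]))           # 10

# Reset inside the window: counting restarts from the reset point.
print(last_minus_min([480, 485, 490, 0, 3, 7]))  # 7
```

In the second case the 10 increments recorded before the reset are dropped, which is the trade-off mentioned above.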
