Support alerting provider overrides in alert configuration #96
ref: #84 |
Hi there, wouldn't it be easier to allow a list of webhooks indexed by name under `slack` or `mattermost`?
We could also add a … Btw, just discovered your work, it rocks!! Good job. Zeylos |
We have set up a webhook for Rocket.Chat using … The most versatile approach would imho be something like what @zeylos suggested: have the ability to configure multiple alerting "channels" and reference them by name. For convenience, the reference in the …

```yaml
alerting:
  slack:
    - name: manager
      webhook-url: "https://hooks.slack.com/services/**********/**********/webhook-for-managers"
    - name: devs
      webhook-url: "https://hooks.slack.com/services/**********/**********/webhook-for-devs"
  custom:
    - name: frontend-ops
      url: ...
    - name: backend-ops
      url: ...

services:
  - name: relevant-to-managers
    url: "https://example-for-managers.org/"
    alerts:
      - type: slack
        enabled: true
        target: manager
    conditions:
      - "[STATUS] == 200"
  - name: relevant-to-devs
    url: "https://example-for-devs.org/"
    alerts:
      - type: slack
        enabled: true
        target: devs
    conditions:
      - "[STATUS] == 200"
  - name: relevant-to-fe-ops
    url: "https://example-for-devs.org/"
    alerts:
      - type: custom
        enabled: true
        target: frontend-ops
    conditions:
      - "[STATUS] == 200"
  - name: relevant-to-be-ops
    url: "https://example-for-devs.org/"
    alerts:
      - type: custom
        enabled: true
        target: backend-ops
    conditions:
      - "[STATUS] == 200"
```

Great project btw!! |
FYI: #181 added support for the … I've renamed them to … The idea is that porting this to other alerting providers would enable something like this:

```yaml
alerting:
  slack:
    webhook-url: "https://hooks.slack.com/services/**********/**********/default-webhook"
    overrides:
      - group: "frontend"
        webhook-url: "https://hooks.slack.com/services/**********/**********/frontend-webhook"
      - group: "backend"
        webhook-url: "https://hooks.slack.com/services/**********/**********/backend-webhook"

endpoints:
  - name: "something-else"
    group: "misc"
    # ...
  - name: "home"
    group: "frontend"
    # ...
  - name: "api-health"
    group: "backend"
    # ...
```

While this doesn't offer the same flexibility as allowing an override at the service level, this should at least solve some use cases. |
Add group-specific `to` list for email alert TwiN#96 Signed-off-by: Bo-Yi Wu <[email protected]>
Add group-specific webhook URL for teams alert ref: TwiN#96 Signed-off-by: Bo-Yi Wu <[email protected]>
* feat(alert): Add group-specific `to` email list Add group-specific `to` list for email alert #96 Signed-off-by: Bo-Yi Wu <[email protected]> * docs: update Signed-off-by: Bo-Yi Wu <[email protected]> * Update README.md * Update README.md * Update README.md * chore: update Signed-off-by: Bo-Yi Wu <[email protected]> * Update README.md
* feat(alert): Add group-specific webhook URL for teams Add group-specific webhook URL for teams alert ref: #96 Signed-off-by: Bo-Yi Wu <[email protected]> * Update README.md * Update README.md
Add group-specific webhook URL for discord alert. Provides support for paging multiple Discords based on the group selector while keeping backward compatibility with the old Discord configuration manifest; integration per team can be specified in the overrides section in array form. ref: TwiN#96 Signed-off-by: Bo-Yi Wu <[email protected]>
Add group-specific webhook URL for Google Chat ref: TwiN#96 Signed-off-by: Bo-Yi Wu <[email protected]>
* feat(alerting): Add group-specific webhook URL for discord Add group-specific webhook URL for discord alert. Provides support for paging multiple Discords based on the group selector while keeping backward compatibility with the old Discord configuration manifest; integration per team can be specified in the overrides section in array form. ref: #96 Signed-off-by: Bo-Yi Wu <[email protected]> * docs: update Signed-off-by: Bo-Yi Wu <[email protected]> * Update README.md * Update README.md * Update alerting/provider/discord/discord.go Co-authored-by: TwiN <[email protected]> * Update README.md Co-authored-by: TwiN <[email protected]> * test: revert testing name * Update alerting/provider/discord/discord_test.go Co-authored-by: TwiN <[email protected]> Co-authored-by: TwiN <[email protected]>
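For context, the Discord change described in the commits above presumably follows the same `overrides` pattern shown earlier for Slack; here is a minimal sketch of what that configuration might look like (the field names are an assumption based on that pattern, not copied from the merged PR):

```yaml
alerting:
  discord:
    webhook-url: "https://discord.com/api/webhooks/**********/default-webhook"
    overrides:
      # Assumed structure: endpoints in the "frontend" group page a different Discord webhook
      - group: "frontend"
        webhook-url: "https://discord.com/api/webhooks/**********/frontend-webhook"
```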
Here's an example of something that might work better:

```yaml
alerting:
  slack:
    webhook-url: "https://hooks.slack.com/services/aaaaaaaaaaaaaaaaaaaaaaaaaaa"

endpoints:
  - name: "example-with-no-override"
    alerts:
      - type: slack
  - name: "example-with-override"
    alerts:
      - type: slack
        config:
          webhook-url: "https://hooks.slack.com/services/bbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
```

Basically, … |
Looking at the PRs #272, #271, #266, and #264, it looks like it's being implemented in an ad-hoc fashion? This is unfortunate, as there are a lot of alert providers and the ones I need (specifically Ntfy) have not been updated to support overrides. Additionally, for certain alert providers (like Ntfy and email), you often need to overwrite multiple fields. Take email: currently you can only overwrite the … In my case with Ntfy, overriding both the … In other words, I think supporting multiple independent instances of an alert type is a more robust solution than just implementing a single-field override for select alert providers. |
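To illustrate the request above, here is a rough sketch of what per-alert overrides for ntfy could look like; the syntax is hypothetical (this is the ask, not an existing feature at the time of the comment), and the `url`/`topic` field names are assumed:

```yaml
alerting:
  ntfy:
    url: "https://ntfy.sh"
    topic: "gatus-default"

endpoints:
  - name: "internal-service"
    url: "https://internal.example.org/health"
    conditions:
      - "[STATUS] == 200"
    alerts:
      - type: ntfy
        # Hypothetical per-alert override of multiple fields, as requested above
        override:
          url: "https://ntfy.internal.example.org"
          topic: "gatus-internal"
```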
@bishtawi The PRs you listed implement these at the group level, whereas this issue suggests implementing them at the alert level. I'd also like to point out that there is no limitation on the number of fields that could be overridden. For instance, Telegram's override has two fields: `id` and `token`. The truth is that I don't personally use all the alerting providers that are implemented in Gatus; as such, I delegate the implementation of overrides and such to those that use them. I do welcome you to implement group-level overrides for ntfy, if you need it, though. As for the feature that this specific issue suggests, I haven't had the time to work on this yet. |
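For example, a group-level override for Telegram might look something like this, assuming it mirrors the `overrides` structure shown earlier for the Slack provider (`token` and `id` are the two fields mentioned above; the rest of the layout is a sketch):

```yaml
alerting:
  telegram:
    token: "123456789:ABCdefGhIJKlmnOPQRstuVWXyz"
    id: "0123456789"
    overrides:
      # Endpoints in the "core" group notify a different chat id
      - group: "core"
        id: "9876543210"

endpoints:
  - name: "core-api"
    group: "core"
    url: "https://api.example.org/health"
    conditions:
      - "[STATUS] == 200"
    alerts:
      - type: telegram
```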
Sorry, I didn't mean to sound ungrateful or anything. I understand what I proposed is a more radical change and requires a lot more work and time to implement.
Next week, I'll submit a PR to bring overrides to the Ntfy alert provider.
|
Okay, this is happening. I'm starting the work for this tonight. |
Despite my initial enthusiasm, things aren't going as well as I'd have liked.

The big problem is that Gatus is built in a way that allows its configuration to fail fast. In other words, on start, if the configuration is invalid, it crashes right away. This is to provide the best user experience possible; people want their applications to run properly if they start properly, as opposed to crashing, for instance, 20 hours after starting.

Now, this alert override makes this much harder, because the alerts themselves live under the endpoint, while the providers' configuration lives completely separately. The provider takes in the alert and "sends" (triggers/resolves) it. For this to work, the provider would now need to take in the alert, modify its own configuration, and then "send" the alert. Right now, even the group override is very simple; plus, it lives directly in the provider, as opposed to the alert overrides, which would live under the endpoint, which means I have to find sneaky ways to avoid circular dependencies.

Anyways, I digress, and I think code speaks louder than words, so for those interested, here's where I am right now:

```go
func (provider *AlertProvider) getConfigWithOverrides(group string, alert *alert.Alert) (*Config, error) {
	// TODO: Caching? would need to pass the endpoint's key so that I can create a working cache key
	cfg := provider.Config
	// First, we handle group overrides
	if provider.Overrides != nil {
		for _, override := range provider.Overrides {
			if group == override.Group {
				if len(override.AccessKeyID) > 0 {
					cfg.AccessKeyID = override.AccessKeyID
				}
				if len(override.SecretAccessKey) > 0 {
					cfg.SecretAccessKey = override.SecretAccessKey
				}
				if len(override.Region) > 0 {
					cfg.Region = override.Region
				}
				if len(override.From) > 0 {
					cfg.From = override.From
				}
				if len(override.To) > 0 {
					cfg.To = override.To
				}
				break
			}
		}
	}
	// Second, we handle the alert's overrides by deep-merging them on top of the config we have so far
	if len(alert.Override) != 0 {
		cfgAsYAML, err := yaml.Marshal(cfg)
		if err != nil {
			return nil, fmt.Errorf("failed to marshal provider config with group overrides to YAML: %w", err)
		}
		alertOverridesAsYAML, err := yaml.Marshal(alert.Override)
		if err != nil {
			return nil, fmt.Errorf("failed to marshal alert overrides to YAML: %w", err)
		}
		mergedCfg, err := deepmerge.YAML(cfgAsYAML, alertOverridesAsYAML)
		if err != nil {
			return nil, fmt.Errorf("failed to merge config with alert overrides: %w", err)
		}
		if err = yaml.Unmarshal(mergedCfg, &cfg); err != nil {
			return nil, fmt.Errorf("failed to unmarshal merged config: %w", err)
		}
	}
	return &cfg, nil
}
```

The big concern here is keeping things as simple as possible, because I don't want to make creating an alerting provider extremely complicated; otherwise, I'll put myself in a situation where I'm the only one who can create new providers and it's hard for people to contribute.

I'm just rambling at this point and not really expecting any answers, but man, things would be so much easier if I could work on this full time as opposed to juggling this and work 😩 One day, maybe one day, I'll be able to focus on my passion without worrying about paying the bills.

TL;DR: I'm working on it, but it's a little more complicated than I anticipated to design it in a way that wouldn't make it too difficult for new contributors to add new alerting providers in the future. |
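To make the intended precedence concrete, here is a sketch of how the two layers in the snippet above would combine for an email-style provider: the group override is applied first based on the endpoint's group, then the alert-level override is deep-merged on top of the result. The alert-level key name was still in flux at this point; `provider-override` is the name that was eventually settled on, and the addresses below are placeholders.

```yaml
alerting:
  email:
    from: "[email protected]"
    to: "[email protected]"
    # ... host, port, credentials, etc.
    overrides:
      # Applied first, matched on the endpoint's group
      - group: "frontend"
        to: "[email protected]"

endpoints:
  - name: "home"
    group: "frontend"
    url: "https://example.org"
    conditions:
      - "[STATUS] == 200"
    alerts:
      - type: email
        # Deep-merged on top of the group override; this value wins for this alert
        provider-override:
          to: "[email protected]"
```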
Looking good so far! This is what the configuration will look like:

```yaml
endpoints:
  - name: example
    url: "https://example.com"
    conditions:
      - "[STATUS] == 200"
    alerts:
      - type: discord
        provider-override:
          webhook-url: https://discord.com/api/webhooks/someother/webhook
```
|
* feat(alerting): Implement alert-level provider overrides Fixes #96 * Fix tests * Add missing test cases for alerting providers * feat(alerting): Implement alert-level overrides on all providers * chore: Add config.yaml to .gitignore * fix typo in discord provider * test: Start fixing tests for alerting providers * test: Fix GitLab tests * Fix all tests * test: Improve coverage * test: Improve coverage * Rename override to provider-override * docs: Mention new provider-override config * test: Improve coverage * test: Improve coverage * chore: Rename Alert.OverrideAsBytes to Alert.ProviderOverrideAsBytes
@zeylos, @philippe-granet, @tiwood, @allanviana-logcomex, @sashkab, @Coolomina, @pollosp, @dedo1911, @umputun, @GithubUser5462 I apologize for the ping, but since you folks have expressed your interest by reacting with 👍 to this issue, I assume it should be acceptable for me to ping you. If you no longer use Gatus or are not interested, please disregard this message.

For those of you still using Gatus and interested in this feature, it is now available under the …

If some of you have the time to spare, I'd very much appreciate it if you could try the new provider-override feature with whatever alerting provider you use.

Here's an example for Discord:

```yaml
alerting:
  discord:
    webhook-url: "https://discord.com/api/webhooks/11111111..."
```

One for GitHub:

```yaml
alerting:
  github:
    repository-url: "https://github.com/TwiN/first-repo"
    token: "github_pat_123456..."

endpoints:
  - name: example
    url: "https://example.org"
    interval: 5s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 5"
    alerts:
      - type: discord
        description: "alert going to the default webhook-url"
        send-on-resolved: true
      - type: discord
        description: "alert going to the overridden webhook-url"
        send-on-resolved: true
        provider-override:
          webhook-url: "https://discord.com/api/webhooks/222222222..."
      - type: github
        description: "alert going to the default repository-url"
        send-on-resolved: true
      - type: github
        description: "alert going to the overridden repository-url"
        send-on-resolved: true
        provider-override:
          repository-url: "https://github.com/TwiN/second-repo"
          token: "github_pat_example-of-some-other-token-if-your-default-token-does-not-have-access-to-the-second-repo"
```

If you don't have the capacity to help, don't worry, no hard feelings :) Happy Holidays. |
Hey, N.P, and thanks for your work on this tool. I think I could help you with Slack and PagerDuty, which are what we are currently using. |
Currently, users can target specific groups of people by using the description as a way to notify specific groups for specific alerts. This, however, does not apply to all alerting providers.

For instance: … (note the `services[].alerts[].description` reference)

By tweaking the notification settings, each group could be notified only when it concerns them.

That being said, I do understand that this is more of a hack than a viable solution; I think the root cause is that there may be a need for supporting multiple Slack webhooks rather than just one. Even if you could technically use `custom` to create a 2nd Slack provider, having the ability to create N alerting providers would be better.

Across all alerting providers, there appears to always be only one parameter that may need to change in order to support multiple recipients: …

As a result, we may be able to just add a single optional parameter to the service alert (as opposed to the alerting provider) to override that "key" parameter for a specific service: …

As a result, the `relevant-to-devs` service would send to the overridden webhook `https://hooks.slack.com/services/**********/**********/webhook-for-devs` instead of the default `https://hooks.slack.com/services/**********/**********/webhook-for-managers`.

It's a little hacky, especially when you look at the Slack alerting provider, which only has a single parameter (`webhook-url`), as opposed to the Twilio alerting provider, which has several parameters out of which only one is relevant to alert a different target (`to`). But I think that if we want to avoid breaking changes, this is the best way to do it: it's transparent for those that don't need the feature, and while most users will likely never use that feature, the option is still available.
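Since the inline examples from the original description did not survive, here is a hedged reconstruction of the kind of per-service-alert override being described, reusing the webhooks from the example near the top of the thread (the override key name shown here is an assumption):

```yaml
alerting:
  slack:
    webhook-url: "https://hooks.slack.com/services/**********/**********/webhook-for-managers"

services:
  - name: relevant-to-devs
    url: "https://example-for-devs.org/"
    conditions:
      - "[STATUS] == 200"
    alerts:
      - type: slack
        # Hypothetical single optional parameter overriding the provider's "key" parameter
        webhook-url: "https://hooks.slack.com/services/**********/**********/webhook-for-devs"
```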