-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nats module improvements #22445
Nats module improvements #22445
Conversation
Signed-off-by: chrismark <[email protected]>
💚 Flaky test reportTests succeeded. Expand to view the summary
Test stats 🧪
|
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Pinging @elastic/integrations-platforms (Team:Platforms) |
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that a metricset that collects events per connection shouldn't be enabled by default.
With this change, we are modifying the behaviour of a default metricset from collecting a summary of metrics for all the connections, to collect one event per connection. I guess that a production nats server can have hundreds or thousands of connections at a given time, would this generate such number of events per fetch period?
I think that the summary, and the metrics per connection should be in different metricsets.
I see that we also collect the number of connections in the stats
metricset (assuming that nats.stats.total_connections
is the same thing).
Maybe a plan could be to:
- Leave the summarized metrics (the total count would be the only one at least by now) in the
stats
metricset. - Deprecate the
connections
metricset in 7.x/7.11, and remove it in master/8.0, the only metric it collects is also collected by thestats
metricset. - Add a new, non-default
connection
metricset, that collects the metrics per connection. Other reason to move fromconnections
toconnection
is that singular namespaces is usually preferred in metric names (e.g:system.process...
,postgresql.statement...
).
For the routes we could follow a similar plan, to deprecate routes
and create a new non-default route
metricset, but we would first need to add the count of routes to the stats
metricset if possible.
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Added 2 new extra metricsets, disabled by default, to collect metrics per connection/route. After this we can deprecate |
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks for addressing all the comments. I think we could keep nats.server.time
as deprecated instead of removing it, but I am also ok with removing it now, I don't think this field is used much.
@@ -13,7 +13,3 @@ | |||
type: keyword | |||
description: > | |||
The server ID | |||
- name: server.time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to keep this in existing modules to avoid breaking changes, though I don't think this field is going to be missed.
if err != nil { | ||
return errors.Wrap(err, "failure applying module schema") | ||
} | ||
timestamp, err := util.GetNatsTimestamp(moduleFields) | ||
moduleFields.Delete("server.time") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to keep this value to avoid a smallish breaking change.
Signed-off-by: chrismark <[email protected]>
Signed-off-by: chrismark <[email protected]>
(cherry picked from commit 70ff0a7)
What this PR adds
This PR adds more metrics at
connections
androutes
metricsets. Prior to this no metrics have been collected for each of the connections and for each of the routes. With this PR we collect metrics like in/out bytes etc for each the connections and for each of the routes. In order to have routes in a NATS server a more than 1 NATS servers should be presented creating a NATS cluster. For this docker-compose file of the module adds an extra NATS instance so as populate routes and their respective metrics.Why it is important
It related to elastic/integrations#359. Next steps would be to enhance/improve dashboards.
How to test this manually
docker-compose
file of nats moduleconnections
androutes
metricsets and point them to the monitoring port of thenats
container (exported port of 8222 internal)