-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(inputs.chrony): Use DGRAM for the unix socket #15552
Conversation
794fb7e
to
5ef9848
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution @Frankkkkk! I do have some questions and comments in the code. One additional request is to somewhere add a comment with a reference to a document stating that the socket has to be unixgram! Looking forward to the next round!
f220411
to
32c8763
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more questions...
We must use the DGRAM type when connecting to the chrony socket. On top of that, the socket must be duplicated in order to allow concurrent calls to it. Fixes influxdata#15549 Signed-off-by: Frank Villaro-Dixon <[email protected]>
Chrony ≥ 4.4 introduces a new ServerStats response with u64 numbers and more fields. Fixes influxdata#15553 NB: The plugin still doesn't work due to influxdata#15549. To test this commit, please make sure you use influxdata#15552. Signed-off-by: Frank Villaro-Dixon <[email protected]>
Chrony ≥ 4.4 introduces a new ServerStats response with u64 numbers and more fields. Fixes influxdata#15553 NB: The plugin still doesn't work due to influxdata#15549. To test this commit, please make sure you use influxdata#15552. Signed-off-by: Frank Villaro-Dixon <[email protected]> Signed-off-by: Frank Villaro-Dixon <[email protected]>
@powersj @srebhan Can we recap what is needed for this MR/issue to be closed ? Do we agree that we still need unix sockets in order to access the serverstats metrics (which are quite helpful IMHO)? If so, then:
Thanks! |
@Frankkkkk I would say
Regarding item 3: I don't think we can accept |
Regarding the
Regarding the 666: Without it it doesn't work. I don't know why. Maybe we should check that before proceeding further.. Looking at
Pinging @mlichvar, one of the fathers of chrony: can you help us here? Thanks! |
chronyd normally runs under a non-root user, so the client's socket needs to have permissions or ownership set in such a way that chronyd will be able to write to it (to send the response). Access to the socket is still protected by the permissions/ownership of the /var/run/chrony directory. |
@Frankkkkk sorry for the confusion I cause! :-( What I meant with
is to use the code you do have now but using a dgram socket if the user specifies a @mlichvar thanks for chiming in! Would it work if we set the group of the socket to chrony's group and set 0660? I guess it would and would improve security, wouldn't it? |
|
Hi,
Sorry, don't have the time for that right now, but if I recall correctly, it didn't work with 660.
I'll try to report back, but not before some weeks.
Thanks & Cheers
…On 15 July 2024 21:12:10 CEST, Joshua Powers ***@***.***> wrote:
@powersj commented on this pull request.
> + if err := os.Chmod(local, 0666); err != nil {
+ return nil, err
+ }
@Frankkkkk if you changed the permissions to `0660` do you still get success with your branch?
--
Reply to this email directly or view it on GitHub:
#15552 (comment)
You are receiving this because you were mentioned.
Message ID: ***@***.***>
|
@Frankkkkk of course the telegraf-user (the one executing the telegraf) need to be in the correct group... :-) |
Confirmed that 0660 hangs: root@j1:~# stat /run/chrony
File: /run/chrony
Size: 120 Blocks: 0 IO Block: 4096 directory
Device: 4ah/74d Inode: 342 Links: 2
Access: (0750/drwxr-x---) Uid: ( 106/ _chrony) Gid: ( 113/ _chrony)
Access: 2024-07-19 15:15:09.536385535 +0000
Modify: 2024-07-19 15:13:19.722551918 +0000
Change: 2024-07-19 15:13:19.722551918 +0000
Birth: 2024-07-19 15:09:35.908770107 +0000
root@j1:~# whoami
root
root@j1:~# groups
root root@j1:~# ./telegraf-660 --config config.toml --once
2024-07-19T15:13:19Z I! Loading config: config.toml
2024-07-19T15:13:19Z I! Starting Telegraf 1.32.0-2fcfc3eb brought to you by InfluxData the makers of InfluxDB
2024-07-19T15:13:19Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-07-19T15:13:19Z I! Loaded inputs: chrony
2024-07-19T15:13:19Z I! Loaded aggregators:
2024-07-19T15:13:19Z I! Loaded processors:
2024-07-19T15:13:19Z I! Loaded secretstores:
2024-07-19T15:13:19Z I! Loaded outputs: file
2024-07-19T15:13:19Z I! Tags enabled:
2024-07-19T15:13:19Z D! [agent] Initializing plugins
2024-07-19T15:13:19Z D! [agent] Connecting outputs
2024-07-19T15:13:19Z D! [agent] Attempting connection to [outputs.file]
2024-07-19T15:13:19Z D! [agent] Successfully connected to outputs.file
2024-07-19T15:13:19Z D! [agent] Starting service inputs
2024-07-19T15:13:19Z D! [inputs.chrony] Connected to "unixgram:///var/run/chrony/chronyd.sock"...
2024-07-19T15:13:29Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-07-19T15:13:39Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-07-19T15:13:49Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-07-19T15:13:59Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-07-19T15:14:09Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-07-19T15:14:19Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2024-07-19T15:14:29Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
^C2024-07-19T15:14:39Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
^C^C^C^C^C2024-07-19T15:14:49Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
Killed
root@j1:~# |
Having the socket with 0660 and ensuring it matches the Is the way forward then to expose new config options for the group (default: _chrony) and permissions (default: 0660) and only try creating this socket if the user is trying to collect serverstats metrics? [[inputs.chrony]]
server = "unixgram:///run/chrony/chronyd.sock"
metrics = ["serverstats"]
[[outputs.file]] diff --git plugins/inputs/chrony/chrony.go plugins/inputs/chrony/chrony.go
index 784f8e929..dfddd1b06 100644
--- plugins/inputs/chrony/chrony.go
+++ plugins/inputs/chrony/chrony.go
@@ -9,6 +9,7 @@ import (
"net"
"net/url"
"os"
+ "os/user"
"path"
"strconv"
"syscall"
@@ -55,7 +56,19 @@ func dialUnix(address string) (*net.UnixConn, error) {
return nil, err
}
- if err := os.Chmod(local, 0666); err != nil {
+ if err := os.Chmod(local, 0660); err != nil {
+ return nil, err
+ }
+
+ group, err := user.LookupGroup("_chrony")
+ if err != nil {
+ return nil, err
+ }
+ gid, err := strconv.Atoi(group.Gid)
+ if err != nil {
+ return nil, err
+ }
+ if err := os.Chown(local, os.Getuid(), gid); err != nil {
return nil, err
} root@j1:~# ./telegraf --config config.toml --once
2024-07-24T19:59:07Z I! Loading config: config.toml
2024-07-24T19:59:07Z I! Starting Telegraf 1.32.0-2fcfc3eb brought to you by InfluxData the makers of InfluxDB
2024-07-24T19:59:07Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-07-24T19:59:07Z I! Loaded inputs: chrony
2024-07-24T19:59:07Z I! Loaded aggregators:
2024-07-24T19:59:07Z I! Loaded processors:
2024-07-24T19:59:07Z I! Loaded secretstores:
2024-07-24T19:59:07Z I! Loaded outputs: file
2024-07-24T19:59:07Z I! Tags enabled: host=j1
2024-07-24T19:59:07Z I! [agent] Hang on, flushing any cached metrics before shutdown
chrony_serverstats,host=j1,source=/run/chrony/chronyd.sock ntp_drops=0i,ntp_timestamps=0i,cmd_hits=5i,cmd_drops=0i,log_drops=0i,ntp_hits=0i,ntp_auth_hits=0i,ntp_interleaved_hits=0i,ntp_span_seconds=0i,nke_hits=0i,nke_drops=0i 1721851147000000000
2024-07-24T19:59:07Z I! [agent] Stopping running outputs |
* Introduce two config options for the chrony group and permissions of the socket. Use these to set the permissions and group ownership of the temporary socket. * Clean up the temporary socket on close TODO: do we need to keep "unix" as a supported scheme TODO: do we need to make the local socket opt-in or only when requesting server stats to avoid regressions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one small comment about the README section naming... Other than that the PR looks good.
Co-authored-by: Sven Rebhan <[email protected]>
Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip. 🥳 This pull request decreases the Telegraf binary size by -7.56 % for linux amd64 (new size: 243.7 MB, nightly size 263.6 MB) 📦 Click here to get additional PR build artifactsArtifact URLs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect thanks @Frankkkkk and @powersj! And thanks @mlichvar for your great input!
(cherry picked from commit 9df4800)
Summary
We must use the DGRAM type when connecting to the chrony socket. On top of that, the socket must be duplicated in order to allow concurrent calls to it.
Checklist
Related issues
resolves #15549