netlink: receive hangs indefinitely on invalid message #20
Comments
I see the same issue. I am playing with ipset messages, and using nltrace I can see that the buffer sent matches what it expects and that the ipset (iphash) is created (following dnsmasq's code). Is this a bug in this library or a malformed message issue? I am formatting the buffer manually and using Execute(), which hangs somewhere. When I terminate the program, the ipset operation has still been carried out. It would be nice to resolve this issue. |
Typically it's the result of a malformed message. I may be able to tweak a couple of things to get netlink to reject messages instead of dropping them. It would also be nice if we could have timeouts on the socket (using the runtime network poller), but those are not yet available. Maybe for 1.9 though! As this is my first venture into netlink sockets, maybe I overlooked something simple that would make this work as expected. I would be happy to accept PRs in the meantime. |
Interesting. I would not expect this to be the case. Does netlink not respond with an acknowledgement? Maybe explicitly request one with the appropriate header flag and see what happens. Execute assumes that netlink will send a reply. You could try just Send instead. |
I found a fix (more like user error). The request I was sending used only syscall.NLM_F_REQUEST (1), as reported by the nltrace tool. ipset itself sends a flags value of 517 in NlMsgHdr, which translates to <REQUEST, ACK, MATCH>. I tried that immediately and it works. I then tried your flag, which is <MULTI, 0x10>: it blocks, but returns if syscall.NLM_F_ACK is also passed. So, based on the observed behavior, Execute blocks when the syscall.NLM_F_ACK flag is missing (maybe we need a method that doesn't wait for a response). You will still get an error like "failed to unmarshal response: not enough data to create a netlink message", but at least you get a response message from the kernel about your request, e.g.: [NETLINK HEADER] 16 octets. For now I am satisfied with this solution. |
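To make the flag values above easier to check, here is a small sketch that decodes a netlink header flags field into names, in roughly the style nltrace prints them. The constants are copied from the Linux netlink header values (NLM_F_REQUEST = 0x1, NLM_F_MULTI = 0x2, NLM_F_ACK = 0x4, NLM_F_ECHO = 0x8, NLM_F_MATCH = 0x200) and defined locally so the example does not depend on the platform. `decodeFlags` is a hypothetical helper, and bits it does not know are printed as hex. Keep in mind that 0x200 means MATCH only for GET-style requests; for NEW-style requests the same bit is NLM_F_EXCL.

```go
package main

import (
	"fmt"
	"strings"
)

// Netlink header flag bits, using the values from the Linux headers.
const (
	nlmFRequest = 0x1
	nlmFMulti   = 0x2
	nlmFAck     = 0x4
	nlmFEcho    = 0x8
	nlmFMatch   = 0x200 // same bit as NLM_F_EXCL on NEW-style requests
)

// flagNames lists the known bits in ascending order.
var flagNames = []struct {
	bit  uint16
	name string
}{
	{nlmFRequest, "REQUEST"},
	{nlmFMulti, "MULTI"},
	{nlmFAck, "ACK"},
	{nlmFEcho, "ECHO"},
	{nlmFMatch, "MATCH"},
}

// decodeFlags returns the names of the bits set in flags. Bits that are
// not in flagNames are appended as a single hex value.
func decodeFlags(flags uint16) string {
	var names []string
	for _, f := range flagNames {
		if flags&f.bit != 0 {
			names = append(names, f.name)
			flags &^= f.bit // clear the bit so only unknown bits remain
		}
	}
	if flags != 0 {
		names = append(names, fmt.Sprintf("%#x", flags))
	}
	return strings.Join(names, ", ")
}

func main() {
	fmt.Println(decodeFlags(517))  // the value ipset sends
	fmt.Println(decodeFlags(1))    // NLM_F_REQUEST alone
	fmt.Println(decodeFlags(0x12)) // MULTI plus one unnamed bit
}
```

Since 517 is 0x205, it breaks down into 0x200 + 0x4 + 0x1, which is why adding the ACK bit to a plain REQUEST makes the kernel send back the acknowledgement that Execute is waiting for.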
Yep, this is why Execute is just a convenience method for Send, Receive, and Validate. I recommend using them separately if your use case needs it.
I would expect this package to be able to unmarshal the error message properly. Would you mind writing a test case with the error you encountered, and possibly sending a PR to fix it? If it's just returning 0 bytes or something, we should probably account for that with a more descriptive error. |
By the way, @NaerChang2, if your code is open source, I'd love to take a look and see what you're doing with it. |
Thinking about this again now. I am curious if replacing |
This won't be a problem in 1.12. Closing because #119 covers this. |
When using conn.Execute to send/receive an invalid (unknown) netlink message, receive will hang, waiting for a reply. There probably needs to be some kind of timeout in receive.
For example, send this message: