-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
syscall: caching RLIMIT_NOFILE is wrong implementation #66797
Comments
Please don't show us code in images. Link to go.googlesource.com or GitHub instead. Images are difficult to read. Thanks. |
If I understand this correctly, the problem is that if process A uses We can't change the current behavior in which process B passes the original So I think the question is whether there is a way that process B could detect that the process limit was changed by some other process via |
|
I think there is also a get/set/set race situation here, please see opencontainers/runc#4266 and opencontainers/runc#4265 . Go CL 393354 (in Go 1.19beta1) introduced raising the RLIMIT_NOFILE soft limit to the current value of the hard limit. Further CLs (in Go 1.21, but backported to Go 1.20.x and 1.19.x) introduced more code in between getrlimit and setrlimit syscalls (see Go's src/syscall/rlimit.go). In runc exec, we use
So, I think we need to find a way to solve two problems:
|
What's the progress now? |
We already have a public function to clear the cache: var r syscall.Rlimit
syscall.Getrlimit(syscall.RLIMIT_NOFILE, &r)
syscall.Setrlimit(syscall.RLIMIT_NOFILE, &r) |
We have considered this method, but it will cause new get/set race. |
This only matters just before starting a child, so do it then. If that is a race, then you have a race problem no matter what. |
Maybe |
Hi @lifubang, do you agree with what Ian wrote in #66797 (comment), or if not, could you briefly expand on why? For the people reporting this here, is this something that affects Go code primarily consumed as a library, or package main? If there is not a valid workaround or solution in the near term, I wonder if a GODEBUG controlling the behavior could be temporary solution to help with backwards compatibility, or maybe that does not make sense? |
Sorry for delay, because I'm in another busy thread.
The core reason is that using Get/Set to clear the internal cache is not a atomic operation, I suggest to add a public method to clear this cache directly. For example: |
I understand the race. My point is that this only matters when you are about to start a child process. So do it then. There is already a race when starting a child process: you can start the child before or after the other process calls Two additional system calls are trivial, it's not worth adding a very special purpose operation to avoid that. |
@ianlancetaylor the problem is, this relies on Perhaps something like this will fix the issue for us: diff --git a/src/syscall/rlimit.go b/src/syscall/rlimit.go
index d77341bde9..8184f17ab6 100644
--- a/src/syscall/rlimit.go
+++ b/src/syscall/rlimit.go
@@ -39,11 +39,10 @@ func init() {
}
func Setrlimit(resource int, rlim *Rlimit) error {
- err := setrlimit(resource, rlim)
- if err == nil && resource == RLIMIT_NOFILE {
+ if resource == RLIMIT_NOFILE {
// Store nil in origRlimitNofile to tell StartProcess
// to not adjust the rlimit in the child process.
origRlimitNofile.Store(nil)
}
- return err
+ return setrlimit(resource, rlim)
} |
I'm OK with that kind of change. |
Change https://go.dev/cl/587918 mentions this issue: |
Change https://go.dev/cl/588076 mentions this issue: |
I think this needs more discussion. If I’m saying wrong, please let me know. |
Whoever calls Setrlimit will return an error in this case. They will know they are not able to set the limits. In our (runc) use case, we just need to remove the cache. We can set the limit as well, but that's optional, since we will use unix.Prlimit as we always did to set the limit. To me, this is the best middle ground between Go and runc needs. PS we can discuss it further in opencontainers/runc#4290 (i'm going to add go tip to its CI). |
Yes, I think this is enough for runc, but not enough for golang. |
Soft > Hard |
I don't think it matters much. If |
I looked into the manual page of `setlimit(2)`, there are two errors we should consider:
```
EPERM An unprivileged process tried to raise the hard limit; the
CAP_SYS_RESOURCE capability is required to do this.
EPERM The caller tried to increase the hard RLIMIT_NOFILE limit
above the maximum defined by /proc/sys/fs/nr_open (see
proc(5))
```
I think these two errors are normal errors that could be met by users.
For example(Not in runc):
If someone try to raise as more hard limit as possible in the process, but he hits one of EPERMs error, in a big probability he will ignores this ordinary call error, becuase this is just a try.
But at this time, the fork/execing still will be success.
If we clear the cache in Setlimit after got an error, the forked sub process's original nofile rlimit will be wrong.
It will make the user confused.
So, if there is a chance, I still suggest to add a specific public API: runtime.ClearSyscallRlimitCache().
But If all people think that to clear the cache using Setrlimit is an accepetable way, I have no strong objection opinion.
------------------------------------------------------------------
发件人:Ian Lance Taylor ***@***.***>
发送时间:2024年5月24日(星期五) 10:25
***@***.***>
***@***.***>; ***@***.***>
主 题:Re: [golang/go] syscall: caching RLIMIT_NOFILE is wrong implementation (Issue #66797)
I don't think it matters much. If Setrlimit fails for an ordinary call, it's likely to also fail when it is called while fork/execing a child.
—
Reply to this email directly, view it on GitHub <#66797 (comment) >, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAF7IQBCUK55UBW3QQOW6NLZD2QJ7AVCNFSM6AAAAABGDYYRQKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMRYGM3TSMBWGY >.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
I am reluctant to add new API for this very special case. It is very unusual for a program to care about the rlimit for child processes. It is very unusual for a program to change its rlimit and fail. The new API is only required for the intersection of those two very unusual cases. |
Since the introduction of origRlimitNofileCache in CL 476097 the only way to disable restoring RLIMIT_NOFILE before calling execve syscall (os.StartProcess etc) is this: var r syscall.Rlimit syscall.Getrlimit(syscall.RLIMIT_NOFILE, &r) syscall.Setrlimit(syscall.RLIMIT_NOFILE, &r) The problem is, this only works when setrlimit syscall succeeds, which is not possible in some scenarios. Let's assume that if a user calls syscall.Setrlimit, they unconditionally want to disable restoring the original rlimit. For #66797. Change-Id: I20d0365df4bd6a5c3cc8c22b0c0db87a25b52746 Reviewed-on: https://go-review.googlesource.com/c/go/+/588076 Run-TryBot: Kirill Kolyshkin <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]> Auto-Submit: Ian Lance Taylor <[email protected]> TryBot-Bypass: Ian Lance Taylor <[email protected]>
Since CL 588076 runc can do fine without the kludge. The code accessing the symbol is now guarded with `go:build !go1.23` in all supported runc branches (main: [1], release-1.1: [2]). This reverts part of CL 587219. Updates #67401. For #66797. [1]: opencontainers/runc#4290 [2]: opencontainers/runc#4299 Change-Id: I204843a93c36857e21ab9b43bd7aaf046e8b9787 Reviewed-on: https://go-review.googlesource.com/c/go/+/587918 Auto-Submit: Ian Lance Taylor <[email protected]> Reviewed-by: Michael Knyszek <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
One other issue is that Go runtime's caching of This causes several issues (one of which is the series of race conditions and logic issues we have now fixed in runc with stdlib workarounds by telling the Go runtime that we plan to change our |
@cyphar I tend to take the opposite view, which is that the auto-adjustment logic works well for general-purpose programs, but not for special-purpose environments. Across all uses of Go, using That said, we are certainly open to a different approach if you have any suggestions. |
Is Go the right language for something like runc, or should runc be deprecated in favor of crun? A container runtime must operate in extremely unusual environments and I’m not sure if it is reasonable to expect Go’s runtime to work there. C, C++, and Rust all seem like better choices here. It is true that C and C++ are not memory safe, but I expect most security problems in a container runtime to be logic errors interfacing with the Linux kernel, not memory safety problems. Given what a container runtime does, a logic error and memory corruption can easily lead to equally severe vulnerabilities. |
Go is advertised as a systems language and in 2013-ish the decision was made to write Docker in Go, which lead to a lot of large projects using Go (such as most of the container ecosystem -- I would argue that Docker's usage of Go helped Go's popularity in the early days). Many of those projects use runc's code directly or use runc (because runc is by far the most commonly used container runtime). Deprecating it wouldn't make any sense. The "maybe Go isn't the right language for this" ship has long since sailed. If we could travel back to 2013, I would argue that Go is the wrong language to use -- but at the time Go was the only language that made sense to use (Rust wasn't ready, and writing Docker in C/C++ would've lead to more security issues because Docker is a privileged daemon that has to deal with far more things than runc). FWIW my experience is that Go isn't really a "systems language", it's more of a language for writing tools and web servers that don't require a lot of fine-grained control that system languages provide. But that isn't what it was advertised as in 2013, and even today Go is still sold as a "systems language" (it seems obvious now that that term means different things to different people). When I submit bug reports as a runc maintainer, I'm more than aware that we are not seen as a welcome member of the Go community because we are always seen as "doing weird things" (such as being put on the "wall of shame" for trying to work around problems in the Go runtime that are not documented and took us ages to figure out how to fix). I try to avoid opening bug reports for these kinds of issues because I know that they will just be ignored. Being told "maybe you should just shut up shop and go somewhere else" is a little hurtful, to be honest.
Sure, but adding an additional basket of possible vulnerabilities isn't a good thing. We have to do a lot of parsing of data and other operations where memory safety helps us avoid dumb security issues. I suspect that most programs could have a large number of security issues from non-memory-safety bugs, but I'm sure we agree that memory safe languages should be used when possible? Fewer possible bugs is better, right? For this particular issue, it seems very strange to argue that a process doing |
I'm sorry that you don't feel like a welcome member of the community. Certainly my goal is to work with tools like runc to find solutions that will work for the whole community. Using I stand by my claim that using For |
@cyphar If |
We have opened RFCs to try to solve these problems without using
I never said they don't matter, I was saying that the current way of doing it and the fixes that have been applied to it are flawed because they're designed around the idea that Go processes will always be the process that sets their own rlimits. I don't think this is true in general -- sysadmins sometimes need to change process rlimits for long-running daemons and might not expect their settings to not apply to children (unlike Unix programs written in other languages). (runc is not a long-running daemon, this is just something that seems like a problem to me in general.) We can work around this in runc because while the same process doesn't set its own rlimit, both processes are runc and so we can co-ordinate things (which we've already done -- this issue is no longer a problem for us). My concerns are purely based on wanting to make sure that this issue doesn't bite other users where they can't work around the problem. A "good enough" solution would be to save the rlimits set by Go during
Maybe, but we have already fixed the issue for runc (by doing a dummy However, that wouldn't help with the case where a sysadmin does a |
Quite agree.
I think cyphar's original mean is that Runc maintainers always submit issues which are difficult to resolve in go community, and then we had to decide to using some weird ways to fix them in my own, for example
For this specific issue, I think the original implementation(#46279) is not recommended. This is the core reason of this discuss. I think that if we want to raise the soft limit of nofile, we should let the users to do this, we should not do it silently in runtime. It's the responsibility of the user, but not the runtime. I think a language runtime should provide more and more controllable functions for the users, but not make a decision for them. Once we have a wrong beginning, we need to add more imperfect implementations to fix some unimportant issues, it's not worth to do. |
@cyphar Thanks for the suggestion. What do you think of https://go.dev/cl/607516? |
Change https://go.dev/cl/607516 mentions this issue: |
Go version
go version go1.20.13 linux/amd64
Output of
go env
in your module/workspace:What did you do?
RLIMIT_NOFILE is cached in the go runtime, and RLIMIT_NOFILE in the cache is restored for the child process before exec.
Assume that the system's default RLIMIT_NOFILE is 1024, and process A generates child process B through fork. B is a go application, so the go runtime will cache RLIMIT_NOFILE as 1024. When I set B's RLIMIT_NOFILE to 10000 through prlimit in A, and use exec.Command to create process C in B, C's RLIMIT_NOFILE is restored to 1024.
This behavior is incompatible with the operating system. It also caused problems with runc:
opencontainers/runc#4195
The problem originates from f5eef58
What did you see happen?
Reference opencontainers/runc#4195
What did you expect to see?
RLIMIT_NOFILE should be correctly inherited to child processes
The text was updated successfully, but these errors were encountered: