-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spec: clarify how unsafe.Pointers may be used to determine offsets between addresses #12445
Comments
it's probably make sense to introduce a unsafe.Overlap
(or runtime.Overlap) function to determine if two slices
overlap.
Allowing subtraction of unsafe.Pointer (after converted
to uintptr), might give the user false impression that
there is a fixed distance between two pointers (i.e.
pointers don't move)
|
|
To add a data point to this. We now implement strided block overlap detection in github.com/gonum/matrix/mat64, documentation here and detection implementation here. This is done essentially as described in the original discussion, both with unsafe for environments that allow that and via reflect for those that don't. I don't see a constant time implementation that can be achieved without access to an offset call. |
Something like below seems like it should be safe even under a moving GC.
|
That does not do what the linked code does. We want to return whether there is an i,j and x,y such that a[i + jstride] and b[x + ystride] address the same variable when the indices are constrained such that they index a subset of the possible blocks that may be described with the given stride over the slices. However it does suggest a possible solution.
and the family friendly version (this one is more tenuous since it depends on values passed back from a reflect call):
These are then used in the functions also linked above. Note that it is OK for our use case for the offset to be used after a possible GC move since if a and b are part of the same allocation they move together and so the offset remains valid, and if they are not, an offset cannot become indicative of an overlap. The potential race in the sign condition is troubling though. |
There are certainly whole classes of concurrent GC algorithms for which On Sat, Nov 28, 2015 at 4:17 AM, Dan Kortschak [email protected]
|
This is true if the GC algorithms are considered in isolation from the language's promises about behaviour. This issue aims to address that. Currently there seems to be a de facto promise* see discussion in #7192 and #8994 that use of converted unsafe.Pointers is atomic wrt the GC if the use is in a single expression. This is how the unsafe.Pointer->uintptr happens in the code above, with the exception of the condition which troubles me. * Not a promise.
This is understood; hence the comments in the linked code. The issue for us is that introduction of GC behaviour that breaks this code without a work around to obtain an offset will break gonum/matrix/mat64 behaviour. We use unsafe for the normal case and we don't expect Go1 coverage for that, but we also use the reflect equivalent for environments that disallow unsafe. |
@RLH Sorry, when you say "for which this will not work", what are you referring to by "this"? E.g., would the overlap function I wrote above (setting aside momentarily that it doesn't address @kortschak's actual use case) not be safe under some of those GC implementations? The Go spec says "For struct If those aren't safe extrapolations, perhaps the spec should be revised to only guarantee |
@kortschak They're certainly very related. But my impression is #8994 is about the rules for how you can safely convert unsafe.Pointer to uintptr and back, whereas this issue seems to be more about which arithmetic laws still hold after converting unsafe.Pointer to uintptr. E.g., the |
Yes, that's true, though I think that once the rules for the other issue are settled that defines what happens here.
|
Let me try again. Various concurrent moving collectors including those This creates implementation issues around the implementation of == which The single example found in the unsafe part of the spec seems to extend the The examples provided in this thread use unsafe.Pointer in different In summary, the statement that "Something like below seems like it should More to the point is that both #7192 On Sat, Nov 28, 2015 at 9:32 PM, Dan Kortschak [email protected]
|
@RLH Thanks. I understand the idea that under a moving GC that a single "pointer value" (in the Go spec/language sense) might at a given time have multiple bit patterns at the machine implementation level, and so this complicates the implementation of Go pointer equality. However, we seem to have different interpretations of the spec's
.
So at the very least I think it's worth deciding/clarifying which interpretation of the spec is intended. |
The problem with
is that it can be viewed as making assumptions about atomicity with respect to garbage collection. The issue is not rules of integer equality, it is instead about what GC activity might occur between evaluating I'd be far happier if the runtime had a variant of kortschak's To someone who's worked on GC (and optimizers, and concurrency) it is extremely nervous-making to see people so happy to assign precise semantics to unsafe code (and generally, so happy to use unsafe code). |
Technically I think we understand each other. Short term expanding the semantics of unsafe will lead to more and Long term expanding the semantics of unsafe will lead to technical debt and I think we should spend some time and see if we can come up with solutions On Sun, Nov 29, 2015 at 2:57 PM, Matthew Dempsky [email protected]
|
@dr2chase Arguably there are atomicity constraints even in expressions like If adding more primitive operations is on the table, I'd suggest something like:
So for example, Then @kortschak's offset can be implemented as:
|
@mdempsky It's not clear to me that unsafe.Add was rejected. It just hasn't been accepted. |
I think the preferred approach is don't use unsafe. |
@dr2chase By corresponding inverse, do you mean the |
I'm not a huge fan of Base because it implies that the runtime knows the answer and can compute it in a timely fashion. I think that's mostly true today but might not be in some possible GC implementations. For instance, if p is a pointer into a global variable, we don't have an efficient way right now to determine, given p, the starting point of the containing global variable. It also doesn't work on non-Go-managed memory, which will be confusing. If a function gets two slices and uses Base to determine aliasing, why should it work for slices whose backing store is allocated by Go and not for backing stores allocated by mmap? |
@randall77 Fair points! So perhaps tying the wording to "Go variables" is wrong. Instead something like "indivisible block of memory" or whatever appropriately describes the granularity at which the GC may rearrange memory. Also, for pointers to objects that will never be moved by the GC (e.g., global Go variables, or unmanaged memory), That would allow offset (as written above) to work for slices whose backing store is allocated by mmap, because their base pointers would be equal (both Downside is it would also work for slices of even separate mmap allocations (or of an mmap allocation and a Go global variable), but at least since those objects aren't going to move around in memory anyway, I don't think it could cause problems? (Though I'll continue thinking about that.) Also, I don't see how Go could be expected to reliably distinguish those cases anyway. |
Even for Go-allocated objects, I'm not sure we want to commit to Base. Imagine the following scenario. We allocate objects from a region using a bump pointer and record size&type info in a side buffer. So object start info is encoded in the side buffer, but not in any easily seekable way. Presumably GC would need to find object starts, but maybe it only does so in bulk - i.e. a whole region at a time. To answer an individual Base query would require a scan of the side buffer. I'm not saying a GC like that would ever work, but I'm concerned that by adopting Base we'd be preventing any such implementation. I'm with Minux that if we're trying to determine overlap, ask the runtime directly. I think this handles the common case, effectively the stride==1 case. unsafe.Overlap(p unsafe.Pointer, plen uintptr, q unsafe.Pointer, qlen uintptr) bool? The nice thing about this function is that the API is GC-agnostic (the implementation would have to understand the GC, of course). Maybe we handle stride>1 queries as the stride==1 query above together with the guarantee that uintptr(p)-uintptr(q) is always correct if p and q are in the same allocation? I'm less sure about whether this makes sense and/or is possible given a moving GC. |
This does not handle the block matrix use that we have.
Nor does this AFAICS. |
@kortschak, could you be more specific about what you're trying to do then? I'm afraid I don't understand what you mean in the second half of your description:
|
@randall77, probably the best thing to do is look at the docs linked above and the implementation of the checkOverlap methods in the code linked above. However, in brief we need to be able to detect overlap in views of a 2-d matrix. An overlap is when two matrices share an element in the backing store - this is more complex that just a trivial overlap as described in minux's first proposal and also more complex than a stride>1 overlap, since a description of a matrix is |
I think you could do it with:
|
I don't see how that helps. The issue is that the test for a block overlap requires that arithmetic be performed on an offset, specifically the test in rectanglesOverlap. You are right that it simplifies these two conditions, but that is a minor component of the test. |
But my code is computing an offset you can use. Isn't it exactly what your offset call returns? (Modulo size of the base type.) Just pass |
Thanks Ian. I will refrain from doing this so to avoid the “I told you so” moment :) However, please consider guaranteeing this for this type of usage. I’ve coded a free list over a single byte slice consisting of byte-records where the “next pointers” (record offsets) are encoded within the records themselves and records are removed from the list as reslices. Having a reliable way to do pointer subtraction between a reslice and the initial byte slice would have allowed me to stay with an Api for returning back records (reslices) which accepts just the reslice itself without having to maintain an additional “record offset” context adjacent (or encoded within) each record (reslice). |
@kortschak Re-reviewing this thread, in your earlier comment, you mention "We want to return whether there is an Is this still the main/only motivating case within the Go BLAS libraries, or are there others to consider? Also, can you link us to the code you're currently using to do this? (I imagine it's unchanged, but the code you linked here says it's been archived, so I wanted to double check.)
I think this concern is obsolete? App Engine has supported package unsafe for a while now, and cmd/compile subsequently removed the |
Thanks @mdempsky. The code now lives here and here.
This is still the concern.
This is good to know. I think we will conitinue to provide the safe option, but separate that from the appengine case. Is this documented somewhere? |
Do you mean App Engine support for package unsafe? If so, surprisingly, it doesn't seem to be documented anywhere officially. You can certainly find mention of this, like here: https://news.ycombinator.com/item?id=18230938 ("You can even import 'unsafe'!"). But even the App Engine KB still suggests "unsafe" is a reason a program might not work on App Engine (though perhaps "work" is meant more broadly than "compile"). It's been supported by App Engine since the Go runtime switched to using gVisor, which was the Go 1.11 runtime. According to Long-term support for legacy runtimes, it sounds like projects using older Go runtimes are going to be migrated to Go 1.11, so they'll have access to package "unsafe" too. |
Yes, that's what I was referring to. It's not that surprising. My experience with appengine has been pretty much generally that the documentation is not so good. The reason I ask, is because we test with the "appengine" tag on the basis that it's a synonym of "safe". If we differentiate the two, it would be good to know if using the "appengine" tag is essentially the same as no tag for the purposes of our code. |
There is no longer any available way to deploy on App Engine that rejects use of unsafe (Go 1.11 runtime is the minimum one available now), and the -u compiler option is gone. You don't need to worry about unsafe not being available in standard binaries. (Not sure about wasm?) |
Thanks. We will go through and remove the |
I’m confused. Given @rsc recent comment #41049 (comment), in particular pointing out a similar usage in aliasing.go I feel compelled to ask why should I consider the sample above #12445 (comment) to be “unsafe” when a similar calculation (which I figure depends on the same guarantees) is considered “safe” by the go team ? |
Ian said above that it's not, strictly speaking, guaranteed to work on all possible Go implementations. |
I don't think there is any contradiction here. The standard library sometimes does unsafe things. If the implementations change, then the standard library changes. The code in crypto/internal/subtle isn't considered to be safe. It's just considered to work with the current implementation. And that is what I said about your code as well. The difference is that if we change the implementation we will fix crypto/internal/subtle, but we won't fix your code. |
@rsc one follow up on the appengine environment. Will it allow non-stdlib assembly code, or should we protect that? |
@kortschak As I understand it, Go 1.11+ on App Engine is just running inside gVisor, which emulates a typical Linux VM. So I would expect assembly, cgo, etc. to all function normally like with non-App-Engine builds. Evidently App Engine's Go 1.11 runtime doesn't even set the |
Great. Thanks. |
That doesn't sound fair to me because it strongly suggests preferential treatment to package authors on the Go team in the sense they are allowed to benefit through deviating from the language spec and be assured that if things break, the code will get fixed one way or another where I as a Go user not only in inferior position to know what unspecified /prohibited language usage "actually works" with the current implementation, but have no way to be informed when such "unsafe" usage is expected to stop working, let alone taking on the risk of not being able to find a workaround in such case. In short, what's good for package authors on the Go team should be good for all external package authors which in this instance implies that you really should fix these types of expressions if they break in the future. |
My intent was not to challenge the truthiness of your statements but rather point out an "illegal" example I did not expect to see in the Go library asking how you would deal with such similar expressions in user code if they ever break. |
It is preferential treatment for the Go standard library. That is not the same as authors on the Go team. Many people not on the Go team have made changes to the Go standard library. Most code written by people on the Go team is not in the Go standard library. In any case I think the whole point of this open issue is to find some way to make this safe for any Go program. |
Perhaps a similar approach to how this issue is handled for syscalls is sufficient?
(*) actually, it would be nice to have f func(a ...uintptr), but it seems Go doesn't support passing on variadic arguments |
unsafe.Fix? If the only extension needed here is for determining slice overlap, I think an (Also, Go does support passing on variadic arguments. See the last two paragraphs of "Passing arguments to ... parameters".) Pointer subtraction. I think we should define That would guarantee that if It would also guarantee that if This would make gonum's code correct as-is. The crypto/internal/subtle code though would need revision to something like:
Pointer comparison? It might be tempting to codify additional operations beyond just subtraction; for example, However, guaranteeing unsafe.Overlap? It was also previously suggested to add an It was also suggested earlier to allow pointer subtraction, but only for pointers into the same variable. However, that still requires pointer subtraction to be GC-atomic, at least for pointers into the same variable. Moreover, it would require users to perform two separate GC-atomic operations (test for overlap, then subtraction), whereas gonum's code currently only requires one. |
Here is another case: #12445 (comment)
Why does the following error on playground ?: https://play.golang.org/p/AIIBrJr86QS
I think unsafe.Fix provides a general solution so you don't need to "solve" for each particular usage. |
https://play.golang.org/p/9CStA2kerRX for how to pass on slices to variadic parameters. The |
Good to know, but this requires a mental leap thinking about the caller variadic parameter as a slice when passed
Why? you can explicitly control which objects are locked in memory and these are clearly apparent from the argument list, |
Additionally, we can have higher order api’s like AnyOverlap built on top of Fix having solutions both for the general case and one-liners for specific common cases . |
In that comment you said: "Having a reliable way to do pointer subtraction between a reslice and the initial byte slice would have allowed me to stay with an Api for returning back records (reslices) which accepts just the reslice itself without having to maintain an additional “record offset” context adjacent (or encoded within) each record (reslice)." I don't follow what you mean by the rest of that sentence, but it sounds to me like the same slice overlap/offset issue, which I addressed in my previous comment (with pointer subtraction even). If you don't think that's the case, it would be helpful to know what code you're currently writing instead, and what code you would like to write (e.g., using Notably, your unsafe.Fix example above is simply pointer subtraction too.
This is getting off-topic. Please see https://github.com/golang/go/wiki/Questions.
Because pinning memory is undesirable. It constrains the runtime, and we prefer to avoid that if possible. We pin memory for cgo and syscalls out of necessity. Here there's not yet any evident need for pinning memory. Pointer subtraction seems adequate for the use cases enumerated so far. |
You wrote:
I read this as an ask for additional programing scenarios which would benefit from fixing objects in memory, thus my free list example.
Yes, and it's a key requirement for the free-list implementation I attempted to describe above. If I can determine the reslice (e.g node) position in the free list (e.g source slice) simply by subtracting it's pointer value from the base pointer value of the free list slice then I have it's "address" in the free-list. I can then use it to replace the current address of the next first element in the list.
(1) My comment was in response to the argument made that the Fix API is "unnecessarily complicated".
The Fix abstraction was introduced using a simple example in order to illustrate a general approach which addresses more complex cases like the one you described under Pointer comparison? where multiple distinct objects need to be locked atomically. |
I need some sort of overlap detection for #60138. |
For background see discussion at https://groups.google.com/d/topic/golang-nuts/Uet5p_3JhZs/discussion
Currently it is possible to determine the offset between two values' addresses by finding the difference between uintptrs converted from unsafe.Pointer values. This is the only way to be able to determine whether two slices overlap in memory since the advent of three index slicing and the only way to determine in less than linear time whether blocked sparse slices overlap (relevant for views in constructed 2D slices for e.g. BLAS operations). In the event that a moving GC is implemented this may no longer be safe; if a GC move occurs between taking the conversion to uintptr of address of the first and second values, the offset may be incorrect. Note that the GC move does not otherwise invalidate the offset between overlapping slices or falsely result in non-overlapping being construed as overlapping.
The text was updated successfully, but these errors were encountered: