-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/go: make package lookups in vendor directories case-insensitive on case-sensitive filesystems #38342
Comments
We probably can't escape paths in the vendor directory the same we do in the module cache. That would break compatibility with However, we should provide a clear diagnostic when this comes up. A module that names itself |
I had not fully considered the ramifications of First, I can create a valid package, not even module, that hard breaks with Second the import path acts as both an identifier and access control. Any paths that contain case can be exploited to inject code or break packages. This is doubly scary for hosting providers and module proxies that are case sensitive, since username collision is possible. This level of divergence from the spec seems problematic; and while I agree that fixing things now is not going to be trivial, because we intend to drag If we intend to be backwards compatible we need a new directory, and a resolution scheme, assuming both exist. For the former, I propose Next we need a resolution/migration scheme. I think it is sane to have all new clients write to the new path, I will assume
EDIT: |
I don't think any functional change (aside from improved diagnostics) should be made here. A clarification though: neither GOPATH nor modules are part of the Go language specification. They're just part of GOPATH has a lot of problems. Unpredictable versions, case-insensitive collisions, unicode normalization, symbolic link cycles, and other issues have always been there. Modules set out to solve these problems and many others. Since GOPATH will be deprecated eventually (not in 1.15, but eventually), let's focus on making improvements to modules. Vendor directories the only mechanism that packages in GOPATH can use to ensure they have the correct versions of their dependencies. So let's just make sure |
That is fair, I do frequently conflate the two. Is there anywhere that defines the specification for import paths, as defined by
This seems reasonable; I will strike through that part of my previous comment, as it is not as important to me either. This may be a different discussion, but it seems well past the point that
Does that not speak to the importance of making
I should get clarification: is the Assuming
This seems like a really jarring ux:
There are also edge cases to consider:
|
That sounds like a bad interaction of a number of factors:
|
Don't. Stick to the original case, forever. (We need a better story for that, but that's the answer today.) |
The issue here is not about the mounting the vendor directory, that works correctly per docker/for-mac#320, but
To confirm,
I really dislike the message we are sending. Organizations and repos store more than Go code. There are a myriad of reasons that rebranding occurs. I do not think it is our place as a language to dictate such conventions. This is not the case today, you can have two modules that share different case import paths. The problem comes about because the Furthermore we are colliding separate identities on hosting providers that are case sensitive. This seems like a security issue. EDIT: Reversed my stance on the upstream ticket. Not unwilling, just failed to provide full context. |
Yes. I would argue that it there is no portable alternative:
|
I did say we need a better story on that. Feel free to open a separate issue. 🙂 (It's closely related to #30831.) |
Submit a change to fix one (or both). If they disagree on the domain name portion of the path, use all-lower-case: that is the overwhelming convention, and domain names are explicitly case-insensitive. If they disagree on other portions... find the person or organization who owns the common prefix and ask them which variant is canonical. (Or — better still! — look at existing users and try to do whatever will be least disruptive for them.) If you can't get the owners of those dependencies to agree, then drop one or the other dependency. (Do you really want to depend on a module whose owner is willing to leave users in a broken state over a path-casing disagreement?) |
I agree today we do not do the right thing, and a sane first step could be to make Also, sorry for not being more clear: the intent of those questions was more rhetorical, to illustrate that there are a myriad of edge cases to this approach. And while your responses are sane, they illustrate that resolving this, while maintaining the status quo is non-trivial. Would it not be easier to fix the bug, and keep the ux? |
Case-mismatched collisions are rare in practice. Have you run into many of these issues in practice? Addressing such a rare use-case does not seem worth the complexity of defining and maintaining a new, case-encoding |
I am sorry for driving this, but think some of this discussion seems to be missing the forest for the trees; the core of my concerns centre around a few points:
|
Yes, but only for
Yes. That's why
Not necessarily: the
Not to my knowledge, but if you believe you've found one please send a report per https://golang.org/security. I believe that the only hazards today are:
|
Do we expect to support things like Dockerfile
So, vendoring
Since Or phrased as a hazard:
This was not something I considered, since module vendoring completely shadows and verifies this via |
This is actually more widespread than I thought. Based upon all the modules in
|
I should have clarified this earlier, but this statement, to my visibility, is patently untrue. Problems with go-get and auth, see #26232, or appengine private module container builds, lead to |
I discussed this with @jayconrod, @matloob, and @rsc this afternoon, and here's what we came up with. As demonstrated in https://play.golang.org/p/_KZirMBFC1l, we already disallow case-insensitive collisions for individual package paths (at least in module mode). So we know that each individual package has at most one canonical casing. Furthermore, in module mode with Similarly, given the absence of package collisions we can implement a “case-insensitive” lookup in a case-collapsed Given the relative rarity of these collisions, we don't have a concrete plan to implement this approach in the foreseeable future, but if you'd like to contribute an implementation (with tests!) we'd be happy to review. |
Interesting, I was not aware that the toolchain disallowed case-insensitive collisions for individual package paths.
I am happy to, can you point me at the affected symbols/packages? (Unless you think Just stating the failure modes and outcomes this fix will provide for confirmation:
|
Playing with some variations; this seems to work though; https://play.golang.org/p/SIFKKkS9XHa (perhaps intentional?). (arrived here because I saw this issue linked to docker/for-mac#320) |
Yes, this is expected: (excepting my outstanding concerns)
|
There is a block that would need to change for the forward lookup: go/src/cmd/go/internal/modload/import.go Lines 137 to 153 in b0da26a
And one for the reverse lookup: go/src/cmd/go/internal/modload/load.go Lines 259 to 261 in b0da26a
(There may be other places, too, but those are the entry-points that come to mind.) |
Probably not. We've only had one report of an affected user (you), and there seems to be a relatively straightforward workaround: namely, build in module mode without |
Looks like CL 7314104, early 2013.
I'm not sure! Perhaps to avoid module adopters accidentally breaking GOPATH-mode users by adding imports with the wrong casing? |
I am happy to do the review as part of a CL, but thought that an early pass here may be easier, since it is currently a single insertion. To pin down requirements, I would like to confirm what behaviours we need to support. My assumption was that we needed to allow for the application of any casing strategy, (alllowercase, CamelCase, ALLUPPERCASE, etc.), since some case insensitive file systems may not even store case metadata, meaning that checking diff --git a/src/cmd/go/internal/modload/import.go b/src/cmd/go/internal/modload/import.go
index 162c29d2a6..d2d1fa1f1a 100644
--- a/src/cmd/go/internal/modload/import.go
+++ b/src/cmd/go/internal/modload/import.go
@@ -9,11 +9,13 @@ import (
"fmt"
"go/build"
"internal/goroot"
+ "math"
"os"
"path/filepath"
"sort"
"strings"
"time"
+ "unicode"
"cmd/go/internal/cfg"
"cmd/go/internal/load"
@@ -337,6 +339,24 @@ func dirInModule(path, mpath, mdir string, isLocal bool) (dir string, haveGoFile
dir = mdir
} else if mpath == "" { // vendor directory
dir = filepath.Join(mdir, path)
+ if _, err := os.Stat(string(path)); err != nil { // this is only needed when the import path does not exist in the vendor directory because of case manipulation
+ rpath := []rune(path)
+ bits := int(math.Pow(2, float64(len(rpath))))
+ for i := 0; i < bits; i++ {
+ for j := 0; j < len(rpath); j++ {
+ if i>>j%2 == 1 {
+ rpath[j] = unicode.ToUpper(rpath[j])
+ } else {
+ rpath[j] = unicode.ToLower(rpath[j])
+ }
+ }
+ cdir := filepath.Join(mdir, string(rpath))
+ if _, err := os.Stat(cdir); err == nil {
+ dir = cdir
+ break
+ }
+ }
+ }
} else if len(path) > len(mpath) && path[len(mpath)] == '/' && path[:len(mpath)] == mpath {
dir = filepath.Join(mdir, path[len(mpath)+1:])
} else { After this is merged, and GOPATH is sunset, there is also an open question of if the toolchain, specifically |
Instead of probing each sequence of runes, I think we should use If need be, we could cache those reads (perhaps in a Using the actual directories found, we may end up with more than one possible search path, but that's ok: we can keep all possible search paths and simply try to extend each candidate for every element. (There will typically be only one possible extension, but may in general be multiples — particularly if the |
I would like to clarify what we intent to support:
|
Do version control systems report changes due to directory collapse? It would be interesting to see what happens if you run I suspect that that can result in a mixed-case |
I was working off the understanding that Git fails on collisions. This is not true. For directories, it happily collides them without notifying the user what happened. For file collisions, the action succeeds, but prints a warning:
Thankfully the toolchain prevents file collisions from being too much of an issue on case sensitive file systems:
Unfortunately, the lack of any warning for folders means we can be left with a very segregated vendor directory: (from b512ce0e728f45799c94fed649aa31e5, see commit history for details)
I think the prospect of having to merge different directories together, is enough work to justify considering the alternate vendor directory spec, but regardless further discussion is warranted. |
I think we can and should support such a directory. I have an algorithm in mind, but I ran out of bandwidth for the day. I'll try to post a sketch later this week. |
I agree we should support this ux, checking in 0] There may be an argument to never checking in |
For the record, the toolchain enforces that domains are /cc @jayconrod |
Right. That's good. Less opportunity for this problem to happen. I'm going to back away from this discussion for now. I'll keep following, but at the moment I don't have bandwidth to do research that would be useful for active discussion. |
Totally fair, my goal was just to close that thread, not unnecessarily drag you back into this; sorry for the hassle. |
Ok, I think I failed to understand the original comment about merging multiple directories. I don't think that we should support merging two distinct case-equivalent directories that both contain However, I would argue that such a directory represents a defect in the VCS, not in the So, let's reject that case explicitly, and make this work for the reasonable one: https://play.golang.org/p/FKDojBNKpgx |
Sounds pragmatic; I am onboard. How smart should the toolchain be about detecting and altering on issues? I have a fix that is pretty simple, but it is going to yield in cryptic errors. My only fear is that plumbing the correct error messages may not be possible or at least feasible. |
Good question. The most conservative approach would be to do a full filesystem walk as soon as we load The fastest approach would be to only report an ambiguous package if the exact-case lookup for that package fails, but that would mean that we rarely (if ever) actually diagnose the problem in practice: the lookup won't fail on case-insensitive filesystems (because the directories are merged), and likely won't fail on case-sensitive ones in a lot of cases either (because the case-sensitive lookup will find a package, just an incomplete one). On the other hand, those cases will usually result in a build break — they only won't break the build if the case-collided files contain unused declarations (which don't matter anyway) and/or |
The mapping of capital letter to lowercase prefixed by Cases for a repository seem to create all sorts of issues beside potential collisions
From a theoretical perspective, package paths are unique which prevents collisions. This escape system to avoid enforcing rules of all lowercase seems to reduce the ease of portability. |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
Case-conflicting import paths cause issues when using `go mod vendor` as the layout on case-sensitive file systems is different from that on case-insensitive file system. This is a long-known issue with the Go toolchain, however fixing this is non-trivial and could have undesirable consequences. Since the capitalized form of `DataDog` (sic) is most common today, renaming the `orchestrion` module is the most pragramtic solution today. Caused-By golang/go#38342 Fixes #287
Case-conflicting import paths cause issues when using `go mod vendor` as the layout on case-sensitive file systems is different from that on case-insensitive file system. This is a long-known issue with the Go toolchain, however fixing this is non-trivial and could have undesirable consequences. Since the capitalized form of `DataDog` (sic) is most common today, renaming the `orchestrion` module is the most pragramtic solution today. Caused-By golang/go#38342 Fixes #287
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
go mod vendor
What happened?
When vendoring on MacOS, files are written with case sensitivity to a case insensitive filesystem. This can cause two issues, but is only present for vendoring, not the module cache, since modules use
!
prefixes to make the module cache case insensitive.Collision
Because import paths are case sensitive, in the event that two packages share case insensitive , but not case sensitive, prefixes collision can occur. This is infrequent due to
alllowercaseoneword
package name conventions, and github username and repo name fields being case insensitive (but with case metadata), however the it is a bug and I can provide code samples to produce compilation failures.Portability
This issue was discovered when sharing the vendor directory between systems that have different case sensitivity (MacOS to Linux) and does not require package collision to reproduce.
Given two modules share a case insensitive prefix, say
github.com/Golang/exp
andjackfan.us.kg/golang/tools
. On MacOS, they will both be written tojackfan.us.kg/[Gg]olang
and the first writer will pick the case. Builds work fine because bothjackfan.us.kg/Golang
andjackfan.us.kg/golang
exist, but when you mount thevendor
directory in a docker containerjackfan.us.kg/Golang
stops existing. This also happens for other data sharing mechasisms between filesystems likecp
,tar
,git
, etc.The text was updated successfully, but these errors were encountered: