-
-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC 0107] Nixpkgs version
attribute usage normalization
#107
Conversation
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/brainstorming-for-rfc-pname-and-version/12873/102 |
version
attribute usage normalization
I don't really like |
Using
The idea is using a
I am striving to make this specification VCS-agnostic. If e.g. the program team changes the VCS, it will not affect |
I don't think that is a problem 0.0 is a very unpopular version name, the only times i've seen a 0.0.0 (or anything alike) is for name squatting, which are essentially empty packages that shouldn't be packaged in repositories in the case of |
If we use |
rfcs/0107-version-normalization.md
Outdated
this order: | ||
- the version of the latest labeled snapshot, as defined above; | ||
- If the project never released a labeled snapshot, `0_0` should be used | ||
as default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the change from 0_0
to i.j
impact lexicographical ordering?
I'm thinking most (all?) systems probably use ASCII values to sort, but it'd be good to both research this and explicitly consider it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version ordering is the same as used by Nix itself, being registered in the nix-env manpage.
I am treating it as previous knowledge here.
The idea of 0_0
is that, being an uncommon numbering scheme (underscores are not employed in Nixpkgs version attrs, as far as I remember), it will not clash anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering about systems other than Nix as well, as other programs operating with nixpkgs data may be using the version attribute for their own purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I see in the nix repl 0_0
sorts as greater ('newer') than for example 0.1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also against 0_0
. It is not very intuitive for readers. If we need to put a number to satisfy parseDrvName
I think we should just use 0
or something more obvious like 0unreleased
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"0_0" is [0, "_", 0] thus smaller than "0.0" but larger than "0"
Lexicographic comparisons applies to string components after splitting.
"0.pre" seems to be smaller than anything not starting with "0.pre.pre" (dot after zero is optional). I hope that "0.pre.pre" upstream version will not be a frequent issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the man page does not mention this but both .
and -
are special-cased separators that don't become part of any component! I had assumed the .
was part of the evaluation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
holy cow. just use sort -V
aka sort --version-sort
printf "%s\n" 0.0.0 0.0.1 0.0 0_0 1.0 0.1 2.0 \
1.0-unstable-2023-02-04 1.0+unstable-2023-02-04 \
1.0-unstable.2023-02-04 1.0+unstable.2023-02-04 \
| sort -V
0.0
0.0.0
0.0.1
0.1
0_0
1.0
1.0+unstable-2023-02-04
1.0+unstable.2023-02-04
1.0-unstable-2023-02-04
1.0-unstable.2023-02-04
2.0
nix should produce the same sort order, otherwise i would call that a nix bug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nix does not produce the same order, pre
is special-cased to sort earlier than in sort -V
, and this is not a bug but a reasonable decision, compatibility with which we want to keep to avoid globally-breaking changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aah, makes sense
builtins.sort (a: b: (builtins.compareVersions a b) < 0) [ "000" "00" "0_0" "0" "0.0" "1.0" "1.0pre" "1.0-pre" "1.0post" "1.0-post" "1.0-unstable-2023-02-04" "1.0-unstable.2023-02-04" ]
[ "000" "00" "0" "0_0" "0.0" "1.0pre" "1.0-pre" "1.0" "1.0post" "1.0-post" "1.0-unstable-2023-02-04" "1.0-unstable.2023-02-04" ]
The usage of only |
Both are extremely unpopular, 0_0 is a less conventional format but it can still exist in the wild, and I still personally prefer using
I like this idea. it looked a little weird to me initially but it is growing on me. regex can still be used if we end up with the semver variation
I agree. I removed
I don't think it's a problem for |
It was an idea @7c6f434c pointed me out in the RFC brainstorm. I didn't find a clear wording to this at that time. On the other hand, I think that it should not be encoded directly in Suppose we have a program, I don' think this situation should be encoded in |
My understanding is that the case you are describing is for packages that are present in nixpkgs in both variants - would it be confusing (for humans or programs) to change the pname for a single package, possibly even in quick succession? For example, we may want to switch to a newer (unreleased) commit of a stable branch because it includes some critical fix but not imply with that that we are now following the development branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for bringing this up. It is important to have some documented conventions around this.
I suspect it will be interesting to bikeshed for a while but at some point it will be best to cut the discussion at what we have and if needed future RFCs can change the format if they have a strong reason to do so.
rfcs/0107-version-normalization.md
Outdated
this order: | ||
- the version of the latest labeled snapshot, as defined above; | ||
- If the project never released a labeled snapshot, `0_0` should be used | ||
as default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also against 0_0
. It is not very intuitive for readers. If we need to put a number to satisfy parseDrvName
I think we should just use 0
or something more obvious like 0unreleased
This brings me to another finer point about lexicographic ordering: What DO we do about pre-releases? Using the previous version number is misleading, and using a suffix will sort them as newer than the actual release. I think maintaining lexicographic ordering in line with chronology is worth it to avoid subtle bugs in version checks. If we do not do that, we should maintain very explicit rules about what is allowed and how to sort it. |
Sorry if I'm being very particular here, I just think we have a chance to get all the version discussions behind us at once and for all, and while things are still nicely fluid as well. I really appreciate this RFC rationalizing the whole affair and would like to maximize that effect (to a reasonable degree). |
rfcs/0107-version-normalization.md
Outdated
- Branch denotes a logical sequence of snapshots, as identified by the program's | ||
team; | ||
- Usually these branches are denoted by names such like `stable`, `master`, | ||
`unstable`, `trunk`, `experimental`, `staging` etc.; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or by major version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A major version is more akin to a labeled snapshot, not a logical sequence of them. Indeed, major versions don't necessarily follow a continuous sequence. An example is Zig: between 0.8.0 and 0.8.1 there are many small, discrete snapshots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These only differ in patchlevel anyway.
I mean that PostgreSQL has branches 13.x and 14.x and some more.
Linux kernel, annoyingly, has stable branches differing just in minor version (although major.minor is just formatting for Linux kernel versions). So maybe a branch can be called by a version prefix instead of a name?
I think at least Nix does sort |
Good point. We could canonize this for minimal change (but it means breaking with lexicographical ordering in both the Nix language and outside consumers) as an example of the kind of explicit rules I mentioned; or think about an alternative that universally preserves correct ordering. The only ones I can think of require a mandatory suffix for all versions, so that isn't very attractive either. |
e2d53ea
to
250e5be
Compare
One issue that I don't think has been discussed is that some ecosystems of packages (Python libraries, as a running example) may have their own definition of what constitutes a valid version string (PEP 440) and tools that expect versions to conform to that definition. I've seen comments to the effect that it's useful to end users if they see versions that match the versions used by upstream/in other package managers; I'm piping up to note that it's also useful to automated tools ( Taking a step back: this RFC has seen a lot of discussion and the most recent discussion seems to be in the direction of designing the one version string format to rule them all, which strikes me as a poor use of everyone's time—it's been tried by so many others, and no true consensus has emerged; with no disparagement meant to any of the intelligent folks commenting here, why should any of us believe that we're situated to do better? Given the broad agreement that the status quo is bad, and the absence of a consensus on what is best, I propose restructuring this RFC as something like this:
This would surely be better than the status quo, would keep automated tools and our relationship with Repology working, would keep Nix store paths friendly to globbers, would keep versions as strings without making those strings have to look like data structures encoded as URL query strings, would give individual special packages ways to continue to be special without amending Nixpkgs-wide policy again, and would provide a clear path forward if the recommended version scheme needs to evolve. |
Taking a step back: this RFC has seen a lot of discussion and the most recent discussion seems to be in the direction of designing the one version string format to rule them all, which strikes me as a poor use of everyone's time—it's been tried by so many others, and no true consensus has emerged; with no disparagement meant to any of the intelligent folks commenting here, why should any of us believe that we're situated to do better?
Given the broad agreement that the status quo is bad, and the absence of a consensus on what is best, I propose restructuring this RFC as something like this:
* The version number of a package, stable or unstable, *should* conform to the nomenclature used by upstream, with any initial non-digits removed.
* A `passthru.compareVersion` attribute, if it exists, should be a function that can compare a provided version string to this package's version; this is for Nix tools.
* We should `lib` some common schemes to use here, so maintainers can write `passthru.compareVersion = lib.versions.compareWithSemver version;` for standard cases.
* Recommend a version scheme for use as a default if the upstream nomenclature is underspecified or somehow technically incompatible with Nix's versioning requirements—this is where we can bikeshed about `unstable` or `post` or `patched` or whatever, but it's just a recommendation so maintainers can use their judgment with edge cases, and we don't have to spend years talking about it.
This proposal misses the main motivation of this RFC: some agreement how to append Nixpkgs-specific extensions to an upstream version when even the notion of an upstream version is not well-defined for some snapshot.
…This would surely be better than the status quo, would keep automated tools and our relationship with Repology working, would keep Nix store paths friendly to globbers, would keep versions as strings without making those strings have to look like data structures encoded as URL query strings, would give individual special packages ways to continue to be special without amending Nixpkgs-wide policy again, and would provide a clear path forward if the recommended version scheme needs to evolve.
|
If that's the main motivation, then I missed it. The RFC as written cites three motivations:
It does not appear to be a motivation that Nixpkgs-specific extensions to the version need to be formatted in the same way across all packages, and if comparison logic can be customized then I don't see why this would be necessary (though I could be missing something). As an outside observer, it seems to me that this is what has caused this proposal to stagnate, as people come up with different edge cases that could pose difficulties for different formats and use cases. If it stops us making progress and we don't need it, let's stop trying to achieve it—choose a good-enough default for the common case, allow maintainers to solve the problem in other ways if the need arises, and start reaping the benefits of resolving the cited motivations instead of letting them linger on indefinitely. |
I am no longer sure there is a way to specify that we use an untagged snapshot that is accepted by everyone as «well, that's the most readable we will actually get» and at the same time is clearly safe for rev-deps (where things go wrong without always being obviuous to the maintainer of the package where the version is set) |
Precisely my point. If it seemed to me that such a specification were imminent I would keep my thoughts to myself and trust the process. But if you and I agree that making everyone happy doesn't seem likely, why not go with a recommendation that makes some people happy, leaving enough flexibility for maintainers to define their own version comparison semantics when issues arise? I haven't seen anyone arguing for keeping the status quo; almost anything you do here that resolves the stated motivations would be some sort of improvement on net, even if there remain people who think that a slightly different proposal would be better still. The potential for harm here isn't worth spending years in indecision.
If this is an important point that I'm missing, my apologies, I would need it spelled out more than this to address it. |
> I am no longer sure there is a way to specify that we use an untagged snapshot that is accepted by everyone
Precisely my point. If it seemed to me that such a specification were imminent I would keep my thoughts to myself and trust the process. But if you and I agree that making everyone happy doesn't seem likely, why not go with a recommendation that makes some people happy, leaving enough flexibility for maintainers to define their own version comparison semantics when issues arise? I haven't seen anyone arguing for keeping the status quo; almost anything you do here that resolves the stated motivations would be some sort of improvement on net, even if there remain people who think that a slightly different proposal would be better still. The potential for harm here isn't worth spending years in indecision.
Given that people have strongly objected to any word choice proposed (yes, sometimes including me, I will not support the idea that something that upstream _calls_ nightly releases is «unreleased»), I am not even sure we can build a shepherd team consensus around any set of wording markers.
Version comparison function is a very different proposal and doing it here without addressing the handling of untagged snapshots would be hijacking, in my opinion. Doing it as a separate RFC is fine, of course (well, whether it will get stuck is a different question).
> and at the same time is clearly safe for rev-deps
If this is an important point that I'm missing, my apologies, I would need it spelled out more than this to address it.
We have seen some weird issues with using + as a separator. And Nix doesn't give us _that_ many characters we can put into the derivation names without gotchas.
|
Look, as the man said, if you choose not to decide, you still have made a choice. If the shepherd team can't find a way to move forward, that is equivalent to the shepherd team choosing the status quo: What are the objections to that? That version scheme doesn't handle every corner case we can throw at it? It handles more than the status quo does. Don't like the word ‘unstable’? That's what the status quo uses. It fully resolves the first two of the three issues cited as motivations for this RFC, and maybe it isn't perfect, but after some amount of time for deliberation you have to stop letting the perfect oppose the good, and I think the appropriate amount of time in this case is probably less than two years, don't you? At this point, it is my humble opinion that the shepherd team should stop entertaining (or, I guess, raising?) objections to progress that apply equally to the status quo, because the status quo is the consensus they are choosing through their inaction.
I saw some discussion upthread of making versions structured values instead of strings and implementing version comparison with typeclass-style polymorphism, and if that's part of the conversation then I really don't see how my suggestion is hijacking. But maybe you think that was hijacky too and just didn't object to it at the time, I don't know. I included it because the third motivation in this RFC seems difficult to properly address without either a versioning scheme that somehow includes all others as subsets (highly unlikely if not outright impossible) or per-package customization. But fine, if you think that's out of scope I think we still have a proposal that is an improvement over the status quo and hits two out of three. |
Ryan Hendrickson ***@***.***> writes:
> I am not even sure we can build a shepherd team consensus around any set of wording markers.
Look, as the man said, if you choose not to decide, you still have made a choice. If the shepherd team can't find a way to move forward, that is equivalent to the shepherd team choosing the status quo: `unstable-YYYY-MM-DD` for snapshots. We know this is technically flawed because it doesn't start with a digit so `parseDrvName` doesn't like it, and it doesn't sort with stable versions the way we'd want. So, do the closest thing that resolves those issues: use `X.Y.Z-unstable-YYYY-MM-DD` for a snapshot if it succeeds a stable release of `X.Y.Z`, and `0-unstable-YYYY-MM-DD` if there is no preceding stable release. You are sad because according to current version sorting rules `2.11-unstable-2018-04-01` is greater than `2.11.1`? Then change the default version sorting rules to recognize `-unstable-` as a super-separator, so that versions are compared as if they were parsed into a pair `(StableVersion, Maybe Text)` with the natural lexicographic ordering, using the existing version comparison logic on the `StableVersion` part. (This is way smaller in scope than what the RFC as written proposes, intentionally; sorry if that means this counts as ‘hijacking’ but clearly the bigger thing isn't moving.)
What are the objections to that? That version scheme doesn't handle every corner case we can throw at it? It handles more than the status quo does. Don't like the word ‘unstable’? That's what the status quo uses. It fully resolves the first two of the three issues cited as motivations for this RFC, and maybe it isn't perfect, but after some amount of time for deliberation you have to stop letting the perfect oppose the good, and I think the appropriate amount of time in this case is probably less than two years, don't you? At this point, it is my humble opinion that the shepherd team should stop entertaining (or, I guess, raising?) objections to progress that apply equally to the status quo, because the status quo is the consensus they are choosing through their inaction.
This sounds like a much better change to me than the one the RFC proposes.
|
However, RFC choices are more authoritative so they should be held to at least some standard.
Not fully uniformly, and improving uniformity of an unfortunate thing is a loss. For most packages where it is used, it is not an absolute lie at least. The few where it is are free to just use a word which is actually a good idea, because it is just a prevailing practice, not well codified.
Getting a response from the author which of the proposal would be considered for incorporation (when there were multiple in parallel) doesn't always work here… and by process, this is a pretty blocking circumstance.
I was talking about hijacking in a context that completely omitted the part about snapshots not bearing an upstream version. The new proposal is not hijacking, sure.
If you put it like this (and given the timing precedent in the previous iterations of the discussion), it sounds like as different RFC with an author focused on getting no matter what as soon as possible should be opened (presumably by @rhendric ) For the case I personally care about, the change proposed is useless: if patch gating needs the expensive evaluation of custom code, that we can have now, and what (Also, neither «unstable» nor «unreleased» are good mandatory choices because they are incompatible with the language some upstreams use, so the status quo where you can change the word and get someone to merge before nitpickers arrive is better than fully codifying the sometimes-bad words) |
I suppose that's the crux, then; the rest of this stuff doesn't matter if the process prevents progress on the RFC without author participation. I'll start another one and we'll see how far that goes. |
Given that the original intent of this RFC was smashed by corner cases, we are at the step zero. Any idea will look good. Especially, a complex data structure for monotonic versioning instead of magic with strings. |
Given that many people are discussing the new RFC #147, I will give for this a grace period of 2 days and close it. |
I will close this in two weeks. |
This RFC proposes to normalize the
version
attribute used in Nixpkgs' expresions.Rendered!