Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0107] Nixpkgs version attribute usage normalization #107

Closed
wants to merge 19 commits into from
Closed
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
206 changes: 206 additions & 0 deletions rfcs/0107-version-normalization.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,206 @@
---
feature: nixpkgs_version_normalization
start-date: 2021-09-24
author: Anderson Torres
co-authors:
shepherd-team:
shepherd-leader:
related-issues:
---

# Summary
[summary]: #summary

Normalize the `version` attribute used in Nixpkgs' expresions.

# Motivation
[motivation]: #motivation

Nowadays, the most commonly used format for the `pname` and `version` attributes
along the Nixpkgs' expressions is:

- For stable releases:
- `pname` reflects the software name;
- `version` reflects the released version.
- For unstable releases:
- `pname` reflects the software name;
- `version` is a string following the format `unstable-YYYY-MM-DD`, where
`YYYY-MM-DD` denotes the date when the code was generated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not quite right. The date is the date of the commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am avoiding to use VCS (especially git)-loaded vocabulary.

Do you think "commit" is a better wording regardless of it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can use the date of the change or when the change was published?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording «date when the change was made or published» sounds better than code being generated to me, too.


This is a simple and easy-to-understand format.

However, it does not map very well with the Nix function
`builtins.parseDrvName`:

```example
# expected: { name = "mpv"; version = "0.35.15"; }
nix-repl> builtins.parseDrvName "mpv-0.35.15"
builtins.parseDrvName "mpv-0.35.15"
{ name = "mpv"; version = "0.35.15"; }

# expected: { name = "mpv"; version = "unstable-2021-05-03"; }
nix-repl> builtins.parseDrvName "mpv-unstable-2021-05-03"
builtins.parseDrvName "mpv-unstable-2021-05-03"
{ name = "mpv-unstable"; version = "2021-05-03"; }
```

It happens because the `version` attribute returned by `builtins.parseDrvName`
starts with a digit. This is not a bug, but a deliberate design decision of Nix
specification; therefore we should strive to follow it, neither circumventing
nor ignoring it.

That being said, `version` attribute should start with a digit.

Furthermore, `version` attribute should follow the semantics for upgrading as
stated in `nix-env` manpage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this motivation a bit lacking. From my experience there is quite strong consensus that nix-env and builtins.parseDrvName have some inherent flaws (which have already caused all sorts of problems, see NixOS/nixpkgs#114995, NixOS/nixpkgs#118481 and NixOS/nixos-homepage#672) and we should be getting rid of them as soon as feasible.

In that sense the RFC solves a problem that would be disappear anyway. Since the new Nix CLI would introduce a replacement fornix-env which hopefully doesn't exhibit these flaws.

On the other hand having a solution for the meantime could be interesting, since the new nix profile subcommand seems to heavily rely on flakes which means it can't be a stable replacement for nix-env in Nix 2.4. This is just speculation though, I'm not sure what @edolstra has planned here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the BDFL said there is no flaw in parseDrvName. It just splits name and version from a given string, returning a {name, version} set.

I don't think the examples given above point to a flaw in it.


This document describes a format suitable to fix this issue while striving for
simplicity.

# Detailed design
[design]: #detailed-design

## Terminology

_Disclaimer_: whereas some of the terms enumerated below are borrowed from
[pkgsrc guide](https://www.netbsd.org/docs/pkgsrc/) plus a bit of terminology
employed by git, this document aims to be general, not being overly attached to
them.

- Program denotes the piece of software to be fetched and processed via Nixpkgs.
- This term makes no distinction about the immediate or intended uses of the
program; it can range from non-interactive programs to full GUI
applications, even including filesets intended as input to other programs
(such as firmwares and fonts).
- Team denotes the maintainers of the program.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to map to the above linked guide? Otherwise if it is only relevant to this RFC we should replace team entirely with maintainer.

- This term makes no distinction among the various models of organization of a
team; it ranges from a solitary programmer to a business company - and
everything inbetween.
- Source denotes the origin of the program.
- This term makes no distinction among precompiled, binary-only or high-level
code distributions.
- Snapshot denotes the source of the program taken at a point of the time.
- Labeled snapshot denotes any snapshot explicitly labeled by its team.
- Unlabeled snapshot denotes any snapshot not explicitly labeled by its team.
- Release denotes any distributed snapshot, as defined by its team.
- Branch denotes a logical sequence of snapshots, as identified by the program's
team;
- Usually these branches are denoted by names such as `stable`, `master`,
`unstable`, `trunk`, `experimental`, `staging`, `X.Y` (where `X` and
`Y` are numbers) etc.

## Design

- For a labeled snapshot:
- `version` should be constituted of the version of the snapshot, as defined
by the program team, without any alphabetical characters (e.g. "v", "rel")
Copy link
Member

@davidak davidak Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is rel popular? i have never seen it. do you have exapmles?

what about versions like R12? can we add that here and in the examples? maybe replace rel

examples:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
by the program team, without any alphabetical characters (e.g. "v", "rel")
by the program team, without any alphabetical characters (e.g. "v", "rel", "R", "release")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is clear enough. The listing is not exaustive, only exemplificative.

But I can add it, no problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add R and remove rel. Then we have the most common as examples.

prepending it.
- Alphabetical characters following the first numerical character of version
(as defined above) should be maintained, except optionally those clearly
used as separators, in which case they are replaced by dots (emulating a
typical dot-separated version).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why allow replacing them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is ugly to use RELEASE_17_0_RC1 as version string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, that makes sense. Although in my experience, that sort of naming is usually just used for git/svn tags, not how upstream actually refers to their releases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndersonTorres Can you add an example of such replacing?


- For an unlabeled snapshot:
- `version` should be constituted of a concatenation of the elements below in
this order:
- the version of the latest labeled snapshot (on the same branch, when
applicable), as defined above;
- If the project never released a labeled snapshot, `0.pre` should be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.pre is an arbitrary choice? Why not 0.0.0 or 0pre?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because 0.0.0 has a small chance of being captured in the wild. I did not remember to test 0pre, tho.

The idea is that unlabeled snapshots of a never-released program are the smallest possible; 0.pre looks like a reasonable choice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. Maybe someone else who is familiar how nix works internal can suggest what a good value would be because I don't feel comfortable/informed enough to make a good decision.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, pre is a -∞ component, and 0pre and 0.pre are equal w.r.t. builtins.compareVersions. 0.pre precedes 0.0.pre, and is preceded by 0.pre.pre, but the latter looks a bit ridiculuous.

used as default.
- the string `+unstable=YYYY-MM-DD`, where `YYYY-MM-DD` denotes the date
the mentioned unlabeled snapshot was distributed.

- Some atypical programs have special considerations:
- Linux kernel modules:
- Besides the rules established above for typical snapshots (whether labeled
or unlabeled), `version` shoud have appended `+linux=XX.XX.XX`, where
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is that done right now? Is there already a consensus for kernel modules that we need to change as little as possible?

Copy link
Member Author

@AndersonTorres AndersonTorres Nov 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The consensus, as far as I know, is to append +kernel=<kernel version employed to build this module>. I just switched it to linux, aiming to keep some generality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make any difference for us if we would keep kernel instead of linux? Maybe someone that often works on kernel modules could give some insight here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, no. Someone suggested it was to disambiguate with other possibilities like kfreebsd...

`XX.XX.XX` is the corresponding Linux kernel version used to build the
aforementioned module.

# Examples and Interactions
[examples-and-interactions]: #examples-and-interactions

Some useful examples:

- Bochs is a typical Sourceforge-hosted project; its labeled snapshots can be
fetched from tarballs obtained via URLs like
<https://sourceforge.net/projects/bochs/files/bochs/2.6.11/>

For this example, we have `pname = "bochs"; version = "2.6.11";`.

- MPV is a typical Github-hosted program; its labeled snapshots can be fetched
from tarballs obtained via URLs like
<https://github.com/mpv-player/mpv/releases/tag/v0.33.1>.

For this example, we get rid of the `"v"` prepended to the version tag: `pname
= "mpv"; version = "0.33.1";`.

- SDL2 is hosted on Github; its latest labeled version can be downloaded from
<https://github.com/libsdl-org/SDL/releases/tag/release-2.0.14>. Therefore we
have `pname = "SDL2"; version = "2.0.14";`.

_However_, this labeled version was released December 21, 2020, while the
latest change was done in May 28, 2021.

Therefore, for this particular unlabeled releases of SDL2, we have `pname =
"SDL2"; version = "2.0.14+unstable=2021-05-28";`.

- Cardboard is a typical Gitlab-hosted program. It has no labeled release yet,
therefore we use `0.pre` as default dummy stable version; further, the latest
commit was made on May 10, 2021.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please clarify to prevent any confusion that not the latest commit is relevant but the commit we are fetching.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Python example is the most descriptive one on this regard.


Therefore, we have `pname = "cardboard"; version =
"0.pre+unstable=2021-05-21";`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Therefore, we have `pname = "cardboard"; version =
"0.pre+unstable=2021-05-21";`.
Therefore, we have `pname = "cardboard"; version =
"0.pre+unstable=2021-05-21";`.

I really do not like 0.pre and then the additional unstable part. Any form of unstable already implies a pre-release, especially if it's in a completely unreleased program.

Copy link
Member Author

@AndersonTorres AndersonTorres Oct 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a semantical quirk. This default string for unreleased releases should satisfy two constraints:

  • Start with a digit (because parseDrvName splits it on the first - before a number);
  • Be the smallest possible, in the sense of compareVersions

A version="unstable-DATE"; doesn't work, because it will complain with parseDrvName (as already said in the RFC motivation).

The string 0.0.0 was found in the wild. Therefore, it makes that "special string" essentially blocked.

The next idea was to use 0.pre because pre has the special property of being the smallest thing possible.

The unstable name was the best we could think until now, and it eases the transition because all packages already use it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also #107 (comment) and the following discussion.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it impossible to change parseDrvName to at least fix some weird behavior? Only working around the issue seems counterproductive when you could also fix the roots of the problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, 'parseDrvName' requires a numerical version, or at least a thing that starts with a number.


- Python is a famous programming language and interpreter. Before the
deprecation of its 2.x series in 2020, Python had two release branches,
popularly known as 'Python 2' and 'Python 3'. Indeed this peculiar situation
reflected in many package managers, especially Nixpkgs, that employed
`python2` and `python3` as `pname`s for these particular programs.

As an exercise of imagination, suppose the scenarios described below:

Python 2.6 was released 2008-10-01; an unlabeled snapshot of Python 2 branch
released at 2008-12-04 would have `version="2.6+unstable=2008-12-04"`.

At the same time, Python 3.0 was released 2008-12-03; an unlabeled snapshot of
Python 3 branch released at 2008-12-04 would have
`version="3.0+unstable=2008-12-04"`.

AndersonTorres marked this conversation as resolved.
Show resolved Hide resolved
- The Linux drivers for Realtek rtl8192eu can be fetched from a Github page,
<https://github.com/Mange/rtl8192eu-linux-driver>. It has no labeled release;
the latest code is from May 12, 2021. Supposing e.g. it was built for Linux
kernel version 5.10.01, we therefore have `pname = "rtl8192eu"; version =
"0.pre+unstable=2021-05-12+linux=5.10.01";`.

# Drawbacks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repology compatibility and formats used by other distributions needs investigation here.

If we break compatibility with the version formats present on Repology, we create more noise for maintainers using this as a tool and may even impede some automation which uses repology as a data source.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talk about this with the Repology maintainer. He thinks it is a good format, however he will not implement it because it is not the prevalent format in Nixpkgs.

[drawbacks]: #drawbacks

The main drawback is the conversion of the already existent expressions which
does not follow the format proposed here. It can require a degree of manual
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that we would have another extremely big task to convert the entire tree to this. I think an automated way which is 95% correct from the current format should be preferred to make it very easy for us.

intervention and code review, especially for machine-generated expressions (such
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generators are probably sprintable and can be easily fixed after which we regenerate the files. Big question is what happens to all programs using unstable-XXXX-XX-XX.

as Lua or Node library sets). However, this task is easily sprintable and
amenable to automation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In some cases, the build system is passed the version, and might be confused by this format.
However, there are already precedents when Nixpkgs version and build system visible version differ.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it also cause issues with testVersion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, although for pre-releases I would expect any use of testVersion to need overrides anyway… Maybe I am too pessimistic.

Copy link
Member

@davidak davidak Sep 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing leading "R" causes the test to fail.

[davidak@gaming:~/code/nixpkgs]$ nix build .#vapoursynth.tests.version --no-link --print-build-logs
warning: Git tree '/home/davidak/code/nixpkgs' is dirty
vapoursynth> Version string '59' not found!
vapoursynth> The output was:
vapoursynth> VapourSynth Video Processing Library
vapoursynth> Copyright (c) 2012-2022 Fredrik Mellbin
vapoursynth> Core R59
vapoursynth> API R4.0
vapoursynth> API R3.6
vapoursynth> Options: -
error: builder for '/nix/store/x8dngsnq6v6him83z8f574pprc6kwnbp-vapoursynth-59-test-version.drv' failed with exit code 1

Workaround

Created wanted format in the test.

    tests.version = testers.testVersion {
      package = vapoursynth;
      version = "R${version}";
    };

Adding "+date=2021-05-25" to version causes the test to fail.

[davidak@gaming:~/code/nixpkgs]$ nix build .#vapoursynth.tests.version --no-link --print-build-logs
warning: Git tree '/home/davidak/code/nixpkgs' is dirty
vapoursynth> Version string '59+date=2021-05-25' not found!
vapoursynth> The output was:
vapoursynth> VapourSynth Video Processing Library
vapoursynth> Copyright (c) 2012-2022 Fredrik Mellbin
vapoursynth> Core R59
vapoursynth> API R4.0
vapoursynth> API R3.6
vapoursynth> Options: -
error: builder for '/nix/store/m7ig1gqkd1wyn008d9b61xmhf2g17vz5-vapoursynth-59+date=2021-05-25-test-version.drv' failed with exit code 1;

Workaround

Overwrite the version in the test that the build should have.

    tests.version = testers.testVersion {
      package = vapoursynth;
      version = "R59";
    };

@AndersonTorres and everyone: Should we keep the leading "R" in this case where it is part of the official upstream release version as displayed in --version or document and use these workarounds?

Even the format with leading v is often part of the official version.

https://github.com/NixOS/nixpkgs/blob/350fd0044447ae8712392c6b212a18bdf2433e71/pkgs/tools/misc/adrgen/default.nix#L21-L25

So should we respect upstream choice or change the version format, because we don't like it?

# Alternatives
[alternatives]: #alternatives

The alternative is doing nothing. The impact of it is keeping the Nixpkgs
codebase confusing, less discoverable and incompatible with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add inconsistent to the list.

`builtins.parseDrvName`.

# Unresolved questions
[unresolved]: #unresolved-questions

- Allow some degree of freedom and extensibility for handling deviations, such
different, non-standard naming schemes or unusual releasing schedules
eventually employed by many teams.

- Interactions between `pname` and `version`, like multi-branch releases.

# Future work
[future]: #future-work

- Update expressions that do not follow this proposal.
- Update manuals and related documentation in order to reflect this proposal for
future expressions.