-
-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC 0131] Enforce the usage of SRI hashes in Nixpkgs #131
Conversation
# Motivation | ||
[motivation]: #motivation | ||
|
||
Currently in Nixpkgs, the hash formats that are used in parameters to fetchers range from SRI and base-32 (the most common ones) to hex. In addition to this, packages use the algorithm-specific arguments (`sha256`, `sha512`, etc.) or the generic `hash` argument at random. This creates inconsistencies that we'd rather not have. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is SRI in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subresource Integrity, whose hashes look like sha256-2mE26ES+fYSWdfMr8uTsX2VVGTNMDQ9MXEk5E/L95UI=
. Maybe I should add at least a definition of SRI in the RFC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates inconsistencies that we'd rather not have.
Why? This is not a motivation, just a claim.
Using the different hash encodings genuinely has value, for example I frequently just copy over the base16 hash from a SHA256SUMS
file provided by upstream, requiring less button presses, round trips and most importantly allows others to quickly and easily match the hashes textually when reviewing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a link to what SRI is in the RFC would be good to have. I also didn't know what SRI was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the different hash encodings genuinely has value...
I agree with this point, but at the same time most people use either the SRI hashes given by Nix's hash mismatch errors (which I now realize is only in 2.4), or convert those to base-32 (the default in 2.3 and below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vojta001 That's why these interfaces tend to have an assert
or two behind the scenes…
nix hash to-sri
Well that's actually a very good point: There is no stable interface for converting hashes into the SRI format, because nix-hash(1)
only supports SRI as an input, not an output format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the different hash encodings genuinely has value, for example I frequently just copy over the base16 hash from a
SHA256SUMS
file provided by upstream, requiring less button presses, round trips and most importantly allows others to quickly and easily match the hashes textually when reviewing.
no need to convert the hash to base64
SRI hashes only require a prefix:
{
sha256 = "1cc9110705165dc09dd09974dd7c0b6709c9351d6b6b1cef5a711055f891dd0f";
sha256 = "sha256-1cc9110705165dc09dd09974dd7c0b6709c9351d6b6b1cef5a711055f891dd0f";
hash = "sha256-1cc9110705165dc09dd09974dd7c0b6709c9351d6b6b1cef5a711055f891dd0f";
}
base16 hashes are found with
grep -r -E 'sha256 = "[0-9a-f]{64}";'
new and better hash technologies appear everywhere.
sha256 is de-facto standard in nixpkgs,
because its good enough for our use case = global IDs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sha256 is de-facto standard in nixpkgs,
MD5 was de facto standard no more than a decade ago.
Also, this was not about "good enough algorithms", but about a better format for the expression - namely, hash = "hashAlgorithm-..."
instead of hashAlgorithm = "..."
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates inconsistencies that we'd rather not have.
@winterqt could you add something more specific here, considering this is the whole motivation for the RFC?
The discussion above has brought up some more points:
- semantic vs syntactical mutual exclusion
- unique attribute name rather than multiple, algo-specific ones
- ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@milahu: conversion would be needed because SRI is currently only accepting base64. However, it would be relatively easy to change: NixOS/nixpkgs#199660
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
None at this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's (anecdotically) still a significant number of Nix 2.3 users who contributing to nixpkgs would be made unnecessarily harder for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh no, another ten years of deprecation window...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You joke, but I seriously think this would be a sensible thing to do. (Also, things can be deprecated without the intent to eventually removing them, see Java which did not remove deprecated APIs until Java 9?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall we have developed a strange culture of removing stuff unnecessarily in nixpkgs, I feel like.
In this case, we shouldn't remove support for these formats from Nix anyways (to continue to be able to evaluate older nixpkgs revisions (the ability to reproduce ancient stuff truely is a strength of Nix)) and for existing APIs in nixpkgs it's laughably simple to keep supporting it (you don't need to change anything, in fact!). For all I care, stop supporting the sha256
and sha512
attributes for new APIs…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The text does not talk about removal, I suppose, but about enforcement. Older expressions can be rendered by newer versions, whereas new expressions should be written in newer versions of Nix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So how does something like this sound: we strongly encourage using SRI hashes and hash
in new packages, if there is no justifiable reason to not use them. Justifiable reasons include, but are not limited to, taking hashes directly from upstream, and using automated tooling that does not use SRI hashes.
I'm not sure about that last reason, but I also want to provide more than one example of a reason.
I'm also not sure about how to handle converting old packages (with the same criteria of having no justifiable reasoning), if at all. Thoughts welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Old packages can be converted by anyone interested ion converting them. I have started a preliminar work a long time ago :P
Further I believe this should be decided on a case by case basis. Maybe allowing old Nix tools is acceptable, whereas upstream hashes is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warns the user when the code uses dangerous functions
Not sure how apt this comparison is, as there's nothing fundamentally different about a sha256 hash in SRI format or one in base16 format.
I'm open to changing it to "discourage", but... I don't know, if non-SRI hashes keep slipping in anyways because it's not required, what's the point of this? I guess that's why I chose that strong of a word.
Stale hashes is not something caused by a particular hash format. It happens because expressions bitrot and we'll have stale hashes in SRI format before long. Doing a mass conversion of hashes may turn up stale hashes, but doesn't have to since you would likely just convert them without refetching, since that is much quicker.
If you are concerned about stale hashes, I'd suggest trying to refetch fixed output derivations in nixpkgs with cache disabled. I believe this has been done quite a few times over the years and it should be easy to turn up useful tools for this.
strongly encourage
Newer Nix versions already default to this, so there is already a lot of encouragement…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"A lot of encouragement" that ends with "my version is too old" or "this is mere personal preferences, don't force them on me"? Wow cool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version is probably too old for the format to be convenient to produce but can evaluate the expressions with either hash format. For «mere personal preferences» argument new Nix making use of SRI format easier than non-SRI is an encouragement, as submitters will typically use whatever is the easiest and changing the format is pointless churn so the initial hash format will stay.
# Motivation | ||
[motivation]: #motivation | ||
|
||
Currently in Nixpkgs, the hash formats that are used in parameters to fetchers range from SRI and base-32 (the most common ones) to hex. In addition to this, packages use the algorithm-specific arguments (`sha256`, `sha512`, etc.) or the generic `hash` argument at random. This creates inconsistencies that we'd rather not have. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates inconsistencies that we'd rather not have.
Why? This is not a motivation, just a claim.
Using the different hash encodings genuinely has value, for example I frequently just copy over the base16 hash from a SHA256SUMS
file provided by upstream, requiring less button presses, round trips and most importantly allows others to quickly and easily match the hashes textually when reviewing.
- Do we want to eventually deprecate the use of `sha256`, `sha512`, etc. in fetchers that support SRI hashes, leading to their eventual removal? | ||
- This would require updating every script that uses these arguments, and would cause breakage for any usage of them out-of-tree; we'd need to weigh the benefits of this | ||
- Is `hash` the correct name to use for the argument? | ||
- Nix builtins use `narHash` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't care about the name, but built-in fetchers and nixpkgs fetchers should be consistent.
|
||
- How can we do the treewide migration efficently? | ||
- Text replacement for the majority of cases, with an AST-based solution for others? | ||
- Do we want to eventually deprecate the use of `sha256`, `sha512`, etc. in fetchers that support SRI hashes, leading to their eventual removal? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can deprecate them without removing them.
Hello everyone, this RFC is now open for nominations. If you think of someone (including yourself) who you think would make a good shepherd on this RFC, please nominate them now |
Recommend looking through Nixpkgs for uses in the wild to find additional shepherds. Edit: for examples, people using or recommending hash vs sha256 already |
What do you mean by this (what "uses"), who would we be looking for? |
Nominating myself |
I nominate @sternenseemann |
We still need at least two more shepherds. @sternenseemann Are you willing to help shepherd this RFC? |
I won't have time until December sadly. |
- Is `hash` the correct name to use for the argument? | ||
- Nix builtins use `narHash` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see much value in renaming hash. A lot of work which we can safe ourselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See NixOS/nixpkgs#79987 (comment) for context, I added it here just to include it since the question was raised (should probably cite that discussion, though).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are not talking about narHashes here, so this should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NAR hashes are necessary for any cases where the result is more than a single file, e.g. fetchFromGitHub
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, as we have flat hashes for normal files and NAR hashes over arbitrary trees (or normal files), I'd say it makes sense to call them differently. At least if I don't consider any backwards compatibility questions for now. For example, keep hash
for normal ones and rename NAR hashes to narHash
. It might even avoid some confusion, and functions could allow specifying either (though of course the value differs, as something different is hashed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something that could be done under the hood in the fetchers? It seems confusing for new contributors to have to think about whether to use a flat hash or a NAR hash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We already do this with outputHashMode
.
@sternenseemann December would be fine IMHO, would you be willing to join the shepherd team then? Alternatively, I nominate @vcunat -- would you be willing to shepherd? |
# Alternatives | ||
[alternatives]: #alternatives | ||
|
||
- Do nothing: further inconsistencies will continue to be introduced |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are a number of alternatives.
Different rules can be considered. For each of the following formats it could be decided to allow/disallow/preferred:
hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
sha256 = "0000000000000000000000000000000000000000000000000000000000000000";
sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
sha256 = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
I've gathered there are a number of ways these rules can be 'enforced':
- Throw evaluation errors upon not conforming to rule.
- Have a it be a potential future linter rule.
- Have PR reviewers nudge contributors to follow the rule
- Do nothing: keep things like the are right now. Reviewers will suggest their own opinion in PRs if they'd like.
- Allow any format: do not allow PR reviewers to suggest changing the format in PRs to their preferred format. If the format is allowed, just let it be. This avoids PRs being always potentially invalid due to reviewers having different opinions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allow anything: Do not allow PR reviewers to suggest changing the format in PRs. If the format is allowed, just let it be. Avoid conflicting reviewer opinions.
I'm mentioning this option, as at the moment PRs are being reviewed where the reviewers suggest their own 'opinion' on the matter; even though there is no real rule. One outcome of this RFC could be to just let it be.
It appears that we need one more shepherd. Any takers? |
# Summary | ||
[summary]: #summary | ||
|
||
Use SRI hashes and the `hash` parameter for fetchers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"SRI hash" needs a reference link with a definition and globally this RFC needs some more context to be explained.
FWIW, since I became aware that this kind of hash was preferred, I have tried to use them more systematically, but I'm annoyed by nix-prefetch-url
not generating this type of hash (at least all the way to Nix 2.12) and not knowing how to convert a hash generated by nix-prefetch-url
into an SRI hash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, for conversion I'm using
nix hash to-sri --type sha256 theHash
EDIT: or you could use
nix hash file path/to/file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! But another thing that is becoming annoying to me lately is to be forced to activate the nix-command
experimental feature for more and more basic things that I cannot avoid.
Another solution that I've found meanwhile was openssl dgst -sha256 -binary /path/to/file | openssl base64 -A
, but it's not as convenient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But another thing that is becoming annoying to me lately is to be forced to activate the
nix-command
experimental feature for more and more basic things that I cannot avoid.
I would rather view it as a missing feature of nix-hash
that needs to be implemented to get this RFC merged. In other words, we should have
nix-hash --sri --type sha256 path/to/file
and
nix-hash --to-sri --type sha256 theHash
Update: I just created NixOS/nix#7690. Please take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! And thanks to you, I also learned that, in the meantime, I can use nix --extra-experimental-features nix-command file to-sri --type sha256 ...
. I didn't realize that I don't need to activate the experimental feature globally, and this is already better than nothing.
From RFC Steering Committee: please nominate yourselves or someone you think would do a good job as a shepherd. |
I think this should be closed due to a lack of interest according to RFC 130 – Search for shepherds started in August and the two month period has long passed. |
I'm somewhat inclined to agree, though I suppose it wouldn't hurt to have a phase where we create a new RFC based on these ideas and reopen it later. If anyone's interested in doing this, please reach out. I'd make a Matrix room, but that's proving harder than I'd expect... 😅
I don't think @vcunat responded to his nomination, though I don't think that would make that much of a difference. |
I didn't feel like accepting, because I couldn't see how to progress this – except for one minor improvement for which I filed a nixpkgs PR but got (almost?) no reaction. (and I still see it the same) |
The followings sound more actionable:
|
@ShamrockLee should there be the part where Nix never outputs a «new» non-SRI hash (new as in not the exact same meaningful characters as some non-SRI hash that was passed in)? |
@7c6f434c Sorry, but which hashes Nix would output are you referring to? |
> the part where Nix never outputs a «new» non-SRI hash (new as in not the exact same meaningful characters as some non-SRI hash that was passed in)
Sorry, but which hashes Nix would output are you referring to?
Some package updates in reality are done by bumping the version and breaking the hash, then looking at the Nix error message and copying the «observed» hash.
|
I see. IIUC, the hash is produced according to the Updated. |
I see. IIUC, the hash is produced according to the `hashAlgo` of the source derivation set by the fetcher, and the fixes should go to each fetcher. Is that the case?
Prefetchers might differ, but on the Nix level most (all?) hashAlgo's have matching SRI encodings thus the hash-mismatch messages are to be fixed once in Nix.
|
The RFC steering committee has decided to close this RFC due to a lack of shepherd nominations. However, it can be reopened if people are interested in shepherding. |
Rendered