-
-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
doc/stdenv/cross-compilation.chapter.md: explain tuples #180030
Conversation
This commit adjusts the documentation for the `config` field of `{host,build,target}Platform` in the following ways: 1. It clarifieds that stdenv uses multiarch tuples, which are the same thing as autoconf tuples except for two cases involving 32-bit architectures where autoconf tuples are ambiguous. 2. It mentions and gives examples of the 5-field form for tuples, which include both a `libc` and an `abi`. 3. It documents that this field should be canonicalized, and explains where to find the canonicalization rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some notes, I wouldn't document it in this way, by referring to Debian's multiarch triples, but the intention is good. If you're interested, I have a note somewhere on my laptop from when I researched this topic, maybe that's helpful for your purposes.
@@ -44,7 +44,14 @@ The exact schema these fields follow is a bit ill-defined due to a long and conv | |||
|
|||
`config` | |||
|
|||
: This is a 3- or 4- component shorthand for the platform. Examples of this would be `x86_64-unknown-linux-gnu` and `aarch64-apple-darwin14`. This is a standard format called the "LLVM target triple", as they are pioneered by LLVM. In the 4-part form, this corresponds to `[cpu]-[vendor]-[os]-[abi]`. This format is strictly more informative than the "Nix host double", as the previous format could analogously be termed. This needs a better name than `config`! | |||
: This is a 3-, 4-, or 5- component shorthand for the platform. Examples of this would be `x86_64-unknown-linux-gnux32`, `aarch64-apple-darwin14`, and `mips64el-unknown-linux-muslabin32`. This is a standard format called the "[multiarch tuple](https://wiki.debian.org/Multiarch/Tuples)", as [pioneered by autoconf](https://www.gnu.org/software/autoconf/manual/autoconf-2.65/html_node/System-Type.html#System-Type), [disambiguated to create multiarch](https://wiki.debian.org/Multiarch/Tuples#Used_solution), and adopted by LLVM. In the 5-part form, this corresponds to `[cpu]-[vendor]-[os]-[libc][abi]`; note that there is no hyphen separating the `[libc]` field from the `[abi]` field. This format is strictly more informative than the "Nix host double", as the previous format could analogously be termed. This needs a better name than `config`! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd honestly leave Multiarch out of this. It is yet another system that has not necessary anything to do with us, since it is what Debian does. autoconf OTOH is used for figuring out builtins.currentSystem
, so it makes sense why it is a point of reference for us.
Another problem about multiarch is that we actually don't do multiarch (in the sense of the directory layout) and getting a multiarch (i.e. multilib) gcc to work is a struggle (if it's even possible?), so using the term may lead to confusion.
Note also that autoconf
itself calls its (1), 2, 3 or 4 component platform strings target triplets, canonical system type or canonical name.
Considering the string to be up to 5 components is a bad idea, especially for documentation purposes. Both nixpkgs and autoconf only allow up to 4 dash-separated components. While autoconf surely globs on the 4th components (in fact it probably does on all of them), considering it as (sometimes) multifielded is a stretch. Nixpkgs treats it as an opaque string and this is, in my opinion, the correct interpretation, even though some of those strings relate to each other via some structure. The used libc is derived from the ABI in nixpkgs and never directly derived from the platform string.
Nix host double is a bad term, it'd be better to call them Nix systems or Nix system doubles for consistency. It's also not the “previous format”, but very much used today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're interested, I have a note somewhere on my laptop from when I researched this topic, maybe that's helpful for your purposes.
Yes, I am interested in that.
I'd honestly leave Multiarch out of this.
Okay.
Note also that
autoconf
itself calls its (1), 2, 3 or 4 component platform strings target triplets
I think is slightly bonkers to call a four-component thing a "triple".
Both nixpkgs and autoconf only allow up to 4 dash-separated components
Unfortunately the last two components are not separated by a dash.
The used libc is derived from the ABI in nixpkgs
This is not true! Both mips64el-linux-gnuabin32
and mips64el-linux-muslabin32
use the same ABI: n32
yet they use different libcs. If the libc were derived from the ABI then this would be impossible!
This is exactly the sort of confusion that comes from people pretending that the last two components are really one big component.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nix host double is a bad term, it'd be better to call them Nix systems or Nix system doubles for consistency. It's also not the “previous format”, but very much used today.
FWIW I didn't write that sentence, it was preexisting text. I have however updated it with your recommendation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not true! Both
mips64el-linux-gnuabin32
andmips64el-linux-muslabin32
use the same ABI:n32
yet they use different libcs. If the libc were derived from the ABI then this would be impossible!This is exactly the sort of confusion that comes from people pretending that the last two components are really one big component.
As you can see, nixpkgs does not consider these ABIs the same:
nix-repl> :p (lib.systems.elaborate { config = "mips64el-linux-gnuabin32"; }).parsed.a
{ _type = "abi"; abi = "n32"; name = "gnuabin32"; }
nix-repl> :p (lib.systems.elaborate { config = "mips64el-linux-muslabin32"; }).parsed.
{ _type = "abi"; abi = "n32"; name = "muslabin32"; }
The libc
is derived in after parsing the triple itself by looking at the ABI, but you can of course also theoretically override this (lib.systems.elaborate { config = "mips64el-linux-muslabin32"; libc = "glibc"; }
(where the ABI would stay musl and even isMusl == true
)). This is why I think it makes more sense to treat the ABI component as some opaque string which we can sometimes extract more information out of (n32
ABI, musl
“ABI”).
Overall I guess it is a bit of a problem that we conflate ABI and libc in nixpkgs in a weird way which is also not done by autoconf (the third component for autoconf can either be os
or kernel-system
which is much vaguer (e.g. linux-musl
)).
I think is slightly bonkers to call a four-component thing a "triple".
Yes, but I'm not sure about a good alternative that is also in use; LLVM's triples are also not always three component strings.
Yes, I am interested in that.
https://gist.github.com/sternenseemann/a00d91b8e58cca3e18792771483b4c25
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you can see, nixpkgs does not consider these ABIs the same:
That is an egregious nixpkgs bug which really should be fixed.
n32
is an ABI and so is x32
; gnuabin32
, muslabin32
, gnuabix32
, and muslabix32
are not ABIs.
musl
“ABI”).
Musl isn't an ABI... they maintain a list of ABIs they support and links to what they use as the definition for each ABI.
ABIs exist even when there is no libc around:
- A statically-linked
no_std
rust binary will have no libc involved at all (neither "musl" nor "gnu" nor anything else), yet it still has to pick an ABI if it wants to make system calls into the kernel. - When enabling seccomp (for example for the nix sandbox) a process needs to declare what ABIs it will use, in order to enable system call filtering. This is totally independent of libc choice.
file
can easily detect the ABI of a statically-linked ELF binary (it's part of the magic bytes), but not which libc (if any) it includes
(the third component for autoconf can either be
os
orkernel-system
which is much vaguer (e.g.linux-musl
)).
As far as I can tell autoconf considers any part matching the regex e?abi[^-]*$
to be a comment field. As I mentioned elsewhere, arm-unknown-linux-eabicrazypants
is a valid autoconf-name.
I think is slightly bonkers to call a four-component thing a "triple".
Yes, but I'm not sure about a good alternative that is also in use
Co-authored-by: sternenseemann <[email protected]>
… it is relevant (arm32)
…he format used by the nix interpreter"
Got this closed by accident? I think we should still do this, I just don't have a lot of capacity to think about this atm. |
Sure; I'm still interested in this I just no longer thing it will get resolved with a single PR.
Totally understand. When you do have a bit of time I would really appreciate it if you could take a look at #180964. It looks big and complicated but it's pretty easy to review. There are simple instructions for verifying that it does not change any behavior, so reviewing basically boils down to deciding it if it is a net improvement in readability (which I hope is obvious, but maybe not). If #180964 merges then there are a bunch of one-liner corrections (which do change behavior) to that logic that I would like to submit. Those can be submitted/reviewed independently and concurrently. The control-flow restructuring in #180964 makes the need for those corrections (and their effects) much more obvious. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides my one comment looks goods. Thanks for opening this, and thank you @sternenseemann for taking care of the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly recommend using the correct data types for the things you want to describe. Use callouts, use lists if you have multiple (similar) items. This is much easier to read.
Avoid long blocks of text. The manuals are already very dense, which makes them to read. Reference manuals are for looking things up, and information should be arranged accordingly to make that as easy as possible.
Always link to definitions of terms, otherwise readers will have to either guess or do their own research to find out what you mean.
Here are some general guidelines for writing documentation in the Nix ecosystem.
|
||
This field should be *canonicalized*. The rules for canonicalizing a tuple are kept in the `config.sub` file in the source code for `gnu-config`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what this means. Is "should be canonicalized" a recommendation to the user? What is gnu-config
and where is this config.sub
file? I suppose you mean this file?
Unless you want to clarify this, I'd just leave it out and fix later if the need arises.
This field should be *canonicalized*. The rules for canonicalizing a tuple are kept in the `config.sub` file in the source code for `gnu-config`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you moved that line to the end, which seems sensible. I still don't know what it means, so once you address the original comment, for me this is ready to merge.
Co-authored-by: Valentin Gagarin <[email protected]>
Description of changes
This commit adjusts the documentation for the
config
field of{host,build,target}Platform
in the following ways:It clarifieds that stdenv uses multiarch tuples, which are the same thing as autoconf tuples except for two cases involving 32-bit architectures where autoconf tuples are ambiguous.
It mentions and gives examples of the 5-field form for tuples, which include both a
libc
and anabi
.It documents that this field should be canonicalized, and explains where to find the canonicalization rules.
Things done
sandbox = true
set innix.conf
? (See Nix manual)nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)nixos/doc/manual/md-to-db.sh
to update generated release notes