-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
arm: release worker centos7-arm64 uses GCC4.8 #1542
Comments
As a result ATM we are not building nightlies for ARM7 P.P.S. I would fix it, but I don't have access to release machines. |
/CC @mhdawson @joaocgreis @rvagg @jbergstroem P.S. it seems like the Ansible playbook is all set up, just need to run it on |
I can run it, would just like @rvagg to confirm that he is comfortable that we just need to run the playbook. |
In test I see that we have a number of different types of machines: centos7-arm64-gcc48 Not sure which versions of the compiler are used for From the building docs minimum levels are: GCC 4.9.4 for 8.X and higher My key concern is that I think we may still need to build 6.x on 4.8.5 on the release machines and I don't see any compiler selection logic for ARM and we only have a single machine in the release CI for arm64 |
AFAIK these are "virtual" tags that are used to select different version of devtoolset |
@gdams is looking at the ansible scripts to get on top of what compiler level is installed for centos7-arm64 based on the updated ansible scripts. |
@refack good to know, but next question is if we have the equivalent for 4.9.4. I can see test-packetnet-centos7-arm64-1 has those 2 tags but don't see any 4.9.4 equivalent. |
so we are running this to add gcc 4.9.4 on rhel7.x, Maybe we need the same logic on Centos7.x? |
I think that might have been for IBM platforms since we did not believe there as a version of the redhat developer toolset available. It seems like there is a version on arm based on the tags and this in the ARM64 job in test if [[ "$nodes" =~ centos[67]-(arm)?64-gcc6 ]]; then
exec_cmd=". /opt/rh/devtoolset-6/enable; $exec_cmd"
fi |
so the following are installed on centos7: centos7: [
'ccache,gcc-c++,devtoolset-6,sudo',
], |
@gdams does the ansible script for centOS ARM 64 ensure we have both devtoolset-6 as well as the 4.8.4 compiler? |
AFAIK devtoolset is backward and forward compatible (#809 (comment)), so I think that if we have gcc4.8 for node < 11 (status quo) and new gcc6 for >= we solved the issue. |
I do think that the dev toolset is more compatible so what @refack says could be correct. |
This is what is in the selector script
Current tag in release is centos7-arm64 which does not cause selection on either of those. |
Not quite sure why 10 was chosen as the cutover above since the change is in fact between 6.x and 8.x |
Comments in this issue #1203 indicate Rod was concerned that installing the new dev toolset might affect existing builds. |
It also adds some insight on why the switch might have been at 9 versus 8 (problems we'd seen on 8.x). |
So what I'd like as next steps are:
In terms of risk/reward I think it's better not to have nightlies on ARM64 for a little while versus potentially breaking our 6.X and 8.X LTS releases. Based on that I'd like to take the time so that we can answer the questions above and get input from @rvagg. |
My preference is adding a new machine. I'll look to see if it is documented how I could add a new machine. |
@nodejs/build if people are ok with giving @gdams infra level access to at least the packet hosting infrastructure, we could probably get the new machines spun up more quickly. |
In addition to steps from above, we probably also need to validate that release binaries built on devtoolset-6 run ok on a centos7 machine with a 4.9.4 compiler. |
I think this needs to be discussed in a wider context (no offence intended George). |
@refack I'm ok with discussing in wider context, just thought I'd mention since it might take me a bit of time to get a new machine going. |
I logged into the packet infra and when I go to add a new machine, none of the options are ARM. Not sure we'd have the ok to add another one anyway as they are pretty big machines with 96 cores and 32G or RAM. So I think its back to figuring if we are comfortable having both installed on the same machine. Would like input from @rvagg on that front. |
Another option is using docker on the existing machine so we can add a new machine sharing the resources without the potential for messing up the release machine. We'd probably have to prove that out in test first through. |
@rvagg want to make sure this is still on your radar. |
Yeah, sorry, this is a big one that takes some brain space so I've not got to it quicker. So here's what I'm thinking our core problem is: ci-release doesn't have the gcc48/gcc6 label split, it's only got the one centos7-arm64 machine with a single label This is in ci for node-test-commit-arm: if [[ "$nodes" =~ centos[67]-(arm)?64-gcc6 ]]; then
exec_cmd=". /opt/rh/devtoolset-6/enable; $exec_cmd"
fi Our VersionSelectorScript.groovy has: [ /centos[67]-(arm)?(64|32)-gcc48/, anyType, gte(10) ],
[ /centos[67]-(arm)?(64|32)-gcc6/, anyType, lt(10) ], So on ci, it's selecting gcc6 for >=10 and that's working fine so we have all green. On ci-release, the version selector has no impact on arm64 because the label So, given that, here's my proposed solution:
Then, optionally, on the next Node 10 and 11 release announcements, we could state that we've switched build environment for ARM64 so upgrades should proceed with caution. However, as we've already discovered, Red Hat are doing funky stuff with devtoolset such that it doesn't appear to have an impact on the ability of the binaries to run on library level of the system they are built on. So systems with libc (and libstdc++, I think) versions of at least as new as Cent OS 7 should be fine. So we could just cross our fingers and hope nobody is impacted. The usage level of ARM64 is pretty low (one of my next jobs is to get the download numbers working again, so I don't have current numbers). |
Sound great. It works on the public CI 💯 |
@rvagg that all sounds reasonable, but the key question was if we felt comfortable in running the ansible script to add devtoolset-6 onto the release machine as it does not current exist there. |
If the answer is yes then we just need to run the ansible script and update labels, invoker in the release infra. |
Aside: we have explicit guarantee form glibc fork, that |
done, last nightly rebuilt all green https://nodejs.org/download/nightly/v12.0.0-nightly201811153212f77ac6/ running test builds for 10, 8 and 6 now just to make sure it's doing the right thing. |
All good, but we're now getting an error on v8-canary builds running gcc 6 on centos 7 arm64. This appears to be the same error generated by ppcle-ubuntu1404-release-64 on v8-canary builds too. The plain x64 centos 6 gcc 6 is fine with it, though.
@nodejs/v8 is this a known issue with V8 7.2? Should we be concerned and/or preparing build resources to make it work properly? It's the same that's occurring on CI for v8-canary @ https://ci.nodejs.org/job/node-test-commit-arm/20000/nodes=centos7-arm64-gcc6/ On our x64 machine: gcc (GCC) 6.3.1 20170216 (Red Hat 6.3.1-3) Not reopening because original issue is solved, without hearing back I'm just going to assume the V8 folks are on to this. |
Some new code was added in that exposed an stdlibc++4.8 bug that was fixed in 4.9
https://ci-release.nodejs.org/job/iojs+release/3855/nodes=centos7-arm64/
(Node minimal compiler has been gcc4.9 since node9)
/CC @nodejs/release @nodejs/build-infra
The text was updated successfully, but these errors were encountered: