x/build: add LUCI linux-s390x builder #67307

Open
dmitshur opened this issue May 10, 2024 · 8 comments
Assignees
Labels
arch-s390x Issues solely affecting the s390x architecture. Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. new-builder OS-Linux
Milestone

Comments

@dmitshur
Contributor

There currently isn't a LUCI builder that tests the linux/s390x port (other than the misc-compile builder, which tests only that the port compiles). This is the tracking issue for it.

The next steps that a builder owner will need to follow to make progress here are documented at https://go.dev/wiki/DashboardBuilders#luci-builders.

@dmitshur dmitshur added OS-Linux Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. new-builder arch-s390x Issues solely affecting the s390x architecture. labels May 10, 2024
@dmitshur dmitshur added this to the Unreleased milestone May 10, 2024
@srinivas-pokala srinivas-pokala self-assigned this Aug 15, 2024
@dmitshur dmitshur moved this to In Progress in Go Release Aug 16, 2024
@gopherbot
Contributor

Change https://go.dev/cl/617359 mentions this issue: crypto/internal/fips/sha3: reduce s390x divergence

@srinivas-pokala
Contributor

Hostname for the builder: linux-s390x-ibm
The CSR is attached:
linux-s390x-ibm.csr.txt

@dmitshur
Contributor Author

dmitshur commented Oct 4, 2024

Thanks. CC @mknyszek.

@mknyszek mknyszek assigned mknyszek and unassigned srinivas-pokala Oct 4, 2024
@mknyszek
Contributor

mknyszek commented Oct 4, 2024

Here's the certificate:
linux-s390x-ibm-1728069494.cert.txt

@mknyszek mknyszek assigned srinivas-pokala and unassigned mknyszek Oct 4, 2024
cpu pushed a commit to cpu/go that referenced this issue Oct 16, 2024
It's a little annoying, but we can fit the IBM instructions on top of
the regular state, avoiding more intrusive interventions.

Going forward we should not accept assembly that replaces the whole
implementation, because it doubles the work to do any refactoring like
the one in this chain.

Also, it took me a while to find the specification of these
instructions, which should have been linked from the source for the next
person who'd have to touch this.

Finally, it's really painful to test this without a LUCI TryBot, per golang#67307.

For golang#69536

Change-Id: I90632a90f06b2aa2e863967de972b12dbaa5b2ae
gopherbot pushed a commit that referenced this issue Oct 28, 2024
It's a little annoying, but we can fit the IBM instructions on top of
the regular state, avoiding more intrusive interventions.

Going forward we should not accept assembly that replaces the whole
implementation, because it doubles the work to do any refactoring like
the one in this chain.

Also, it took me a while to find the specification of these
instructions, which should have been linked from the source for the next
person who'd have to touch this.

Finally, it's really painful to test this without a LUCI TryBot, per #67307.

For #69536

Change-Id: I90632a90f06b2aa2e863967de972b12dbaa5b2ae
Reviewed-on: https://go-review.googlesource.com/c/go/+/617359
LUCI-TryBot-Result: Go LUCI <[email protected]>
Auto-Submit: Filippo Valsorda <[email protected]>
Reviewed-by: Carlos Amedee <[email protected]>
Reviewed-by: Daniel McCarney <[email protected]>
Reviewed-by: Roland Shoemaker <[email protected]>
@gopherbot
Contributor

Change https://go.dev/cl/636055 mentions this issue: crypto/internal/cryptotest: skip TestAllocations on s390x

gopherbot pushed a commit that referenced this issue Dec 13, 2024
TestXAESAllocations fails like #70448, and crypto/rand's fails in FIPS
mode. We can't keep chasing these without even a LUCI builder.

Updates #67307

Change-Id: I5d0edddf470180a321dec55cabfb018db62eb940
Reviewed-on: https://go-review.googlesource.com/c/go/+/636055
Auto-Submit: Filippo Valsorda <[email protected]>
Reviewed-by: Roland Shoemaker <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Carlos Amedee <[email protected]>
@srinivas-pokala
Contributor

@mknyszek I am facing an issue while following the steps mentioned above for the LUCI builder setup. After step 03, I did the following.

Note: The default builder machines run under the user linux1.

Step 04: Created a cron job for luci_machine_tokend using a new user (a2).

[a2@go-s390x01 ~]$ crontab -l
*/10 * * * * luci_machine_tokend -backend luci-token-server.appspot.com -cert-pem /home/a2/linux-s390x-ibm-1728069494.cert.txt -pkey-pem /home/a2/linux-s390x-ibm.key -token-file=/var/lib/luci_machine_tokend/token.json

Step 05: Created a systemd service to run "bootstrapswarm", as below, using the same user (a2).

[Unit]
Description=Bootstrapswarm Service
After=network.target

[Service]
User=a2
Group=a2
ExecStart=/home/a2/go/bin/bootstrapswarm -hostname linux-s390x-ibm
Restart=always
RestartSec=5
Environment="PATH=/usr/local/go/bin:/usr/bin:/bin:/home/a2/go/bin"

[Install]
WantedBy=multi-user.target

After this, when I check the bot's start-up log, I see the following output:

[a2@go-s390x01 ~]$ sudo journalctl -u bootstrapswarm -f
[sudo] password for a2: 
-- Logs begin at Mon 2025-01-20 04:52:31 EST. --
Jan 20 09:22:32 go-s390x01 systemd[1]: Stopped Bootstrapswarm Service.
Jan 20 09:22:32 go-s390x01 systemd[1]: Started Bootstrapswarm Service.
Jan 20 09:22:32 go-s390x01 systemd[1]: bootstrapswarm.service: Main process exited, code=exited, status=203/EXEC
Jan 20 09:22:32 go-s390x01 systemd[1]: bootstrapswarm.service: Failed with result 'exit-code'.
Jan 20 09:22:38 go-s390x01 systemd[1]: bootstrapswarm.service: Service RestartSec=5s expired, scheduling restart.
Jan 20 09:22:38 go-s390x01 systemd[1]: bootstrapswarm.service: Scheduled restart job, restart counter is at 3102.
Jan 20 09:22:38 go-s390x01 systemd[1]: Stopped Bootstrapswarm Service.
Jan 20 09:22:38 go-s390x01 systemd[1]: Started Bootstrapswarm Service.
Jan 20 09:22:38 go-s390x01 systemd[1]: bootstrapswarm.service: Main process exited, code=exited, status=203/EXEC
Jan 20 09:22:38 go-s390x01 systemd[1]: bootstrapswarm.service: Failed with result 'exit-code'.

However, when I manually run "/home/a2/go/bin/bootstrapswarm -hostname linux-s390x-ibm", I get a status code 401 (authentication) error. I tried investigating the issue but could not make much progress.
Can you please help me troubleshoot this?

@dmitshur
Contributor Author

@srinivas-pokala Thanks for working on this. I looked at the logs on our end for error details.

I'm seeing 403s that are failing because the bot ID being reported is "go-s390x01" instead of the expected "linux-s390x-ibm", which results in a "Bot ID doesn't match the token used" error. Can you check if the bootstrapswarm binary you're using is the latest version available in x/build? Looking at its code, if the -hostname linux-s390x-ibm is being propagated correctly, it should be sent to the server. As unlikely as it is, maybe you can check that the metadata.OnGCE() path isn't being taken somehow, since that does override hostname?
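
To make the precedence described here concrete, the following is a simplified sketch, not bootstrapswarm's actual code. The function resolveBotID and all of its parameters are hypothetical; it only assumes what is stated above (a GCE check that overrides the hostname, and a -hostname flag that is otherwise sent to the server), plus the assumption that the OS hostname is the fallback when the flag is empty.

```go
package main

import "fmt"

// resolveBotID is a hypothetical illustration of the precedence described
// in this comment. It is NOT the real bootstrapswarm implementation.
func resolveBotID(flagHostname string, onGCE bool, gceHostname, osHostname string) string {
	if onGCE {
		// On GCE the metadata-derived name overrides everything else.
		return gceHostname
	}
	if flagHostname != "" {
		// Off GCE, an explicit -hostname value is used as-is.
		return flagHostname
	}
	// Assumed fallback: the machine's own hostname.
	return osHostname
}

func main() {
	// Flag propagated correctly: the expected bot ID is reported.
	fmt.Println(resolveBotID("linux-s390x-ibm", false, "", "go-s390x01"))
	// Flag lost somewhere: the machine hostname leaks through instead.
	fmt.Println(resolveBotID("", false, "", "go-s390x01"))
}
```

Under these assumptions, a reported bot ID of "go-s390x01" would be consistent with the -hostname value not reaching the process that talks to the server.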

A 401 that I saw failed with the error that the token was expired 4 hrs earlier; perhaps /var/lib/luci_machine_tokend/token.json stopped being refreshed at the time you tried to run bootstrapswarm?

wyf9661 pushed a commit to wyf9661/go that referenced this issue Jan 21, 2025
TestXAESAllocations fails like golang#70448, and crypto/rand's fails in FIPS
mode. We can't keep chasing these without even a LUCI builder.

Updates golang#67307

Change-Id: I5d0edddf470180a321dec55cabfb018db62eb940
Reviewed-on: https://go-review.googlesource.com/c/go/+/636055
Auto-Submit: Filippo Valsorda <[email protected]>
Reviewed-by: Roland Shoemaker <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Carlos Amedee <[email protected]>
@srinivas-pokala
Contributor

@dmitshur Thanks for the reply.

> Can you check if the bootstrapswarm binary you're using is the latest version available in x/build?

Yes, I have checked. It is the latest version.

> Looking at its code, if the -hostname linux-s390x-ibm is being propagated correctly, it should be sent to the server. As unlikely as it is, maybe you can check that the metadata.OnGCE() path isn't being taken somehow, since that does override hostname?

I traced metadata.OnGCE(): it invokes testOnGCE(), in which the connectivity check to the GCE instance fails. I also cross-checked by pinging google.com from this machine and, surprisingly, saw 100% packet loss.

[a2@go-s390x01 ~]$ ping google.com
PING google.com(lga15s49-in-x0e.1e100.net (2607:f8b0:4006:80d::200e)) 56 data bytes
^C
--- google.com ping statistics ---
11 packets transmitted, 0 received, 100% packet loss, time 10434ms

So I suspect this could be what causes bootstrapswarm to fail, with the metadata.OnGCE() branch not being taken.

Projects
Status: In Progress
Development

No branches or pull requests

4 participants