Use upstream prow agent images #219

Closed
mkumatag opened this issue Aug 24, 2021 · 16 comments
Labels: kind/feature, priority/backlog

Comments

@mkumatag
Member

A change was recently merged upstream to build the prow agents for the ppc64le platform, so this is the right time to switch our prow to the upstream agent images instead of the ones we build ourselves.

ref: kubernetes/test-infra#23293 (comment)

/cc @Rajalakshmi-Girish

@mkumatag
Member Author

/assign @Rajalakshmi-Girish

@Rajalakshmi-Girish
Collaborator

Rajalakshmi-Girish commented Aug 30, 2021

As the ppc64le images have only recently become available in gcr.io/k8s-prow, I suppose we need to upgrade our prow version and then use the ppc64le tag when pointing to the utility images.

I hope changing the values to something like gcr.io/k8s-prow/clonerefs:v20210827-98f54bde95-ppc64le at https://github.com/ppc64le-cloud/test-infra/blob/master/config/prow/config.yaml#L72 will work.

@mkumatag Please let me know whether I should upgrade the prow infra on the IKS cluster.

@mkumatag
Member Author

Let's upgrade the prow controller to the latest available version. With that you can use the utility images from upstream directly (no need to include the ppc64le tag, as they are all fat manifest images).

@Rajalakshmi-Girish
Collaborator

The ppc64le images from upstream (e.g. gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le) throw the error below:

Warning  Failed     65s   kubelet            Error: failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/initupload": stat /initupload: no such file or directory: unknown

We don't see this with the images that we built on ppc64le. (quay.io/powercloud/initupload:v20210830-fcbb70627f)
@mkumatag Please let me know if this is to be reported upstream.

Details of the pod with its pod utilities pointing to the upstream images:
failed_pod_utility_ppc64le_image.log
For testing, I used the tag v20210830-fcbb70627f-ppc64le, as there is no fat manifest upstream.
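
Whether a given tag is a fat manifest can be checked from the manifest JSON itself. The sketch below is not from this thread: it assumes the JSON comes from something like `docker manifest inspect`, and `is_manifest_list` is a hypothetical helper name. It only looks for the Docker manifest-list or OCI image-index media type.

```shell
#!/bin/sh
# Sketch: succeed when the manifest JSON on stdin describes a multi-arch
# ("fat") manifest, i.e. a Docker manifest list or an OCI image index.
# is_manifest_list is a hypothetical helper, not a Docker or prow command.
is_manifest_list() {
    grep -Eq '"mediaType"[[:space:]]*:[[:space:]]*"application/vnd\.(docker\.distribution\.manifest\.list\.v2|oci\.image\.index\.v1)\+json"'
}

# Possible usage (needs registry access and a Docker CLI that provides the
# "manifest" subcommand):
#   docker manifest inspect gcr.io/k8s-prow/initupload:v20210830-fcbb70627f \
#       | is_manifest_list && echo "fat manifest" || echo "not a fat manifest"
```

An empty input (for example when the inspect command itself fails) is also reported as "not a fat manifest", so the inspect exit status is worth checking separately.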

@mkumatag
Member Author

mkumatag commented Sep 1, 2021

Sure. See if you can run that image manually and check whether it throws the same error.

@Rajalakshmi-Girish
Collaborator

@mkumatag When I manually run a single container from that image with a shell as the entrypoint, I get no error.

[root@test-calico-dwnload-ratelimit ~]# docker run --entrypoint "/bin/sh" -it gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le
/app/prow/cmd/initupload/app-ppc64le.binary.runfiles/io_k8s_test_infra # ls
external  prow
/app/prow/cmd/initupload/app-ppc64le.binary.runfiles/io_k8s_test_infra # cd /
/ # ls
app         dev         home        lib         mnt         proc        run         srv         tmp         var
bin         etc         initupload  media       opt         root        sbin        sys         usr
/ #

@mkumatag
Member Author

mkumatag commented Sep 2, 2021

Did you try running the command that it tried to launch, e.g. initupload?

@Rajalakshmi-Girish
Collaborator

Rajalakshmi-Girish commented Sep 2, 2021

@mkumatag Yes, we get the same error as prow when /initupload is run.
Note: I do see an entry named initupload in / when I enter the container with /bin/sh.

[root@rajalakshmi-workspace1 ~]# docker run --entrypoint "/initupload" -it gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le
Unable to find image 'gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le' locally
v20210830-fcbb70627f-ppc64le: Pulling from k8s-prow/initupload
0ff902055236: Already exists
6b6905b8b6cc: Already exists
f89ce5ebaa03: Already exists
3148998b117c: Pull complete
8e5b170ec95b: Pull complete
Digest: sha256:bf39195bec69b18b70967862f8e1820f03246be06960e7730e160bc58a863365
Status: Downloaded newer image for gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/initupload": stat /initupload: no such file or directory: unknown.

The same error appears when we run an image built for the wrong architecture on an amd64 machine.

[root@afferent1 ~]# docker run --entrypoint "/initupload" -it gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-arm64
Unable to find image 'gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-arm64' locally
v20210830-fcbb70627f-arm64: Pulling from k8s-prow/initupload
fd3acdcea568: Pull complete
c870bdb2572e: Pull complete
e0ed68c04014: Pull complete
66b25a3e7ac1: Pull complete
8e5b170ec95b: Pull complete
Digest: sha256:556e6459c5e0a7ab59848a99bad062cd5cd016be0232a90769c17ea90ed4f52d
Status: Downloaded newer image for gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-arm64
WARNING: The requested image's platform (linux/arm64) does not match the detected host platform (linux/amd64) and no specific platform was requested
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/initupload": stat /initupload: no such file or directory: unknown.

Hence I suppose the upstream ppc64le-tagged images are built for a different architecture from ours, and that is what causes the error.
I hope we can report upstream what we are seeing on our ppc64le machine.
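
One way to test the architecture hypothesis is to compare the architecture recorded in the image metadata with the host's. This is a minimal sketch: `to_docker_arch` and `arch_matches` are hypothetical helper names, and the mapping covers only a few common `uname -m` values.

```shell
#!/bin/sh
# Map a `uname -m` machine name to the architecture name Docker uses.
# Unknown values are passed through unchanged.
to_docker_arch() {
    case "$1" in
        x86_64)        echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        ppc64le)       echo ppc64le ;;
        s390x)         echo s390x ;;
        *)             echo "$1" ;;
    esac
}

# Succeed when the host machine name ($1) corresponds to the image
# architecture ($2).
arch_matches() {
    [ "$(to_docker_arch "$1")" = "$2" ]
}

# Possible usage on the test machine:
#   img_arch=$(docker image inspect --format '{{.Architecture}}' \
#       gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le)
#   arch_matches "$(uname -m)" "$img_arch" || echo "architecture mismatch"
```

A match here only shows what the image metadata claims. It does not prove that the entrypoint binary inside the image exists or runs.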

@mkumatag
Member Author

mkumatag commented Sep 2, 2021

What does the file /initupload command say?

@Rajalakshmi-Girish
Collaborator

Did you mean to try this?

[root@rajalakshmi-workspace1 ~]# docker run --entrypoint "file /initupload" -it gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "file /initupload": stat file /initupload: no such file or directory: unknown.
ERRO[0001] error waiting for container: context canceled
[root@rajalakshmi-workspace1 ~]#

@mkumatag
Member Author

mkumatag commented Sep 2, 2021

I'm not sure whether it will let you run that. Instead, get a shell inside the container and check the binary's architecture with the file command, and also run ldd to see what its shared library dependencies are.

@Rajalakshmi-Girish
Collaborator

@mkumatag In the ppc64le image, file cannot report any details of the initupload binary, but in the amd64 image it can.

Image: gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le

[root@rajalakshmi-workspace1 ~]# docker run --entrypoint "/bin/sh" -it gcr.io/k8s-prow/initupload:v20210830-fcbb70627f-ppc64le
/app/prow/cmd/initupload/app-ppc64le.binary.runfiles/io_k8s_test_infra # cd /
/ # file initupload
/bin/sh: file: not found
/ # apk add file
fetch https://dl-cdn.alpinelinux.org/alpine/v3.14/main/ppc64le/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.14/community/ppc64le/APKINDEX.tar.gz
(1/2) Installing libmagic (5.40-r1)
(2/2) Installing file (5.40-r1)
Executing busybox-1.33.1-r3.trigger
OK: 35 MiB in 33 packages
/ # file initupload
initupload: broken symbolic link to /app/prow/cmd/initupload/app.binary
/ # file /app/prow/cmd/initupload/app.binary
/app/prow/cmd/initupload/app.binary: cannot open `/app/prow/cmd/initupload/app.binary' (No such file or directory)
/ # ldd /initupload
/lib/ld-musl-powerpc64le.so.1: cannot load /initupload: No such file or directory
/ #

Image: gcr.io/k8s-prow/initupload:v20210830-fcbb70627f

[root@afferent1 ~]# docker run --entrypoint "/bin/sh" -it gcr.io/k8s-prow/initupload:v20210830-fcbb70627f
/app/prow/cmd/initupload/app.binary.runfiles/io_k8s_test_infra # cd /
/ # file --help
/bin/sh: file: not found
/ # apk add file
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/2) Installing libmagic (5.38-r0)
(2/2) Installing file (5.38-r0)
Executing busybox-1.31.1-r16.trigger
OK: 12 MiB in 17 packages
/ # file /initupload
/initupload: symbolic link to /app/prow/cmd/initupload/app.binary
/ # file /app/prow/cmd/initupload/app.binary
/app/prow/cmd/initupload/app.binary: symbolic link to /app/prow/cmd/initupload/app.binary.runfiles/io_k8s_test_infra/prow/cmd/initupload/app.binary_/app.binary
/ # file /app/prow/cmd/initupload/app.binary.runfiles/io_k8s_test_infra/prow/cmd/initupload/app.binary_/app.binary
/app/prow/cmd/initupload/app.binary.runfiles/io_k8s_test_infra/prow/cmd/initupload/app.binary_/app.binary: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=redacted, not stripped
/ #  ldd /initupload
/lib/ld-musl-x86_64.so.1: /initupload: Not a valid dynamic program
/ #
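
The `file` output above points at a dangling symlink in the ppc64le image rather than at a wrong-architecture binary. The same check can be done without installing `file`, by following the link chain one hop at a time. This is a minimal sketch, assuming a shell with `readlink` available; `trace_symlink` is a hypothetical helper name, not part of prow.

```shell
#!/bin/sh
# Follow a chain of symlinks one hop at a time, printing each hop, and
# report whether the final target exists. Returns 1 if the chain ends at a
# missing path and 2 if it is suspiciously long (possible loop).
trace_symlink() {
    path=$1
    hops=0
    while [ -L "$path" ]; do
        target=$(readlink "$path")
        # Resolve a relative target against the directory of the link.
        case "$target" in
            /*) ;;
            *)  target=$(dirname "$path")/$target ;;
        esac
        echo "$path -> $target"
        path=$target
        hops=$((hops + 1))
        if [ "$hops" -ge 40 ]; then
            echo "too many links: $path"
            return 2
        fi
    done
    if [ -e "$path" ]; then
        echo "resolved: $path"
    else
        echo "missing: $path"
        return 1
    fi
}
```

Run inside the ppc64le image as `trace_symlink /initupload`, this would be expected to stop at /app/prow/cmd/initupload/app.binary, the target that `file` reported as missing.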

@mkumatag
Member Author

mkumatag commented Sep 2, 2021

Let's file an issue in k/test-infra.

@Rajalakshmi-Girish
Collaborator

Rajalakshmi-Girish commented Sep 2, 2021

I will also check the other utility images (clonerefs, sidecar, entrypoint) and then file the issue.

@Rajalakshmi-Girish
Collaborator

@mkumatag Please take a look at kubernetes/test-infra#23449.

@mkumatag added the kind/feature and priority/backlog labels on Dec 10, 2021
@mkumatag
Member Author

This is fixed now, hence closing this issue.
