Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

manylinux architecture / 1.4.0 regression ? #336

Closed
jlaine opened this issue May 4, 2020 · 27 comments
Closed

manylinux architecture / 1.4.0 regression ? #336

jlaine opened this issue May 4, 2020 · 27 comments

Comments

@jlaine
Copy link
Contributor

jlaine commented May 4, 2020

Hello!

First of all, thanks again for all the hard work that goes into cibuildwheel, I'm using it successfully to build wheels for a growing number of packages (aiortc, aioquic, PyAV, pylibsrtp, pylsqpack) and it's a lifesaver.

I noticed my Linux builds have been failing for the past couple of days which seems to coincide with the release of cibuildwheel 1.4.0. To be more specific, the i686 builds fail when linking against third-party libraries, which seems to be due to an architecture mismatch. The following output puzzles me, it looks like the i686 build is in fact using an x86_64 Python?

Look at the name of the docker image quay.io/pypa/manylinux2010_i686 vs the build directory build/lib.linux-x86_64-3.6:

2020-05-03T22:11:49.7883626Z Status: Downloaded newer image for quay.io/pypa/manylinux2010_i686:2020-04-29-0e1afc5
2020-05-03T22:11:58.3519923Z 5143d57f1cac28cce074d70d375b2601829ae6ae4600a11deeef4ffd6f938512
2020-05-03T22:11:58.3529495Z + docker cp . cibuildwheel-9a166a0e-1e64-4228-af7f-ebfe503b747f:/project
2020-05-03T22:11:58.6905700Z + docker start cibuildwheel-9a166a0e-1e64-4228-af7f-ebfe503b747f
2020-05-03T22:11:58.9714592Z cibuildwheel-9a166a0e-1e64-4228-af7f-ebfe503b747f
2020-05-03T22:11:58.9727256Z + docker exec -i cibuildwheel-9a166a0e-1e64-4228-af7f-ebfe503b747f sh -c 'cat > /constraints.txt'
2020-05-03T22:11:59.0792362Z + docker exec -i cibuildwheel-9a166a0e-1e64-4228-af7f-ebfe503b747f /bin/bash
2020-05-03T22:11:59.1929182Z     + mkdir -p /output
2020-05-03T22:11:59.1946204Z     + cd /project
2020-05-03T22:11:59.1953500Z     + PYBIN=/opt/python/cp36-cp36m/bin
2020-05-03T22:11:59.1954499Z     + export PATH=/opt/python/cp36-cp36m/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
2020-05-03T22:11:59.1955373Z     + PATH=/opt/python/cp36-cp36m/bin:/opt/rh/devtoolset-7/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
2020-05-03T22:11:59.1955905Z     + export CFLAGS=-I/tmp/vendor/include
2020-05-03T22:11:59.1956213Z     + CFLAGS=-I/tmp/vendor/include
2020-05-03T22:11:59.1956480Z     + export LDFLAGS=-L/tmp/vendor/lib
2020-05-03T22:11:59.1956722Z     + LDFLAGS=-L/tmp/vendor/lib
2020-05-03T22:11:59.1967182Z      + which pip
2020-05-03T22:11:59.1981753Z     + '[' /opt/python/cp36-cp36m/bin/pip '!=' /opt/python/cp36-cp36m/bin/pip ']'
2020-05-03T22:11:59.1983200Z      + which python
2020-05-03T22:11:59.1995385Z     + '[' /opt/python/cp36-cp36m/bin/python '!=' /opt/python/cp36-cp36m/bin/python ']'
2020-05-03T22:11:59.1996868Z     + '[' '!' -z 'python scripts/fetch-vendor /tmp/vendor' ']'
2020-05-03T22:11:59.1997570Z     + sh -c 'python scripts/fetch-vendor /tmp/vendor'
2020-05-03T22:11:59.4140202Z INFO:root:Creating directory /tmp/vendor
2020-05-03T22:11:59.4143925Z INFO:root:Downloading https://github.com/aiortc/aiortc-codecs/releases/download/1.1/codecs-manylinux_x86_64.tar.gz
2020-05-03T22:11:59.7676461Z INFO:root:Extracting codecs-manylinux_x86_64.tar.gz
2020-05-03T22:11:59.8596887Z     + rm -rf /tmp/built_wheel
2020-05-03T22:11:59.8607520Z     + mkdir /tmp/built_wheel
2020-05-03T22:11:59.8620846Z     + pip wheel /project/. -w /tmp/built_wheel --no-deps
2020-05-03T22:12:00.9943483Z Processing /project
2020-05-03T22:12:03.1490135Z Building wheels for collected packages: aiortc
2020-05-03T22:12:03.1509285Z   Building wheel for aiortc (setup.py): started
2020-05-03T22:12:03.7937791Z   ERROR: Command errored out with exit status 1:
2020-05-03T22:12:03.7938617Z   Building wheel for aiortc (setup.py): finished with status 'error'
2020-05-03T22:12:03.7939613Z    command: /opt/_internal/cpython-3.6.10/bin/python3.6 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/project/setup.py'"'"'; __file__='"'"'/project/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-q448gh23
2020-05-03T22:12:03.7939919Z        cwd: /project/
2020-05-03T22:12:03.7941876Z   Complete output (53 lines):
2020-05-03T22:12:03.7944005Z   running bdist_wheel
2020-05-03T22:12:03.7944123Z   Running setup.py clean for aiortc
2020-05-03T22:12:03.7944537Z   running build
2020-05-03T22:12:03.7944771Z   running build_py
2020-05-03T22:12:03.7944980Z   creating build
2020-05-03T22:12:03.7945460Z   creating build/lib.linux-x86_64-3.6
2020-05-03T22:12:03.7945930Z   creating build/lib.linux-x86_64-3.6/aiortc
@Czaki
Copy link
Contributor

Czaki commented May 4, 2020

Hm. I understand that it works on cibuildwheel=1.3.0?
Could You try with other version of manylinux image https://quay.io/repository/pypa/manylinux2010_i686?tag=latest&tab=tags

set with CIBW_MANYLINUX_I686_IMAGE

@jlaine
Copy link
Contributor Author

jlaine commented May 4, 2020

Hi @Czaki !

Thanks for the tip, I just tried using the same i686 image which was used for my last successful build: CIBW_MANYLINUX_I686_IMAGE: quay.io/pypa/manylinux2010_i686:2020-04-29-0e1afc5. However the build still fails in the same way.

the last successful build: https://github.com/aiortc/aiortc/runs/637823721?check_suite_focus=true
the failed build forcing the image: https://github.com/aiortc/aiortc/runs/642304761?check_suite_focus=true

On the other hand, forcing cibuildwheel 1.3.0 does succeed:

https://github.com/aiortc/aiortc/runs/642307642?check_suite_focus=true

@YannickJadoul
Copy link
Member

2020-05-03T22:11:59.4143925Z INFO:root:Downloading https://github.com/aiortc/aiortc-codecs/releases/download/1.1/codecs-manylinux_x86_64.tar.gz

How are you getting this, because that doesn't seem like 32-bit either?

I agree with @Czaki: the weird thing is that we hardly add anything extra for Linux. Most tools are in the manylinux images. But so, something seems off if 1.3.0 does work.

@YannickJadoul
Copy link
Member

YannickJadoul commented May 4, 2020

Somehow, os.uname() is returning x86_64 (cfr. a test run on my fork). I'm not yet sure how the version of cibuildwheel influences this. Any ideas, @Czaki ?

@Czaki
Copy link
Contributor

Czaki commented May 4, 2020

/opt/python/cp36-cp36m/bin/python posix.uname_result(sysname='Linux', nodename='22d9365b800b', release='5.0.0-1035-azure', version='#37-Ubuntu SMP Wed Mar 18 11:21:35 UTC 2020', machine='x86_64')

It looks like it get uname of host system (not docker)...

In docker it should be Centos.

@YannickJadoul
Copy link
Member

In docker it should be Centos.

Should it? I get Linux 59b6d2efee12 4.15.0-99-generic #100-Ubuntu SMP Wed Apr 22 20:32:56 UTC 2020 i686 i686 i386 GNU/Linux when I run uname -a in the i686 image locally. So that's still the host platform name, but at least it's 32-bit?

@Czaki
Copy link
Contributor

Czaki commented May 4, 2020

@YannickJadoul Only difference which I see is is thain in 1.3.0 build is executed with docker start and in 1.4.0 start is executed earlier and then wheel is created using docker exec

@jlaine Maybe use approach from https://github.com/aiortc/aiortc/blob/master/scripts/fetch-vendor#L19 in linux is best workaround (check size of structure instead of uname)?

I search a little about uname in docker and it looks like it returns host system in some scenario.

@YannickJadoul
Copy link
Member

Seems to be a docker issue: run vs. exec

~$ docker create --name test -it quay.io/pypa/manylinux2010_i686:2020-04-29-0e1afc5 /bin/bash
2d55e01b26aa8b7077a427f77fe83eb71cd3e4d2f48999e7f9648970f3e83c65
~$ docker start test
test
~$ docker exec -it test /bin/bash
[root@2d55e01b26aa /]# uname -a
Linux 2d55e01b26aa 4.15.0-99-generic #100-Ubuntu SMP Wed Apr 22 20:32:56 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@2d55e01b26aa /]# exit
~$ docker run -it --rm quay.io/pypa/manylinux2010_i686:2020-04-29-0e1afc5 /bin/bash
[root@403cafaa07bb /]# uname -a
Linux 403cafaa07bb 4.15.0-99-generic #100-Ubuntu SMP Wed Apr 22 20:32:56 UTC 2020 i686 i686 i386 GNU/Linux
[root@403cafaa07bb /]# exit

Do we have any Docker specialists? @joerick made these changes, I think; do you know more about Docker, @joerick?

@joerick
Copy link
Contributor

joerick commented May 4, 2020

I wonder if I somehow have overridden the ENTRYPOINT set in https://github.com/pypa/manylinux/blob/master/docker/Dockerfile-i686#L26

It looks like linux32 is a way to 'fake' an architecture: https://stackoverflow.com/questions/26490935/how-to-fake-cpu-architecture-in-docker-container

@YannickJadoul
Copy link
Member

Aaaaah, maybe this ENTRYPOINT is not executed when used through exec?

@Czaki
Copy link
Contributor

Czaki commented May 4, 2020

I wonder if I somehow have overridden the ENTRYPOINT set in https://github.com/pypa/manylinux/blob/master/docker/Dockerfile-i686#L26

It looks like linux32 is a way to 'fake' an architecture: https://stackoverflow.com/questions/26490935/how-to-fake-cpu-architecture-in-docker-container

I also think we do this. I now prepare patch.

@YannickJadoul
Copy link
Member

@Czaki No, wait! What kind of patch?

@YannickJadoul
Copy link
Member

YannickJadoul commented May 4, 2020

I think I would prefer to use run again (like in 1.3.0) over sprinkling linux32 everywhere.

@joerick
Copy link
Contributor

joerick commented May 4, 2020

Agree on referencing linux32 - if we can use the container's config to run within linux32 that would be preferable - I'd rather not have to deal with that within cibuildwheel.

I'll check the PR now.

@Czaki
Copy link
Contributor

Czaki commented May 4, 2020

I think I would prefer to use run again (like in 1.3.0) over sprinkling linux32 everywhere.

but this remove control from python to big bash script (because run create new container)

and in 1.3.0 the start is used. not run

@joerick
Copy link
Contributor

joerick commented May 4, 2020

Yeah, it looks like we won't be able to use ENTRYPOINT and exec. If I remember correctly, exec is quicker than run because the container stays running.

@YannickJadoul
Copy link
Member

Yeah, OK, now we're talking in 2 separate threads. But what about just reverting to start? Seems you changed this in #256, @joerick (c4f54c6#diff-22075e43d4ff5efd0d10f3dcffdb93aa), for faster builds, but how much faster is it? Is it worth the extra complexity?

@YannickJadoul
Copy link
Member

(Ah, also, before I forget: a PR should probably also add a test on the architecture inside the Docker image? Though I'm not sure how easy that'll be?)

@Czaki
Copy link
Contributor

Czaki commented May 4, 2020

move from start ( or run) to exec is big step in unify code between platforms.

Maybe treat #338 as bugfix for release 1.4.1 and rethink whole idea.

Mabe best option is copy python file in docker and call it instead of bash. there will be control flow in python (which allow unify in future) but can be executed in single run?

@YannickJadoul
Copy link
Member

move from start ( or run) to exec is big step in unify code between platforms.

How? I don't see a big difference between start (as we did before, and as was working before c4f54c6) and exec?

Mabe best option is copy python file in docker and call it instead of bash. there will be control flow in python (which allow unify in future) but can be executed in single run?

That's a future thing, but too much for a quick bugfix now.

@joerick
Copy link
Contributor

joerick commented May 4, 2020

Yeah, OK, now we're talking in 2 separate threads. But what about just reverting to start? Seems you changed this in #256, @joerick (c4f54c6#diff-22075e43d4ff5efd0d10f3dcffdb93aa), for faster builds, but how much faster is it? Is it worth the extra complexity?

I was substantial, as I remember. I think that long-term exec is going to serve our needs better, as @Czaki mentioned. @Czaki's solution will work as a bug fix for now, though I'd like a test for this approach too (printing the output of platform.machine() in CIBW_BEFORE_BUILD, and checking this with capfd should do the trick).

@YannickJadoul
Copy link
Member

I'm mostly worried 1) about other platforms or entrypoints (what about the non-intel manylinuxes; do these need things that affect uname? And not in the future either? Just checking for platform_tag.endswith("i686") seems too much of a quick hack, somehow) and 2) about forgetting linux32 when we change things/add docker commands.

I must admit that the changes in #338 are smaller than expected, but it still feels like there should be a better way :-(

@Czaki
Copy link
Contributor

Czaki commented May 4, 2020

I must admit that the changes in #338 are smaller than expected, but it still feels like there should be a better way :-(

My idea is to create standalone python file to control build flow in docer containre and copy it inside instead of using bash script.

other option is to extract entrypoint using docker inspect --format='{{.Config.Entrypoint}}' container_name and use it

@YannickJadoul
Copy link
Member

My idea is to create standalone python file to control build flow in docer containre and copy it inside instead of using bash script.

That idea has been around for a long time. (You don't even need to copy it in; you could probably just stream it through stdin.) But as I said before, that's too much for now. And I don't know if it will help a lot.

other option is to extract entrypoint using docker inspect --format='{{.Config.Entrypoint}}' container_name and use it

Maybe, yes. That feels very complex, though, so I don't know.

I'm mostly annoyed that Docker doesn't have some kind of alternative for ENTRYPOINT on exec. This shouldn't be our problem to manually work around by adding this entrypoint ourselves (which is why I initially preferred run/start, and still kind of find that an attractive option).

jlaine added a commit to jlaine/aiortc that referenced this issue May 4, 2020
jlaine added a commit to jlaine/aioquic that referenced this issue May 4, 2020
jlaine added a commit to jlaine/aiortc that referenced this issue May 4, 2020
jlaine added a commit to aiortc/aiortc that referenced this issue May 4, 2020
jlaine added a commit to aiortc/aioquic that referenced this issue May 4, 2020
@joerick
Copy link
Contributor

joerick commented May 4, 2020

Fixed in v1.4.1! Could you confirm this works for you @jlaine ?

@joerick joerick closed this as completed May 4, 2020
@jlaine
Copy link
Contributor Author

jlaine commented May 4, 2020

Yes, I can confirm it works @joerick :

https://github.com/aiortc/aiortc/runs/643791124?check_suite_focus=true

Thanks to all for the very fast turnaround!

@joerick
Copy link
Contributor

joerick commented May 4, 2020

Thanks @Czaki and @YannickJadoul for all the help!

PAN-Chuwen pushed a commit to PAN-Chuwen/StreamPose that referenced this issue Sep 15, 2023
volodymyrtuta added a commit to volodymyrtuta/ai-alorithm that referenced this issue Dec 25, 2023
17852833820 pushed a commit to 17852833820/AioRTC that referenced this issue Jan 20, 2024
mametaro99 pushed a commit to mametaro99/image-recognition that referenced this issue May 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants