Python 3.13: test_xml_etree fails on expat 2.2.5 (RHEL8 / Rocky linux 8 / Ubuntu 22.04) #125067
Comments
Are you able to test with pre-releases, or attempt git bisect to help us narrow down the issue? A |
Yes - for the pre-releases, are there specific versions I should attempt? |
I would attempt a manual bisection: start with e.g. beta 1, then iterate depending on whether the failures are present. |
I've reproduced the Oddly, the |
The failing tests ( |
Have you tried with a more recent version of expat (2.6+)? Do the tests succeed? |
Just did - |
+1 to @JacobHenner – I have the exact same issue with test_hashlib. For 3.13.0 in CI, tests are failing both on rhel and debian. However, for all versions from 3.8 to 3.12, the tests pass without errors. |
What are you using to build your containers? Docker, podman, Kaniko? |
I'm building with Kaniko. However, there is an important difference from your case: only |
I suspect this is related. We're also building with Kaniko in the cases where it's failing. I'm analyzing this further now.
Which distro are you using, and with which version of expat? |
We're seeing test_set fail in a Docker build on Alpine 3.19 w/ expat 2.6.3. test_memoryview is also failing. |
I think my previous comments about For comparison, the
Can you share the specific cases that are failing? |
RHEL 8 (and so I assume Rocky Linux 8) has not updated expat, but it has patched it instead. The failing tests fail only because they check the expat version:
cpython/Lib/test/test_pyexpat.py Line 801 in eafd14f
Line 1253 in eafd14f
cpython/Lib/test/test_xml_etree.py Line 1740 in eafd14f
|
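To make the version check described above concrete, here is a minimal Python sketch of how a test can be gated on the Expat version that Python reports. `pyexpat.version_info` and `pyexpat.EXPAT_VERSION` are real attributes; the helper name `expects_reparse_deferral` and the constant `REPARSE_DEFERRAL_SINCE` are made up for illustration and are not taken from the CPython test suite. It assumes, as discussed in this thread, that the behavior change arrived in Expat 2.6.0.

```python
import pyexpat

# Assumption from this thread: the relevant behavior change landed in Expat 2.6.0.
REPARSE_DEFERRAL_SINCE = (2, 6, 0)


def expects_reparse_deferral(version_info=pyexpat.version_info):
    """Hypothetical helper: guess the parser behavior from the version number alone."""
    return tuple(version_info) >= REPARSE_DEFERRAL_SINCE


if __name__ == "__main__":
    # EXPAT_VERSION is a string, version_info a (major, minor, micro) tuple.
    print(pyexpat.EXPAT_VERSION, pyexpat.version_info, expects_reparse_deferral())
```

On a distribution that backports the 2.6 change into 2.2.5, the reported version stays `(2, 2, 5)`, so a gate like this selects the old expectation even though the library may behave like the newer one.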
I'm gradually trimming the Dockerfile down to the minimal example. I've stopped at the latest version of debian:bookworm, but the problem also reproduces on The presence or absence of expat also does not affect the reproducibility of the error with |
Here's my Dockerfile in case anyone wants to verify the failing tests:
FROM alpine:3.19
ARG INSTALL_VERSION=3.13.0
ENV PATH="/usr/local/bin:${PATH}"
ENV LANG=C.UTF-8
# if this is called "PIP_VERSION", pip explodes with "ValueError: invalid truth value '<VERSION>'"
# https://pypi.org/project/pip/
ARG PYTHON_PIP_VERSION="24.2"
ARG SETUPTOOLS_VERSION="75.1.0"
ARG PYTHONDONTWRITEBYTECODE=1
SHELL ["/bin/ash", "-eo", "pipefail", "-c"]
# hadolint ignore=DL3003
RUN --mount=type=cache,target=/var/cache/apk \
--mount=type=cache,target=/tmp \
--mount=type=cache,target=/usr/src/python \
# Install fetch dependencies
apk add --no-cache --virtual .fetch-deps \
tar~=1.35 \
xz~=5.4.5 \
\
&& mkdir -p /usr/src/python \
# Fetch installation
&& wget -q -O /tmp/python.tar.xz "https://www.python.org/ftp/python/${INSTALL_VERSION%%[a-z]*}/Python-$INSTALL_VERSION.tar.xz" \
&& tar -xJC /usr/src/python --strip-components=1 -f /tmp/python.tar.xz \
\
# Delete fetch dependencies
&& apk del --no-network .fetch-deps && \
\
# Install build dependencies
apk add --no-cache --virtual .build-deps \
bluez-dev~=5.70 \
bzip2-dev~=1.0.8 \
coreutils~=9.4 \
dpkg-dev~=1.22.1 \
dpkg~=1.22 \
expat-dev~=2.6.3 \
findutils~=4.9.0 \
gcc~=13.2.1 \
gdbm-dev~=1.23 \
gnupg~=2.4.4 \
libc-dev~=0.7.2 \
libffi-dev~=3.4.4 \
libnsl-dev~=2.0.1 \
make~=4.4 \
ncurses-dev~=6.4 \
openssl-dev~=3.1.7 \
pax-utils~=1.3.7 \
readline-dev~=8.2 \
sqlite-dev~=3.44 \
tcl-dev~=8.6.13 \
tk-dev~=8.6.13 \
tk~=8.6.13 \
xz-dev~=5.4.5 && \
\
# Install dependencies
apk add --no-cache \
expat~=2.6.3 \
# CVE-2022-1304
libcom_err~=1.47 \
libuuid~=2.39 \
openssl~=3.1.7 \
ssl_client~=1.36 \
\
# Build Python
&& cd /usr/src/python \
&& gnuArch="$(dpkg-architecture --query DEB_BUILD_GNU_TYPE)" \
&& ./configure \
--build="$gnuArch" \
--enable-loadable-sqlite-extensions \
--enable-optimizations \
--with-lto \
--enable-option-checking=fatal \
--enable-shared \
--with-system-expat \
--without-ensurepip \
&& make -j "$(nproc)" \
# set thread stack size to 1MB so we don't segfault before we hit sys.getrecursionlimit()
# https://github.com/alpinelinux/aports/commit/2026e1259422d4e0cf92391ca2d3844356c649d0
EXTRA_CFLAGS="-DTHREAD_STACK_SIZE=0x100000" \
LDFLAGS="-Wl,--strip-all" \
&& make install \
\
# Install run dependencies
&& find /usr/local -type f -executable -not \( -name '*tkinter*' \) -exec scanelf --needed --nobanner --format '%n#p' '{}' ';' \
| tr ',' '\n' \
| sort -u \
| awk 'system("[ -e /usr/local/lib/" $1 " ]") == 0 { next } { print "so:" $1 }' \
| xargs -rt apk add --no-cache --virtual .python-rundeps \
\
# Delete build dependencies
&& apk del --no-network .build-deps \
\
# Clean up installation
&& find /usr/local -depth \
\( \
\( -type d -a \( -name test -o -name tests -o -name idle_test \) \) \
-o \
\( -type f -a \( -name '*.pyc' -o -name '*.pyo' \) \) \
\) -exec rm -rf '{}' + \
\
# Test for proper python installation
&& python3 --version | grep "${INSTALL_VERSION}" \
\
# Perform linking
&& ln -s /usr/local/bin/idle3 /usr/local/bin/idle \
&& ln -s /usr/local/bin/pydoc3 /usr/local/bin/pydoc \
&& ln -s /usr/local/bin/python3 /usr/local/bin/python \
&& ln -s /usr/local/bin/python3-config /usr/local/bin/python-config \
\
# Install PIP
&& wget -q -O /tmp/get-pip.py "https://bootstrap.pypa.io/get-pip.py" \
&& python /tmp/get-pip.py --disable-pip-version-check --no-cache-dir --no-compile \
"setuptools==${SETUPTOOLS_VERSION}" \
"pip==${PYTHON_PIP_VERSION}"
CMD ["python3"]
HEALTHCHECK NONE |
It turns out that the |
I'm seeing These tests pass for us on Ubuntu 20.04 and 24.04, just not 22.04. We build with
Example failing build log: The Dockerfile/scripts used to build: |
I now believe the title of this issue is misleading. I've just confirmed that The specific test cases failures should probably be tracked separately. It's possible that users will see new build failures in other environments where the tests had always been failing, but not breaking the build.
|
Strange, we have Debian and RHEL buildbots and the whole test suite passes there. |
The version checks not matching Ubuntu's backported versions appear to be the cause in one case (the failure in For failure output and links to the full logs, see: Also, cross-linking some prior history in this area: |
In the meantime we've had to resort to disabling the three affected tests on Ubuntu 22.04 with Python 3.13.0: |
What are the next steps here? It would be great to have a fix for this for Python 3.13.1 so we don't need custom patches to build from source with PGO enabled :-) |
Can someone check whether reverting e5e4033 fixes the issue? |
Ah, RHEL8 buildbots don't use When using
Reverting the change doesn't change anything for expat 2.2.5. Before this change, test_simple_xml_chunk_1() and test_simple_xml_chunk_5() were skipped on expat 2.6 and newer. |
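For readers unfamiliar with what chunked-feed tests exercise, the following is a rough sketch (not a copy of the CPython tests) of feeding a document to `XMLPullParser` a few characters at a time. The function name `feed_in_chunks` and the sample document are illustrative assumptions. `flush()` is looked up with `getattr` because it may not be present on every Python version.

```python
from xml.etree.ElementTree import XMLPullParser

# Illustrative document, not the one used by the CPython tests.
SAMPLE = "<root><element key='value'>text</element></root>"


def feed_in_chunks(document, chunk_size):
    """Feed `document` piecewise and report which end events arrive when."""
    parser = XMLPullParser(events=("end",))
    early = []  # end events delivered while data is still being fed
    for start in range(0, len(document), chunk_size):
        parser.feed(document[start:start + chunk_size])
        early.extend(elem.tag for _, elem in parser.read_events())
    # flush() asks the parser to process buffered data, where available.
    flush = getattr(parser, "flush", None)
    if flush is not None:
        flush()
    parser.close()
    # Events still pending at close() can be read afterwards.
    late = [elem.tag for _, elem in parser.read_events()]
    return early, late


if __name__ == "__main__":
    for size in (1, 5):
        print(size, feed_in_chunks(SAMPLE, size))
```

How the events split between `early` and `late` may differ depending on the Expat build Python is linked against, which appears to be the kind of difference the failing tests are sensitive to. The combined order should be the same either way.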
Please open a separate issue for test_generators and test_hashlib. |
@serhiy-storchaka: Would you mind having a look at this expat issue? Tests fail on old expat 2.2.5. |
There is no way to know whether reparse deferral was enabled in Expat. I do not see how we can fix this on our side without skipping some tests on vanilla Expat < 2.6.0. I think that if you use a patched Expat, you should also patch Python (at least the tests) and all user code which depends on the old Expat behavior. @hartwork, I'm afraid this issue will haunt us for years. |
expat 2.2.5 was released on November 1, 2017. expat 2.6.0 was released on February 6, 2024. What do you think of skipping the 3 failing tests on expat < 2.6.0? |
As far as I understand, this is not a vanilla Expat 2.2.5, but Expat 2.2.5 patched with some changes from 2.6. We do not want to skip tests on platforms that use an Expat < 2.6 which does not include such changes (that may be the majority of computers for now). I suggest skipping them only on platforms where the patched old Expat is used. We cannot know what changes were backported in a particular distribution, so we should leave this to the maintainers of these distributions. |
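As a sketch of what such a distribution-side skip could look like, the snippet below skips a test only when the reported Expat is older than 2.6.0 and the packager has explicitly declared that it carries backports. The environment variable `EXPAT_BACKPORTED_REPARSE_DEFERRAL`, the helper `has_declared_backport` and the example test class are all hypothetical; nothing like them exists in CPython, and a real distribution would choose its own mechanism (most likely a patch to the test files).

```python
import os
import unittest

import pyexpat

# Hypothetical variable name; a distribution would pick its own mechanism.
BACKPORT_FLAG = "EXPAT_BACKPORTED_REPARSE_DEFERRAL"


def has_declared_backport(version_info=pyexpat.version_info, environ=os.environ):
    """True only for a pre-2.6.0 Expat that the packager has flagged as patched."""
    return tuple(version_info) < (2, 6, 0) and environ.get(BACKPORT_FLAG) == "1"


# Reusable decorator: vanilla old Expat and Expat >= 2.6.0 still run the test.
skip_on_backported_expat = unittest.skipIf(
    has_declared_backport(),
    "system Expat is older than 2.6.0 but carries backported 2.6 changes",
)


class ExampleTest(unittest.TestCase):
    @skip_on_backported_expat
    def test_depends_on_vanilla_behavior(self):
        # Placeholder body standing in for a behavior-sensitive assertion.
        self.assertTrue(True)
```

The point of keying on an explicit declaration rather than on the version alone is that the version number cannot distinguish a vanilla 2.2.5 from a patched one.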
Full ack, with both my upstream and my downstream hat on. |
I suggest closing this issue as "WONT FIX".
Please could someone edit the title of this issue to add "Ubuntu 22.04" to the affected distros list?
If that's the outcome, would you recommend we (independent maintainers of Linux builds of Python that need to run on Linux, but that are not a distro package, so we really don't want to be in the patching-packages game) stop using It sounds like at the moment that |
Done. |
@edmorley |
@hartwork The problem is that we don't really have another choice though :-) I'm not a distro maintainer (edit: that is, a "maintainer of a distribution of the Linux OS"), I've never personally used the As a temporary workaround to our Python 3.13 PGO builds failing I had to disable the affected tests: My concerns are now:
Given that:
...then stopping using The Docker Hub official Python image maintainers seem to have come to the same conclusion: |
@edmorley from what I read you're not "just building binaries": to me it sounds like you're a distributor and you're facing the cost of the backports approach (that saves you cost elsewhere, a tradeoff) and of being a distributor. I think you can pick from:
All other approaches I can think of put the effort to CPython upstream where it does not belong. |
@edmorley, I have to agree with @hartwork: yes, you are a distributor. If you deal with the problem of backporting to different versions of libexpat, welcome aboard. And yes, I could add openSUSE Leap 15.6 to the list of failing distros, for the same reason as what's described in #125067 (comment) : our We have been dealing with these errors, with varying degrees of success, using this patch https://build.opensuse.org/projects/devel:languages:python:Factory/packages/python314/files/CVE-2023-52425-libexpat-2.6.0-backport-15.6.patch?expand=1 (basically skipping the problematic tests). |
Thank you for the replies. However, there's been a bit of a misunderstanding. By "distro maintainer" I was referring to my not being a "maintainer of a distribution of the Linux OS", and not anything about being a "distributor of Python binaries". Ideally I wouldn't have to be building Python binaries at all, however:
Perhaps things will improve in the future, given recent news about Given our use-case, it seems likely we'll pick:
...especially given that this is the configuration that CPython defaults to, and tests in its own CI. |
Do you also distribute Python tests? If yes, then you should distribute modified tests which ignore such failure (on selected platforms or always). If no, then you should not worry at all. |
No we don't distribute tests to reduce resultant container image size (we build with
The failing tests with Ubuntu
(from #125067 (comment)) |
I believe this picture is missing that CPython compiled against backported system Expat will use less of the recent Expat API internally (because the CPython build system assumes that API to be missing based on its version, which is legit) and so then even outside of tests pull requests like #115623 may not work as expected any more because system Expat is behaving differently than what CPython is rightfully expecting from the given Expat version number. So if I'm not missing something it would need patching where binaries are built against backported Expat, and not just the tests but also C code (or even just C code). A cheap trick that may work is to fake |
Bug report
Bug description:
I'm attempting to build Python 3.13.0 for Rocky Linux 8, and several tests are failing:
These tests are required to succeed in order for --enable-optimizations to work as expected.
I am using the following build options:
3.9-3.12.6 do not exhibit this behavior when built with the same commands (aside from the Python version).
I'm curious whether there might be a regression for certain versions of expat, similar to #117187.
expat-devel version is expat-2.2.5-15.el8.
The specific test failures are:
CPython versions tested on:
3.13
Operating systems tested on:
Linux