Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-115399: Upgrade bundled libexpat to 2.6.0 #115431

Merged
merged 2 commits into from
Feb 14, 2024

Conversation

sethmlarson
Copy link
Contributor

@sethmlarson sethmlarson commented Feb 13, 2024

Closes #115399

  • Upgrades libexpat to 2.6.0
  • Adds XML_GE definition to LIBEXPAT_CFLAGS since 2.6.0 requires us to choose one, I set it to 1 to keep general entities support. Our test suite didn't fail either way with this setting but I'm assuming we want to keep support?

cc @hartwork

configure.ac Outdated
@@ -3720,7 +3720,7 @@ AS_VAR_IF([with_system_expat], [yes], [
LIBEXPAT_LDFLAGS=${LIBEXPAT_LDFLAGS-"-lexpat"}
LIBEXPAT_INTERNAL=
], [
LIBEXPAT_CFLAGS="-I\$(srcdir)/Modules/expat"
LIBEXPAT_CFLAGS="-I\$(srcdir)/Modules/expat -DXML_GE=1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's that?

Copy link
Contributor Author

@sethmlarson sethmlarson Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to make a choice, error message from the Windows builds that are now failing cuz I need to update them as well:

XML_GE (for general entities) must be defined, non-empty, either 1 or 0 (0 to disable, 1 to enable; 1 is a common default

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably update the default value in line 3719 as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's that?

@ambv "GE" is short for general entities. The flag is documented in detail at https://libexpat.github.io/doc/api/latest/#XML_GE .

@sethmlarson I would have expected (meaning considered likely) a new line #define XML_GE 1 to go into CPython's own expat_config.h. Did you avoid that file on purpose? Its sibling XML_DTD seems to live there.

You should probably update the default value in line 3719 as well.

For anyone else wondering where (or what the line content is), that seems to be:

LIBEXPAT_CFLAGS=${LIBEXPAT_CFLAGS-""}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hartwork Thank you for this, I don't know how but my grep XML_DTD turned up no results. That's definitely where it should go.

@hartwork
Copy link
Contributor

[..] I set it to 1 to keep general entities support. Our test suite didn't fail either way with this setting but I'm assuming we want to keep support?

@sethmlarson good call, yes. On a sidenote XML_DTD requires XML_GE so it's the only choice here with XML_DTD active anyway. Good 👍

@sethmlarson
Copy link
Contributor Author

Thanks for the guidance @hartwork! I rearranged the configuration to do as you said.

I also split the update itself into a separate commit from the SBOM update (which removes expat/expat_config.h from being a part of the expat project) so it's easier to backport to older branches. This PR can directly be backported to 3.12 but 3.11 and earlier will need only a011b5f

Comment on lines 67 to 71

/* Namespace external symbols to allow multiple libexpat version to
co-exist. */
#include "pyexpatns.h"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sethmlarson could it be that this removal is accidental?

PS: Here's the Dockerfile that made me spot this, based on earlier version #98742 (review):

# Copyright (c) 2022-2024 Sebastian Pipping <[email protected]>
# Licensed under the Apache License version 2.0

FROM alpine
RUN apk add --update \
            diffutils \
            git \
            sed \
        && \
    git clone --depth 1 https://github.com/python/cpython cpython-main \
        && \
    ( cd cpython-main && git rev-parse HEAD ) \
        && \
    git clone --depth 1 --branch libexpat-2.6.0 https://github.com/sethmlarson/cpython cpython-pr \
        && \
    ( cd cpython-pr && git rev-parse HEAD ) \
        && \
    git config --global advice.detachedHead false \
        && \
    git clone --depth 1 --branch R_2_5_0 https://github.com/libexpat/libexpat libexpat_2_5_0 \
        && \
    git clone --depth 1 --branch R_2_6_0 https://github.com/libexpat/libexpat libexpat_2_6_0 \
        && \
    diff -r -u libexpat_2_5_0/expat/lib/ cpython-main/Modules/expat/ | tee 2-5-0.diff \
        && \
    diff -r -u libexpat_2_6_0/expat/lib/ cpython-pr/Modules/expat/ | tee 2-6-0.diff \
        && \
    sed -e '/^Only in /d' -e '/^\(+++\|---\) /d' -e '/^diff /d' -i 2-5-0.diff 2-6-0.diff \
        && \
    diff -u 2-5-0.diff 2-6-0.diff \
        && \
    echo 'Diff is good.'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hugovk It feels like now we have to automate this diff comparison. Do you have any good ideas on when to trigger the workflow smartly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corona10 did you mean me or @hugovk?

To automate this, you'd need to extract an old and a new version from the diff between the push HEAD and the base commit and so on. I'm not sure if Expat releases are frequent enough to build automation like that and keep it working.

Pull request #100539 is about something related but different, but maybe still of interest to you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corona10 PS: maybe a cheap version to cover the core need (unless misunderstood) would be to make CI assert that that file still includes file pyexpatns.h using grep? That would be orders of magnitude simpler 😃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this, I've updated my commit to not remove these lines.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hartwork
I meant @hugovk since he has good insight into automation workflow in CPython.
But in any case, the link you introduced looks cool to me, I think that it would be worth to take a look at :)

@sethmlarson sethmlarson added the needs backport to 3.12 bug and security fixes label Feb 14, 2024
@Yhg1s Yhg1s enabled auto-merge (squash) February 14, 2024 15:45
@sethmlarson
Copy link
Contributor Author

Backport to 3.11 and prior of this PR: #115468

3.12 will be handled automatically once this PR merges.

@Yhg1s Yhg1s merged commit 4b2d178 into python:main Feb 14, 2024
44 checks passed
@miss-islington-app
Copy link

Thanks @sethmlarson for the PR, and @Yhg1s for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Feb 14, 2024
(cherry picked from commit 4b2d178)

Co-authored-by: Seth Michael Larson <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Feb 14, 2024

GH-115469 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Feb 14, 2024
@sethmlarson sethmlarson deleted the libexpat-2.6.0 branch February 14, 2024 18:08
@gpshead gpshead added the type-security A security issue label Feb 14, 2024
gpshead pushed a commit that referenced this pull request Feb 14, 2024
)

gh-115399: Upgrade bundled libexpat to 2.6.0 (GH-115431)
(cherry picked from commit 4b2d178)

Co-authored-by: Seth Michael Larson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-security A security issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Please upgrade bundled Expat to 2.6.0 (e.g. for the fix to CVE-2023-52425)
7 participants