Regression in Fedora pybind11 package caused by 7954a0514ba7de40dba6c598af830fd1b7a8bf0c #119507

Open
tstellar opened this issue Dec 11, 2024 · 6 comments
@tstellar
Collaborator

This could be the same issue as #119099. One of the test cases in the pybind11 package is failing and I've bisected it back to 7954a05. Here is a container file to reproduce the failure:

FROM registry.fedoraproject.org/fedora:41

RUN dnf install -y cmake ninja-build git binutils-devel clang

WORKDIR /root/

RUN git clone https://github.com/llvm/llvm-project

WORKDIR /root/llvm-project

RUN git checkout 7954a0514ba7de40dba6c598af830fd1b7a8bf0c

RUN cmake -G Ninja -B build -S llvm/ -DLLVM_ENABLE_PROJECTS=clang -DLLVM_TARGETS_TO_BUILD=Native -DLLVM_BINUTILS_INCDIR=/usr/include -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang

RUN ninja -C build install-clang install-clang-resource-headers install-LLVMgold install-llvm-ar install-llvm-ranlib

WORKDIR /root/

RUN dnf builddep -y pybind11

RUN git clone https://github.com/pybind/pybind11

WORKDIR /root/pybind11

RUN git checkout v2.13.6

ENV CFLAGS='-O2 -flto=thin -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong   -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' \
    CXXFLAGS='-O2 -flto=thin -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong   -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' \
    FFLAGS='-O2 -flto=thin -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong   -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' \
    FCFLAGS='-O2 -flto=thin -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS  -fstack-protector-strong   -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' \
    LDFLAGS='-Wl,-z,relro -Wl,--as-needed  -Wl,-z,pack-relative-relocs -Wl,-z,now -flto=thin -ffat-lto-objects -Wl,--build-id=sha1  ' \
    CC=clang \
    CXX=clang++

RUN /usr/bin/cmake -S . -B redhat-linux-build -DCMAKE_C_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_Fortran_FLAGS_RELEASE:STRING=-DNDEBUG -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF -DCMAKE_INSTALL_PREFIX:PATH=/usr -DINCLUDE_INSTALL_DIR:PATH=/usr/include -DLIB_INSTALL_DIR:PATH=/usr/lib64 -DSYSCONF_INSTALL_DIR:PATH=/etc -DSHARE_INSTALL_PREFIX:PATH=/usr/share -DLIB_SUFFIX=64 -DBUILD_SHARED_LIBS:BOOL=ON -B python3 -DCMAKE_BUILD_TYPE=Debug -DPYTHON_EXECUTABLE=/usr/bin/python3 -DPYBIND11_INSTALL=TRUE -DUSE_PYTHON_INCLUDE_DIR=FALSE

RUN /usr/bin/make -O -j$(nproc) V=1 VERBOSE=1 -C python3

RUN /usr/bin/python3 setup.py build '--executable=/usr/bin/python3 -sP'

RUN make -C python3 check -j$(nproc)

@thesamesam
Member

For completeness and to aid searching, could you include the details of the failure here as well, as you did for adcli? Thanks.

@tstellar
Collaborator Author

Here is the output from the failing tests:

=================================== FAILURES ===================================
________________________________ test_vectorize ________________________________

    def test_vectorize():
        n = 3
        array = m.create_rec_simple(n)
        values = m.f_simple_vectorized(array)
>       np.testing.assert_array_equal(values, [0, 10, 20])

array      = array([(False, 0, 0. , -0. ), ( True, 1, 1.5, -2.5),
       (False, 2, 3. , -5. )],
      dtype={'names': ['bool_', 'uint_', 'float_', 'ldbl_'], 'formats': ['?', '<u4', '<f4', '<f16'], 'offsets': [0, 4, 8, 16], 'itemsize': 32})
n          = 3
values     = array([0, 0, 0], dtype=uint32)

../../tests/test_numpy_dtypes.py:382:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([0, 0, 0], dtype=uint32), [0, 10, 20])
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not equal
E
E           Mismatched elements: 2 / 3 (66.7%)
E           Max absolute difference: 20
E           Max relative difference: 1.
E            x: array([0, 0, 0], dtype=uint32)
E            y: array([ 0, 10, 20])

args       = (<built-in function eq>, array([0, 0, 0], dtype=uint32), [0, 10, 20])
func       = <function assert_array_compare at 0x7f0c0bfaee80>
kwds       = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}
self       = <contextlib._GeneratorContextManager object at 0x7f0c0c86d390>

/usr/lib64/python3.13/contextlib.py:85: AssertionError
___________________________ test_vectorized_noreturn ___________________________

    def test_vectorized_noreturn():
        x = m.NonPODClass(0)
        assert x.value == 0
        m.add_to(x, [1, 2, 3, 4])
>       assert x.value == 10
E       assert 4 == 10
E        +  where 4 = <pybind11_tests.numpy_vectorize.NonPODClass object at 0x7f0c0c12feb0>.value

x          = <pybind11_tests.numpy_vectorize.NonPODClass object at 0x7f0c0c12feb0>

../../tests/test_numpy_vectorize.py:264: AssertionError
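
For readers unfamiliar with the two tests, the expected values follow from applying a scalar function once per element. The sketch below models that in plain C++. It is not pybind11's `py::vectorize`, and `Accumulator`, `times_ten`, `map_each`, `add_each` and `check_expected` are hypothetical names. The assumption that the vectorized function multiplies the `uint_` field by 10 is inferred only from the expected output `[0, 10, 20]` for `uint_` values 0, 1, 2.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the object mutated in test_vectorized_noreturn.
struct Accumulator {
    int value = 0;
};

// Assumed scalar kernel, inferred from the expected output [0, 10, 20].
inline std::uint32_t times_ten(std::uint32_t x) { return x * 10u; }

// Apply the scalar kernel to every element and collect the results.
inline std::vector<std::uint32_t> map_each(const std::vector<std::uint32_t>& in) {
    std::vector<std::uint32_t> out;
    out.reserve(in.size());
    for (std::uint32_t x : in) out.push_back(times_ten(x));
    return out;
}

// A kernel without a return value is still invoked once per element.
inline void add_each(Accumulator& acc, const std::vector<int>& xs) {
    for (int x : xs) acc.value += x;
}

// Usage: the values the two tests expect.
inline void check_expected() {
    assert((map_each({0, 1, 2}) == std::vector<std::uint32_t>{0, 10, 20}));
    Accumulator acc;
    add_each(acc, {1, 2, 3, 4});
    assert(acc.value == 10);
}
```

In the failing run the first test returned all zeros and the second left `value` at 4, which is the last input element alone rather than the sum.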

@fhahn
Contributor

fhahn commented Dec 11, 2024

@tstellar I am trying to build this with TypeSanitizer, but I am hitting the following error. Do you know where to add the sanitizer library?

make -C python3 check -j$(nproc)
....
ImportError: /root/build/pybind11/python3/tests/pybind11_tests.cpython-313-x86_64-linux-gnu.so: undefined symbol: __tysan_shadow_memory_address
ImportError while loading conftest '/root/build/pybind11/tests/conftest.py'.
../../tests/conftest.py:22: in <module>
    import pybind11_tests
E   ImportError: /root/build/pybind11/python3/tests/pybind11_tests.cpython-313-x86_64-linux-gnu.so: undefined symbol: __tysan_shadow_memory_address

@tstellar
Collaborator Author

@fhahn Did you add it to the LDFLAGS? Sometimes the Python modules use the same flags that were used to link the interpreter, so it is hard to change them. How can I test this out myself? I tried pulling from users/fhahn/tysan-a-type-sanitizer-runtime-library, but the compile jobs seem to hang with -fsanitize=type.

@fhahn
Contributor

fhahn commented Dec 17, 2024

Unfortunately the instrumentation currently is quite heavy, so it might run into a case where it is stuck generating and optimizing a lot of IR for a large input file.

@gbMattN has been working on improvements that move more of the checks into the runtime library, reducing the amount of IR added for instrumentation (https://github.com/gbMattN/llvm-project/tree/users/gbmattn/tysan-reduce-inlining)

Now that the initial patches have started landing, hopefully we can try it on that case, assuming the compile did not eventually finish.

@tstellar
Collaborator Author

I was able to get the type sanitizer to work. It looks like there is a type-aliasing violation:

==19617==ERROR: TypeSanitizer: type-aliasing-violation on address 0x555e505acd98 (pc 0x555e503fad90 bp 0x7fff5de32690 sp 0x7fff5de32638 tid 19617)
READ of size 8 at 0x555e505acd98 with type p1 _ZTSSt9type_info accesses an existing object of type std::array<std::type_info const*, 1ul>
    #0 0x555e503fad8f in pybind11::cpp_function::initialize_generic(std::unique_ptr<pybind11::detail::function_record, pybind11::cpp_function::InitializingFunctionRecordDeleter>&&, char const*, std::type_info const* const*, unsigned long) /root/rpmbuild/BUILD/pybind11-2.13.6-build/pybind11-2.13.6/python3/mock_install/include/pybind11/pybind11.h:484:68
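
The report names a pointer-typed read that lands inside a `std::array<std::type_info const*, 1>`. A minimal sketch of that access shape follows, assuming the array is handed to the callee as a raw `std::type_info const* const*`, as the `initialize_generic` signature in the trace suggests. It is not pybind11's code: `count_types` and `count_int_signature` are hypothetical names, and the thread does not establish whether the report reflects a genuine violation or a limitation of the instrumentation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <typeinfo>

// Hypothetical callee: receives the array's elements as a raw pointer and
// reads each `const std::type_info*` slot, the access shape named in the report.
inline std::size_t count_types(const std::type_info* const* types, std::size_t n) {
    assert(types != nullptr || n == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (types[i] != nullptr) ++count;  // pointer-typed read inside the std::array object
    }
    return count;
}

// Hypothetical caller: owns the storage as a std::array and passes data().
inline std::size_t count_int_signature() {
    static const std::array<const std::type_info*, 1> types = {{&typeid(int)}};
    return count_types(types.data(), types.size());
}
```

Here `count_int_signature()` returns 1. The element read inside `count_types` has the same shape as the access the sanitizer describes: a read of pointer type into storage whose declared type is the enclosing `std::array`.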
