Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in Fedora adcli package caused by 7954a0514ba7de40dba6c598af830fd1b7a8bf0c #119099

Open
tstellar opened this issue Dec 7, 2024 · 10 comments
Assignees
Labels
regression TBAA Type-Based Alias Analysis / Strict Aliasing

Comments

@tstellar
Copy link
Collaborator

tstellar commented Dec 7, 2024

I'm seeing a test case (test_ldap) for the adcli-0.9.2 Fedora package segfaulting with current main (342c8db). I've bisected the failure to commit 7954a05

The backtrace for the segfault is:

#0  0x00007eff72a6c65c in __strlen_evex () from /lib64/libc.so.6
#1  0x00007eff7299a1a3 in strdup () from /lib64/libc.so.6
#2  0x0000555b19a7095a in seq_dup (sequence=0x7ffe96d9bd40, length=<optimized out>, copy=<optimized out>)
    at /root/rpmbuild/BUILD/adcli-0.9.2-build/adcli-0.9.2/library/seq.c:323
#3  _adcli_strv_dup (strv=<optimized out>) at /root/rpmbuild/BUILD/adcli-0.9.2-build/adcli-0.9.2/library/adutil.c:171
#4  _adcli_ldap_mod_new (mod_op=0, values=0x7ffe96d9bd40, type=<optimized out>) at adldap.c:396
#5  _adcli_ldap_mod_new1 (mod_op=0, type=<optimized out>, value=<optimized out>) at adldap.c:409
#6  test_new1 () at adldap.c:491
#7  0x0000555b19a71bf9 in test_run (argc=<optimized out>, argv=<optimized out>) at test.c:231
#8  0x00007eff72915248 in __libc_start_call_main () from /lib64/libc.so.6
#9  0x00007eff7291530b in __libc_start_main_impl () from /lib64/libc.so.6
#10 0x0000555b19a70315 in _start ()

The source code for the project is here

It seems like it could be a case of UB, but I'm not sure. What other information would be helpful for reproducing?

@fhahn
Copy link
Contributor

fhahn commented Dec 8, 2024

@tstellar could you provide more details on how this project is built, i.e. how to configure the build and what flags are used?

@tstellar
Copy link
Collaborator Author

tstellar commented Dec 9, 2024

Here is a Dockerfile that will reproduce the issue:

FROM registry.fedoraproject.org/fedora:41

RUN dnf install -y cmake ninja-build git binutils-devel clang

WORKDIR /root/

RUN git clone https://github.com/llvm/llvm-project

WORKDIR /root/llvm-project

RUN git checkout 7954a0514ba7de40dba6c598af830fd1b7a8bf0c

RUN cmake -G Ninja -B build -S llvm/ -DLLVM_ENABLE_PROJECTS=clang -DLLVM_TARGETS_TO_BUILD=Native -DLLVM_BINUTILS_INCDIR=/usr/include -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang

RUN ninja -C build install-clang install-clang-resource-headers install-LLVMgold install-llvm-ar install-llvm-ranlib

WORKDIR /root/

RUN dnf builddep -y adcli

RUN git clone https://gitlab.freedesktop.org/realmd/adcli

WORKDIR /root/adcli

RUN git checkout 0.9.2

ENV CFLAGS='-O2 -flto=thin -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong   -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' \
    CXXFLAGS='-O2 -flto=thin -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong   -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer ' \
    FFLAGS='-O2 -flto=thin -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong   -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' \
    FCFLAGS='-O2 -flto=thin -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS  -fstack-protector-strong   -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules ' \
    LDFLAGS='-Wl,-z,relro -Wl,--as-needed  -Wl,-z,pack-relative-relocs -Wl,-z,now -flto=thin -ffat-lto-objects -Wl,--build-id=sha1  ' \
    CC=clang \
    CXX=clang++

RUN ./autogen.sh  --build=x86_64-redhat-linux --host=x86_64-redhat-linux --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --runstatedir=/run --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --disable-static --disable-silent-rules AR=llvm-ar RANLIB=llvm-ranlib || cat config.log

RUN make -j$(nproc) V=1
RUN make -j$(nproc) V=1 check

@fhahn
Copy link
Contributor

fhahn commented Dec 11, 2024

The issue boils down to a strict-aliasing violation https://clang.godbolt.org/z/6PKqE4b88

We have

static void                                                                     <<<
test_new_free (void)
{
        const char *values[] = { "value", "two", "three", NULL };
        LDAPMod *mod;

        mod = _adcli_ldap_mod_new (LDAP_MOD_ADD, "test", values);
        assert (mod != NULL);

        assert (mod->mod_op == LDAP_MOD_ADD);
        assert_str_eq (mod->mod_type, "test");
        assert_num_eq (seq_count (mod->mod_vals.modv_strvals), 3);
        assert_str_eq (mod->mod_vals.modv_strvals[0], "value");
        assert_str_eq (mod->mod_vals.modv_strvals[1], "two");
        assert_str_eq (mod->mod_vals.modv_strvals[2], "three");
        assert (mod->mod_vals.modv_strvals[3] == NULL);

        _adcli_ldap_mod_free (mod);
}

Here we define values as char ** and then pass it to
_adcli_ldap_mod_new which passes it to seq_count (inlined via ThinLTO),

seq_count then casts the pointer to void ** and tries to read the data as void *.

 int
seq_count (seq_voidp sequence)
{
        void **seq = sequence;
        int count;
        for (count = 0; seq && seq[count]; count++);
        return count;
}

@tstellar
Copy link
Collaborator Author

Is there a flag or sanitizer I can use to catch these, so I don't end up filing more erroneous bugs?

@fhahn
Copy link
Contributor

fhahn commented Dec 11, 2024

There's a work-in-progress type sanitizer (#76261), which is getting close to landing.

This particular case is currently missed by it though unfortunately due to missing instrumentation for the case

fhahn added a commit to fhahn/llvm-project that referenced this issue Jan 8, 2025
While there are no special rules in the standards regarding
void pointers and strict aliasing, emitting distinct tags for void
pointers break some common idioms and there is no good alternative to
re-write the code without strict-aliasing violations. An example is
to count the entries in an array of pointers:

    int count_elements(void * values) {
      void **seq = values;
      int count;
      for (count = 0; seq && seq[count]; count++);
      return count;
    }

https://clang.godbolt.org/z/8dTv51v8W

An example in the wild is from
llvm#119099

This patch avoids emitting distinct tags for void pointers, to avoid
those idioms causing mis-compiles for now.
fhahn added a commit to fhahn/llvm-project that referenced this issue Jan 8, 2025
While there are no special rules in the standards regarding
void pointers and strict aliasing, emitting distinct tags for void
pointers break some common idioms and there is no good alternative to
re-write the code without strict-aliasing violations. An example is
to count the entries in an array of pointers:

    int count_elements(void * values) {
      void **seq = values;
      int count;
      for (count = 0; seq && seq[count]; count++);
      return count;
    }

https://clang.godbolt.org/z/8dTv51v8W

An example in the wild is from
llvm#119099

This patch avoids emitting distinct tags for void pointers, to avoid
those idioms causing mis-compiles for now.
@fhahn
Copy link
Contributor

fhahn commented Jan 8, 2025

I put up #122116 to not emit distinct TBAA tags for void * pointers, to avoid breaking common idioms using void *

@thesamesam
Copy link
Member

thesamesam commented Jan 9, 2025

I worry a bit about that leading to more confusion (propagating myths about void*). Can't we recommend people use the may_alias attribute instead? That's what I've done in previous cases where people have used void* incorrectly, it was too awkward to fix cleanly.

Also, this is AFAICT the only reported instance so far. I would say it should be reported upstream to adcli first and see what they say.

EDIT: That said, see Andrew Pinski's comment about GCC's behaviour at #122116 (comment).

@fhahn
Copy link
Contributor

fhahn commented Jan 9, 2025

I agree that ideally we wouldn't have to add a special carveout for void* and I'd be happy to use what the standard allows to the fullest.

But it might be worth being pragmatic here (and follow GGC's behavior). The PR is not specifically to fix adcli, as per-project this can be worked around easily by disabling pointer TBAA (-fno-pointer-tbaa),but to try reduce the potential pain to our users, at least initially.

@fhahn
Copy link
Contributor

fhahn commented Jan 9, 2025

(also might be good to use the #122116 to discuss the change itself :) )

@thesamesam
Copy link
Member

(yes, quite fair - sorry!)

fhahn added a commit to fhahn/llvm-project that referenced this issue Jan 9, 2025
While there are no special rules in the standards regarding
void pointers and strict aliasing, emitting distinct tags for void
pointers break some common idioms and there is no good alternative to
re-write the code without strict-aliasing violations. An example is
to count the entries in an array of pointers:

    int count_elements(void * values) {
      void **seq = values;
      int count;
      for (count = 0; seq && seq[count]; count++);
      return count;
    }

https://clang.godbolt.org/z/8dTv51v8W

An example in the wild is from
llvm#119099

This patch avoids emitting distinct tags for void pointers, to avoid
those idioms causing mis-compiles for now.
@EugeneZelenko EugeneZelenko added the TBAA Type-Based Alias Analysis / Strict Aliasing label Jan 9, 2025
fhahn added a commit to fhahn/llvm-project that referenced this issue Jan 9, 2025
While there are no special rules in the standards regarding
void pointers and strict aliasing, emitting distinct tags for void
pointers break some common idioms and there is no good alternative to
re-write the code without strict-aliasing violations. An example is
to count the entries in an array of pointers:

    int count_elements(void * values) {
      void **seq = values;
      int count;
      for (count = 0; seq && seq[count]; count++);
      return count;
    }

https://clang.godbolt.org/z/8dTv51v8W

An example in the wild is from
llvm#119099

This patch avoids emitting distinct tags for void pointers, to avoid
those idioms causing mis-compiles for now.
fhahn added a commit to fhahn/llvm-project that referenced this issue Jan 28, 2025
While there are no special rules in the standards regarding
void pointers and strict aliasing, emitting distinct tags for void
pointers break some common idioms and there is no good alternative to
re-write the code without strict-aliasing violations. An example is
to count the entries in an array of pointers:

    int count_elements(void * values) {
      void **seq = values;
      int count;
      for (count = 0; seq && seq[count]; count++);
      return count;
    }

https://clang.godbolt.org/z/8dTv51v8W

An example in the wild is from
llvm#119099

This patch avoids emitting distinct tags for void pointers, to avoid
those idioms causing mis-compiles for now.
fhahn added a commit to fhahn/llvm-project that referenced this issue Jan 28, 2025
While there are no special rules in the standards regarding
void pointers and strict aliasing, emitting distinct tags for void
pointers break some common idioms and there is no good alternative to
re-write the code without strict-aliasing violations. An example is
to count the entries in an array of pointers:

    int count_elements(void * values) {
      void **seq = values;
      int count;
      for (count = 0; seq && seq[count]; count++);
      return count;
    }

https://clang.godbolt.org/z/8dTv51v8W

An example in the wild is from
llvm#119099

This patch avoids emitting distinct tags for void pointers, to avoid
those idioms causing mis-compiles for now.
fhahn added a commit to fhahn/llvm-project that referenced this issue Jan 29, 2025
While there are no special rules in the standards regarding
void pointers and strict aliasing, emitting distinct tags for void
pointers break some common idioms and there is no good alternative to
re-write the code without strict-aliasing violations. An example is
to count the entries in an array of pointers:

    int count_elements(void * values) {
      void **seq = values;
      int count;
      for (count = 0; seq && seq[count]; count++);
      return count;
    }

https://clang.godbolt.org/z/8dTv51v8W

An example in the wild is from
llvm#119099

This patch avoids emitting distinct tags for void pointers, to avoid
those idioms causing mis-compiles for now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
regression TBAA Type-Based Alias Analysis / Strict Aliasing
Projects
None yet
Development

No branches or pull requests

4 participants