Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TySan doesn't catch strict-aliasing violation when using struct initializers #119615

Open
fhahn opened this issue Dec 11, 2024 · 5 comments
Open
Assignees
Labels
compiler-rt:tysan Type sanitizer

Comments

@fhahn
Copy link
Contributor

fhahn commented Dec 11, 2024

The initial version of TySan (#76261) doesn't properly report the first strict aliasing violation in the test case below

#include <stdio.h>

__attribute__((noinline))
void print(char**v) {
    printf("%s\n", v[0]);
}

int main() {
  char *values[2] = { "value", 0};
  void **curr = ((void *) values);
  int count;
  for (count = 0; curr && curr[count]; count++);

  values[0] = "test";
  print(values);
  return count;
}

Produces the output below with TySan. It looks like it is missing setting the initial type of values and the first reported violation is values[0] = "test";, while it should already report the read of void * for curr[count] from memory with char *.

> bin/clang -fsanitize=type test.c -O1  -g -isysroot $MACOS_SYSROOT
>  ./a.out
==81689==ERROR: TypeSanitizer: type-aliasing-violation on address 0x00016b16b1c0 (pc 0x000104c97dac bp 0x00016b16b1b0 sp 0x00016b16a940 tid 18058865)
WRITE of size 8 at 0x00016b16b1c0 with type p1 omnipotent char accesses an existing object of type p1 void
    #0 0x000104c97da8 in main test.c:14

==81689==ERROR: TypeSanitizer: type-aliasing-violation on address 0x00016b16b1c0 (pc 0x000104c97a68 bp 0x00016b16b180 sp 0x00016b16a910 tid 18058865)
READ of size 8 at 0x00016b16b1c0 with type p1 omnipotent char accesses an existing object of type p1 void
    #0 0x000104c97a64 in print test.c:5

test
@fhahn fhahn added the compiler-rt:tysan Type sanitizer label Dec 11, 2024
@fhahn
Copy link
Contributor Author

fhahn commented Dec 11, 2024

I think this is the underlying issue of why TySan doesn't detect the strict-aliasing violation in #119099

@gbMattN
Copy link
Contributor

gbMattN commented Jan 8, 2025

Hi @fhahn, I could look into fixing this if you aren't already?

@fhahn
Copy link
Contributor Author

fhahn commented Jan 8, 2025

@gbMattN that would be great thanks. I've not looked into it yet, but IIRC it looks like Clang doesn't emit TBAA metadata for the initializer

@gbMattN
Copy link
Contributor

gbMattN commented Jan 13, 2025

The following values exist in the IR pre-tysan pass

@.str.1 = private unnamed_addr constant [6 x i8] c"value\00", align 1
@__const.main.values = private unnamed_addr constant [2 x ptr] [ptr @.str.1, ptr null], align 16

char *values[2] = { "value", 0}; is internally implimented as a memcpy from @__const.main.values to values. We instrument this with __tysan_instrument_mem_inst, which copies the shadow memory of the source to the shadow memory of the target. But the shadow memory for @__const.main.values or @.str.a was never set.

Despite being global values, as they are not global variables, the instrumentGlobals function doesn't find them. You are also correct about them not having TBAA metadata, which will make this a bit tricky! Their IR definitions do include type information though.
I'll look more into a way to get this to work. Hopefully the solution will help with future TBAA related issues TySan has.

@gbMattN
Copy link
Contributor

gbMattN commented Jan 15, 2025

The approach I've gone with is to record a map of llvm::Type* to GlobalVariable*, being the type of the instruction and the TD generated from its TBAA metadata respectively. Then, for these global values with no TBAA metadata, you can assign it a TD by looking at this mapping.

The intuition behind this idea is that if this global value is every used correctly, that usage would have the correct TBAA, and thus create the correct TD. Therefore for any TBAA-data-less global value you use, the TD would exist. I have a branch on my fork where I implement this and it acts correctly in the test case above. This approach could be extended for any other issues that arise due to TBAA limitations.

@fhahn, The code has some hacks in it currently which I would want to refactor before raising it as a pull request, but I'd like to check if this approach sounds good to you first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler-rt:tysan Type sanitizer
Projects
None yet
Development

No branches or pull requests

2 participants