Linux: libclang does not discover Clang-specific system headers #201

Closed
PathogenDavid opened this issue Jul 6, 2021 · 4 comments
Labels
Area-Translation Issues concerning the translation from libclang into Biohazrd Platform-Linux Issues specific to Linux

Comments

@PathogenDavid
Member

Unlike on Windows, libclang on Linux does not automagically find certain system include paths the way clang does when you invoke it from the terminal.

E.g. #include <stddef.h> fails because stddef.h can't be found. According to clang -E, stddef.h is provided by Clang itself on my machine via /usr/lib/llvm-10/lib/clang/10.0.0/include/stddef.h (which matches this file in llvm-project.)

It's not all headers, since <string> is found just fine.

It seems that on Linux, Clang finds these headers relative to where it's installed. I think Clang might have these Clang-specific headers embedded on Windows? Need to look into whether that's an option that can be enabled when we build libclang or something.

For now I'm just gonna manually add builder.AddCommandLineArguments("-isystem/usr/lib/llvm-10/lib/clang/10.0.0/include/");

@PathogenDavid
Member Author

Here's the cc1 command as reported by clang -### IncludeTest.cpp for this file:

#include <stddef.h>
#include <stdio.h>

int main()
{
    size_t lol = 100;
    printf("Hello, world!");
}
Resulting commands (one argument per line):
/usr/lib/llvm-10/bin/clang
-cc1
-triple x86_64-pc-linux-gnu
-emit-obj
-mrelax-all
-disable-free
-disable-llvm-verifier
-discard-value-names
-main-file-name IncludeTest.cpp
-mrelocation-model static
-mthread-model posix
-mframe-pointer=all
-fmath-errno
-fno-rounding-math
-masm-verbose
-mconstructor-aliases
-munwind-tables
-target-cpu x86-64
-dwarf-column-info
-fno-split-dwarf-inlining
-debugger-tuning=gdb
-resource-dir /usr/lib/llvm-10/lib/clang/10.0.0
-internal-isystem /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9
-internal-isystem /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9
-internal-isystem /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9
-internal-isystem /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward
-internal-isystem /usr/local/include
-internal-isystem /usr/lib/llvm-10/lib/clang/10.0.0/include
-internal-externc-isystem /usr/include/x86_64-linux-gnu
-internal-externc-isystem /include
-internal-externc-isystem /usr/include
-fdeprecated-macro
-fdebug-compilation-dir /home/pathogendavid/Playground
-ferror-limit 19
-fmessage-length 0
-fgnuc-version=4.2.1
-fobjc-runtime=gcc
-fcxx-exceptions
-fexceptions
-fdiagnostics-show-option
-fcolor-diagnostics
-faddrsig
-o /tmp/IncludeTest-277653.o
-x c++
IncludeTest.cpp
/usr/bin/ld
-z relro
--hash-style=gnu
--build-id
--eh-frame-hdr
-m elf_x86_64
-dynamic-linker /lib64/ld-linux-x86-64.so.2
-o a.out
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crt1.o
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/crtbegin.o
-L/usr/bin/../lib/gcc/x86_64-linux-gnu/9
-L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu
-L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../lib64
-L/lib/x86_64-linux-gnu
-L/lib/../lib64
-L/usr/lib/x86_64-linux-gnu
-L/usr/lib/../lib64
-L/usr/lib/x86_64-linux-gnu/../../lib64
-L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../..
-L/usr/lib/llvm-10/bin/../lib
-L/lib
-L/usr/lib
/tmp/IncludeTest-277653.o
-lgcc
--as-needed
-lgcc_s
--no-as-needed
-lc
-lgcc
--as-needed
-lgcc_s
--no-as-needed
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/crtend.o
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crtn.o

Unfortunately -### does not work with libclang, probably because it suppresses the actual compilation. (It might be printing the commands and the output is getting swallowed by the test driver; I did not check.)

@PathogenDavid PathogenDavid changed the title libclang does not discover Clang-specific system headers on Linux Linux: libclang does not discover Clang-specific system headers Jul 6, 2021
@PathogenDavid
Member Author

PathogenDavid commented Jul 6, 2021

This is a helpful hint and partially confirms what I was suspecting: https://lists.llvm.org/pipermail/cfe-dev/2010-December/012751.html

You need to set HeaderSearchOptions::ResourceDir so that Clang can find its own header files including stddef.h.

I noticed the -resource-dir cc1 option and thought it might be related. (Unfortunately, like most cc1 options, there's no official documentation for it as far as I know.) I tried specifying it with -Xclang and it either does nothing or results in CXError_ASTReadError.

Also relevant: https://clang.llvm.org/docs/LibTooling.html#builtin-includes

Clang tools need their builtin headers and search for them the same way Clang does. Thus, the default location to look for builtin headers is in a path $(dirname /path/to/tool)/../lib/clang/3.3/include relative to the tool binary. This works out-of-the-box for tools running from llvm’s toplevel binary directory after building clang-resource-headers, or if the tool is running from the binary directory of a clang install next to the clang binary.

Tips: if your tool fails to find stddef.h or similar headers, call the tool with -v and look at the search paths it looks through.
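
To make the quoted lookup rule concrete, here is a minimal sketch of the path computation it describes. DefaultBuiltinIncludeDir is a hypothetical name, and the version string is passed in as a parameter for illustration (the quoted docs use 3.3). This is not Clang's actual implementation, only the same $(dirname tool)/../lib/clang/<version>/include arithmetic.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper mirroring the documented default:
//   $(dirname /path/to/tool)/../lib/clang/<version>/include
std::filesystem::path DefaultBuiltinIncludeDir(const std::filesystem::path& toolBinary,
                                               const std::string& clangVersion)
{
    assert(!clangVersion.empty());
    std::filesystem::path dir =
        toolBinary.parent_path() / ".." / "lib" / "clang" / clangVersion / "include";
    // Purely lexical cleanup of the ".." segment; does not touch the disk or resolve symlinks.
    return dir.lexically_normal();
}
```

Fed the binary path from the cc1 dump above (/usr/lib/llvm-10/bin/clang) and version 10.0.0, this yields /usr/lib/llvm-10/lib/clang/10.0.0/include, the same directory that shows up as an -internal-isystem entry.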

The real question is why this isn't an issue with libclang on Windows, because I'm pretty sure it does the same thing. (Pretty sure it was the root cause of #98.)

It does appear to use the resource directory:

https://github.com/InfectedLibraries/llvm-project/blob/6d5c430eb3c0bd49f6f5bda4b0d2d8aa79b0fa3f/clang/lib/Driver/ToolChains/MSVC.cpp#L1233-L1236

This implies there's no funny business going on but then I wonder how it works at all:

https://github.com/InfectedLibraries/llvm-project/blob/d611e27a0f17a5775907fa048a3680fa29e22022/clang/lib/Driver/Driver.cpp#L108-L112

That function is used here in libclang:

https://github.com/InfectedLibraries/llvm-project/blob/d611e27a0f17a5775907fa048a3680fa29e22022/clang/tools/libclang/CIndexer.cpp#L137

The Windows logic is basically to find the location of libclang.dll (and it isn't aware of modern long paths beyond MAX_PATH >:[ )

This ends up being used in clang_parseTranslationUnit_Impl and clang_indexSourceFile_Impl. Biohazrd basically uses clang_parseTranslationUnit2, which calls clang_parseTranslationUnit2FullArgv, which calls the former. Definitely need to get back on Windows to try and figure out what is actually happening here.

I suspect the typical expectation is that Linux apps consume a distro-provided shared library while Windows apps ship their own, so it may be that libclang on Windows defaults to some embedding magic I haven't found.

PathogenDavid added a commit that referenced this issue Jul 9, 2021
Workaround for MochiLibraries/ClangSharp.Pathogen#1 uses a pre-built libclang-pathogen.so
Workaround for #201 hard-codes a -isystem option to Clang resource directory for tests.
@PathogenDavid
Member Author

It looks like the include portion of the Clang resource directory is ending up in our build products under ClangSharp.Pathogen/build-linux/lib/clang/10.0.0/.

Comparing it to the resource directory on Lovebuntu, it is missing two things:

  • The lib folder (as the name implies, contains a bunch of static and dynamic libraries for supporting Clang runtime stuff.)
  • The share folder (contains some filter lists for the address sanitizer.)

In theory neither of those is important for Biohazrd's operation, so we can probably do without them.
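
The comparison above was done by hand. A minimal sketch of automating it, assuming two resource directories on disk: list the top-level entries that exist in a reference directory but not in the one being checked. MissingTopLevelEntries is a hypothetical helper, not something in ClangSharp.Pathogen.

```cpp
#include <filesystem>
#include <set>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: names of top-level entries present in `reference`
// (e.g. a distro's resource directory) but absent from `candidate`
// (e.g. the one produced by our build).
std::set<std::string> MissingTopLevelEntries(const fs::path& reference, const fs::path& candidate)
{
    std::set<std::string> missing;
    for (const fs::directory_entry& entry : fs::directory_iterator(reference))
    {
        fs::path name = entry.path().filename();
        if (!fs::exists(candidate / name))
            missing.insert(name.string());
    }
    return missing;
}
```

With a reference directory containing include, lib and share, and a candidate containing only include, the result is the set holding lib and share.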

The folder totals 7.7 MB, which is small enough that we could feasibly embed it in the ClangSharp.Runtime NuGet package.

If we discover we need those libraries for some reason, that balloons to 45 MB, which is quite a bit less palatable.

@PathogenDavid
Member Author

PathogenDavid commented Aug 26, 2021

Turns out on Windows it's just getting it from the UCRT. I'll just package the resource directory in the runtime NuGet package for use on Linux.
