Segfault during i686 windows bootstrap #18162

Closed
Keno opened this issue Aug 20, 2016 · 11 comments
Labels
system:windows Affects only Windows upstream The issue is with an upstream dependency, e.g. LLVM

Comments

@Keno
Member

Keno commented Aug 20, 2016

May or may not be related to #18123, but I'm still seeing this with binutils 2.27, so it's not fixed yet by that version:

    JULIA usr/lib/julia/sys.o
    Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
    Exception: EXCEPTION_PRIV_INSTRUCTION at 0x2365da9a -- unknown function (ip: 2365DA9A)
    while loading sysimg.jl, in expression starting on line 6
@Keno Keno added the system:windows Affects only Windows label Aug 20, 2016
@tkelman
Contributor

tkelman commented Aug 20, 2016

Doesn't happen with gcc 4.9; does happen with 5.4.0 and 6.1.0.

@Keno
Member Author

Keno commented Aug 21, 2016

Looks like a miscompile of some sort to me.

Stopped in function Most likely __ZN12_GLOBAL__N_113SLPVectorizer14tryToVectorizeEPN4llvm14BinaryOperatorERNS_7BoUpSLPE
=> 0x0000000001f32ad6<+0>:  retl    $12
   0x0000000001f32ad9<+3>:  nop
   0x0000000001f32ada<+4>:  pushl   %ebp
1|debug > reg esp
0x0000000000c5e820

julia> icxx"$(sess)->add_watchpoint($(current_task(sess)), $(RemotePtr{Void}(0x0000000000c5e820)), 4, rr::WATCH_READWRITE);"

1|debug > rc
Stopped in function Most likely __ZN4llvm19SmallPtrSetImplBaseC2EPPKvRKS0_
=> 0x00000000017d2688<+0>:  popl    %esi
   0x00000000017d2689<+1>:  popl    %edi
   0x00000000017d268a<+2>:  popl    %ebp
1|debug >

@Keno
Member Author

Keno commented Aug 22, 2016

Investigated and filed upstream as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77333.

@tkelman tkelman added the upstream The issue is with an upstream dependency, e.g. LLVM label Oct 28, 2016
@vchuravy
Member

vchuravy commented Dec 5, 2016

Assuming this is reproducible by cross-compiling from Linux, it seems to be fixed with gcc 6.2.

    $ cat Make.user
    LLVM_VER=3.9.0
    LLVM_ASSERTIONS=1
    XC_HOST = i686-w64-mingw32
    $ i686-w64-mingw32-gcc --version
    i686-w64-mingw32-gcc (GCC) 6.2.1 20160830
    Copyright (C) 2016 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@tkelman
Contributor

tkelman commented Dec 5, 2016

I certainly saw this in a cross compile with 6.2. I didn't try LLVM 3.9, so it's possibly not triggering the gcc bug in the same way that 3.7 was.

@tkelman
Contributor

tkelman commented Dec 6, 2016

I can confirm that this doesn't happen when using LLVM 3.9 with i686-w64-mingw32-gcc 5.4 from cygwin, but it does happen with LLVM 3.7 and the same compiler version. So something that changed in LLVM is no longer triggering the GCC bug, but that doesn't inspire a lot of confidence that it won't come back at some point. And we need to keep the buildbots capable of building release-0.5 which won't be upgrading LLVM, so we need to have a compiler version installed there that can successfully build with LLVM 3.7 for some time.

@tkelman
Contributor

tkelman commented Jan 2, 2017

Apparently the commit that caused this to stop happening on the llvm side was llvm-mirror/llvm@225dd82d634ca277 - which looks pretty small and not hugely consequential?

@Keno
Member Author

Keno commented Jan 2, 2017

Yeah, that gcc bug should still be fixed.

@tkelman
Contributor

tkelman commented Jan 2, 2017

Any ideas how we get a gcc dev to triage it? We could attempt to run creduce on everything that goes into opt.exe maybe to get a smaller test case, but that's a lot of source.

@Keno
Member Author

Keno commented Jan 2, 2017

Maybe ping the developer whose commit caused the regression?

@tkelman
Contributor

tkelman commented Jul 18, 2017

Fixed upstream in gcc 7.1, 6.4, and 5.5 (if there will ever be a 5.5). The cygwin cross compilers are at 5.4 and carry a patch that isn't quite what got applied in the end, but it seemed to fix the issue (who knows whether it has any other side effects, though).
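
For anyone checking a build machine's toolchain against the versions mentioned in this thread, here is a rough sketch in Python. It is not part of the Julia build system; the names `FIRST_FIXED`, `parse_version`, and `classify` are made up for illustration. It assumes that everything before gcc 5 predates the regression (only 4.9 was actually reported as working) and that series after 7 inherit the fix.

```python
import re

# First fixed release per GCC series, as reported in this thread.
# (A 5.5 release was still hypothetical when this was written.)
FIRST_FIXED = {5: (5, 5), 6: (6, 4), 7: (7, 1)}


def parse_version(text):
    """Return (major, minor) from a plain version string or a `--version` banner line."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def classify(text):
    """Classify a gcc version as 'predates-regression', 'affected', or 'fixed'."""
    major, minor = parse_version(text)
    if major < 5:
        # Assumption: extrapolated from the single report that 4.9 builds fine.
        return "predates-regression"
    if major > 7:
        # Assumption: release series after 7 include the upstream fix.
        return "fixed"
    return "fixed" if (major, minor) >= FIRST_FIXED[major] else "affected"


if __name__ == "__main__":
    for version in ("4.9.2", "5.4.0", "6.1.0", "6.2.1", "6.4.0", "7.1.0"):
        print(version, classify(version))
```

The version number alone is not conclusive: a distribution can carry a backported patch (as the cygwin 5.4 cross compilers do), and an "affected" compiler only miscompiles when the input happens to trigger the bug, which LLVM 3.7 did and LLVM 3.9 did not.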

@tkelman tkelman closed this as completed Jul 18, 2017