Weird crash under Linux when trying to deallocate SSLContext (and also linking with BoringSSL) #16

Closed
MrMage opened this issue May 28, 2018 · 7 comments

Comments

@MrMage
Contributor

MrMage commented May 28, 2018

I'm using SwiftNIO and gRPC in one binary, which means that I need to link with both openssl (for SwiftNIO) and BoringSSL (for gRPC). (I know, linking with two variants of openssl at the same time is a terrible idea, but I can't avoid that right now :-/)

Unfortunately, since 8e0aa76, I am encountering crashes under Linux when any SSLContext instance is deallocated (see https://github.com/MrMage/openssl-crash-test/blob/master/Sources/openssl-crash-test/main.swift for example code). I have also confirmed that the crash goes away when I remove the SSL_CTX_ctrl(ctx, SSL_CTRL_MODE, SSL_MODE_RELEASE_BUFFERS | SSL_MODE_AUTO_RETRY, nil) line, as in master...MrMage:disable-release-buffers, and my service indeed used to work fine before that change. Setting only one of SSL_MODE_RELEASE_BUFFERS or SSL_MODE_AUTO_RETRY still crashes.

Unfortunately, I was not yet able to extract a usable stack trace — I just get a "Segmentation Fault" error. Weirdly enough, all other SSL_CTX_ctrl calls cause no crashes later on.

I am aware that this is a very fringe case, but any ideas you might have would be appreciated.

FYI, I have prepared a sample repository at https://github.com/MrMage/openssl-crash-test — simply clone it and run docker build --no-cache . (requires Docker installed of course). This will compile the sample project and run it right away, at which point you can reproduce the crash.

@normanmaurer
Member

Looking...

@normanmaurer
Member

I think I know what's wrong... stay tuned

@Lukasa
Contributor

Lukasa commented May 29, 2018

I'll chase this this morning. I suspect this is fundamentally related to having two copies of libssl linked into the program, but I should try to validate that that's the case.

@normanmaurer
Member

@Lukasa thanks... as discussed offline my idea did turn out to be wrong :(

@MrMage
Contributor Author

MrMage commented May 29, 2018

> I'll chase this this morning. I suspect this is fundamentally related to having two copies of libssl linked into the program, but I should try to validate that that's the case.

That's my hunch as well. Thank you both for looking into this, especially given that linking two copies of libssl is a fairly unusual situation!

P.S.: @normanmaurer, what was your idea? :-)

@Lukasa
Contributor

Lukasa commented May 29, 2018

Yeah, so my hunch is looking likely. Here's the backtrace:

* thread #1, name = 'openssl-crash-t', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
  * frame #0: 0x00005555556f0e14 openssl-crash-test`ssl_cert_clear_certs(cert=0x0000555555e10a74) at ssl_cert.c:232
    frame #1: 0x00005555556f0da2 openssl-crash-test`ssl_cert_free(c=0x0000555555e10a74) at ssl_cert.c:248
    frame #2: 0x00005555556f8a54 openssl-crash-test`SSL_CTX_free(ctx=0x0000555555e10ff0) at ssl_lib.c:356
    frame #3: 0x0000555555990a04 openssl-crash-test`SSLContext.deinit(self=<unavailable>) at SSLContext.swift:254
    frame #4: 0x0000555555990ae1 openssl-crash-test`SSLContext.__deallocating_deinit(self=<unavailable>) at SSLContext.swift:0
    frame #5: 0x00007ffff7682aab libswiftCore.so`_swift_release_dealloc + 11
    frame #6: 0x00005555559d0f27 openssl-crash-test`main at main.swift:0
    frame #7: 0x00007ffff53a6f45 libc.so.6`__libc_start_main + 245
    frame #8: 0x00005555555feeb9 openssl-crash-test`_start + 41

In particular, note that ssl_cert_free is pretty clearly in the binary (its address is very close to SSLContext.deinit).

You can see where this problem comes from when I set some breakpoints. In particular, if I run br set -n SSL_CTX_ctrl and br set -n SSL_CTX_new, I end up with the following two breakpoints:

2: name = 'SSL_CTX_ctrl', locations = 1, resolved = 1, hit count = 0
  2.1: where = libssl.so.1.0.0`SSL_CTX_ctrl, address = 0x00007ffff6380fe0, resolved, hit count = 0

3: name = 'SSL_CTX_new', locations = 2, resolved = 2, hit count = 0
  3.1: where = openssl-crash-test`SSL_CTX_new + 12 at ssl_lib.c:231, address = 0x00005555556f839c, resolved, hit count = 0
  3.2: where = libssl.so.1.0.0`SSL_CTX_new, address = 0x00007ffff63819d0, resolved, hit count = 0

Note that SSL_CTX_ctrl is present only in libssl, while SSL_CTX_new is present in both the binary itself and in libssl. Stepping through the program reveals that we are in fact using BoringSSL's SSL_CTX structure, constructed and returned by BoringSSL's SSL_CTX_new, but OpenSSL's SSL_CTX_ctrl function to mutate that structure. It's not really a surprise that this doesn't work out so well: there's no reason to assume that the two libraries agree on how an SSL_CTX is laid out, so SSL_CTX_ctrl is silently corrupting the data structure.
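To make the corruption mechanism concrete, here is a minimal C sketch. It does not use the real SSL_CTX definitions from either library: both struct layouts, every field name, and the mode value are hypothetical, chosen only to show what happens when a setter compiled against one layout writes into an object allocated with another.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout used by the library that allocates the context
 * (stands in for the copy compiled into the binary). */
struct ctx_as_allocated {
    uintptr_t cert;   /* stands in for a pointer the destructor follows */
    uint32_t  mode;
};

/* Hypothetical layout assumed by the library whose setter gets called
 * (stands in for the shared-library copy). */
struct ctx_as_assumed {
    uint32_t  mode;   /* same field, different offset */
    uintptr_t cert;
};

/* A setter compiled against the "assumed" layout: it writes the mode bits
 * at the offset that layout dictates, whatever the object really is. */
static void set_mode_with_assumed_layout(void *ctx, uint32_t mode)
{
    memcpy((unsigned char *)ctx + offsetof(struct ctx_as_assumed, mode),
           &mode, sizeof mode);
}

/* Returns 1 if the mismatched setter overwrote the cert field and left
 * the real mode field untouched. */
int mismatched_setter_corrupts_cert(void)
{
    struct ctx_as_allocated ctx;
    ctx.cert = UINTPTR_MAX;   /* every byte is 0xFF, so any overwrite shows */
    ctx.mode = 0;

    set_mode_with_assumed_layout(&ctx, 0x14u);   /* arbitrary mode bits */

    return ctx.cert != UINTPTR_MAX && ctx.mode == 0;
}
```

In this sketch the write lands on the cert field, so a later cleanup routine that follows it would dereference garbage. That is the same general shape as the fault in ssl_cert_clear_certs above, although the actual fields involved in the real libraries may differ.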

Unfortunately there's an extremely limited amount we can do here because of the way BoringSSL is being used by grpc-swift. As BoringSSL is compiled directly into the binary with visible symbols, the Linux linker will always attempt to resolve our search for those symbols within the binary. However, BoringSSL's headers are not available to us (without declaring a dependency on grpc-swift, which we don't want to do). That means we have no option but to compile against the system's OpenSSL headers, which are highly unlikely to match what BoringSSL wants (in fact they're certain to not match).

As a further concern, I should note that libFoundation.so on Linux also links the system OpenSSL. That makes using BoringSSL at all in a Swift program on Linux a bit fraught. The example of how it becomes fraught is exactly the situation we have here: you end up with a headers/symbol mismatch in your program, where parts of the program compile against headers that do not match the implementation they are actually linked to.

This is a problem we have encountered before. The original design for this module called for using BoringSSL directly, and we shelved that plan temporarily after discovering that libFoundation.so already links the system OpenSSL. (Incidentally, this is also why we don't support OpenSSL 1.1 at this time: all the Swift.org builds of libFoundation.so link OpenSSL 1.0.) At this time, with the SwiftPM ecosystem as it is, I do not believe it is possible to use both BoringSSL and Swift on Linux in the general case.

I'm actively investigating other options for supporting this model. One option is potentially to enhance SwiftPM to let us mix Swift and C in the same module, and then compile the C code with -fvisibility=hidden. That will cause BoringSSL not to export its symbols from the binary, allowing us to entirely hide it. However currently SwiftPM supports none of this, and I need to further investigate to see if it even works as I want it to.
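As a rough illustration of what hidden visibility means at the source level, the sketch below uses the GCC/Clang `visibility` attribute on individual functions. Compiling with -fvisibility=hidden makes "hidden" the default for every symbol in the translation unit, and the attribute then opts individual symbols back in. The macro and function names here are hypothetical and are not taken from BoringSSL or from this module.

```c
/* Hypothetical macro names; the attribute itself is a GCC/Clang extension. */
#if defined(__GNUC__)
#  define TLS_INTERNAL __attribute__((visibility("hidden")))
#  define TLS_EXPORTED __attribute__((visibility("default")))
#else
#  define TLS_INTERNAL
#  define TLS_EXPORTED
#endif

/* Stands in for a vendored TLS entry point. Hidden symbols are not placed
 * in the dynamic symbol table, so other loaded objects cannot bind to it. */
TLS_INTERNAL int vendored_ctx_new(void)
{
    return 42;
}

/* The one function the wrapper chooses to export. Code inside the same
 * linked object can still call the hidden function directly. */
TLS_EXPORTED int wrapper_make_context(void)
{
    return vendored_ctx_new();
}
```

Whether this is enough to keep a vendored copy from colliding with the system libssl in a SwiftPM build is exactly the open question raised above. The sketch only shows the mechanism.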

Another option is to try to excise the OpenSSL dependency from Foundation on Linux. That would allow the community to standardise on BoringSSL as a default TLS implementation and all use that at the root of our dependency tree. This is a very tricky thing to arrange, as Foundation includes URLSession, and I can see no obvious route out of needing some TLS library there to provide the necessary functionality. I do not have high hopes on this particular option.

In the short term, my TL;DR is that BoringSSL is entirely incompatible with Swift on Linux. I recommend asking whether the grpc-swift community is willing to investigate linking against the system libssl on Linux, which should resolve the multiple-dependency issue. However, I don't believe this can be fixed on our end.

@Lukasa Lukasa closed this as completed May 29, 2018
MrMage added a commit to Timing-GmbH/grpc-swift that referenced this issue May 29, 2018
Under Linux, `libFoundation.so` already links with the system OpenSSL; so we need to use that one.
Using two `libssl`-style libraries in the same binary can cause all sorts of mayhem; see apple/swift-nio-ssl#16 (comment) for details.
@MrMage
Contributor Author

MrMage commented May 29, 2018

@Lukasa thank you for your work investigating this! I have filed grpc/grpc-swift#238 that migrates SwiftGRPC to openssl on Linux. That was much easier than expected — I had expected GRPC to be more strongly tied to BoringSSL specifically.
