Segfault on Ruby 3.3 with RUBY_MN_THREADS=1
#33
Comments
Could you provide the full backtrace? There is a known incompatibility with MaNy: Vernier and other GVL instrumentation consumers have no way to know what the Ruby thread is. ruby/ruby#8885 solves that but isn't merged yet. |
Here's the full backtrace:
|
@jpcamara Thanks for the report. I haven't been able to reproduce (on ruby/ruby@e8ab3f7). What version of Ruby 3.3 are you using?
|
@jhawthorn dang, thanks for giving it a try! Unfortunate it wasn't an easy repro for you! I was able to repro again using the latest commit as of this message (ruby/ruby@fabf5be). I brought in your example as well from https://github.com/jhawthorn/vernier/blob/main/examples/threaded_http_requests.rb and hit the same error. Here's some context on my setup:
Here's the backtrace from the
Thread.new do # <= test.rb:23
threads.each(&:join)
received.close
end |
Thanks! I've been able to reproduce and think I'm approaching a solution.
The segfault occurs because we're trying to sample an M-N native thread which isn't running any code, and at that time Ruby sets EC to NULL, which is the same issue stackprof is seeing in tmm1/stackprof#221. Vernier should be able to deal with this, and we shouldn't be sampling a thread which is suspended.
However, I think the new M-N thread implementation isn't feeding us the same GVL instrumentation events that traditional threads do, which causes us to mistake a suspended thread for a running one and attempt to sample it from a native thread which actually has no thread running.
Traditional thread: correctly recorded as spending most of its time suspended.
M-N thread: incorrectly recorded as "running" most of the time (all threads have this behaviour in this profile). This was recorded by adding a failsafe upstream to just record nothing if we try to sample a thread with no EC. |
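To make the failure mode above concrete, here is a minimal Ruby sketch of the idea: track each thread's last instrumentation event, sample only threads last seen as resumed, and record nothing when there is no execution context. Everything here (`ThreadStateTracker`, `sample`, the hash standing in for an EC) is hypothetical and for illustration only; Vernier's real sampler is native code and is not shown in this thread.

```ruby
# Hypothetical sketch: none of these names come from Vernier's source.
class ThreadStateTracker
  def initialize
    @last_event = {} # thread id => most recent event symbol
  end

  # Record the latest instrumentation event seen for a thread.
  def record(thread_id, event)
    @last_event[thread_id] = event
  end

  # A thread is treated as running only if its latest event was :resumed.
  def running?(thread_id)
    @last_event[thread_id] == :resumed
  end
end

# Returns a backtrace for a running thread, or nil when the thread is not
# running or has no execution context (the failsafe described above).
def sample(tracker, thread_id, execution_context)
  return nil unless tracker.running?(thread_id)
  return nil if execution_context.nil? # failsafe: no EC, record nothing

  execution_context.fetch(:backtrace)
end
```

If events are missing or arrive in the wrong order, `running?` can report true for a thread that is actually suspended, which is why the nil check is still needed as a second line of defence.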
So the GVL instrumentation API is broken with M-N threads? |
I've seen this as well, yea. It still fires all the correct events, but in the wrong order. I was just about to open an issue around it - started always occurs after ready and resumed, and exited fires before suspended. There are probably others but those are the simplest to reproduce. It also happens even when not running MN threads - the order was broken by refactoring the overall pthread code. |
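As a rough illustration of the ordering problem described above, the following Ruby sketch checks a recorded per-thread event sequence against an assumed lifecycle (started, ready, resumed, suspended, then either ready again or exited). The transition table and the `first_order_violation` helper are assumptions made for this example; this is not the spec from the draft PR mentioned below, and the real events are delivered through Ruby's C-level thread event hooks.

```ruby
# Assumed lifecycle for one thread's GVL instrumentation events.
# This table is illustrative and is not taken from Ruby's source or specs.
ALLOWED_NEXT = {
  nil        => [:started],
  :started   => [:ready],
  :ready     => [:resumed],
  :resumed   => [:suspended],
  :suspended => [:ready, :exited],
  :exited    => []
}.freeze

# Returns the index of the first out-of-order event, or nil if the whole
# sequence is consistent with the assumed lifecycle.
def first_order_violation(events)
  previous = nil
  events.each_with_index do |event, index|
    return index unless ALLOWED_NEXT.fetch(previous, []).include?(event)

    previous = event
  end
  nil
end

# The two misorderings reported above, as hypothetical recorded sequences:
started_late  = [:ready, :resumed, :started, :suspended, :exited]
exited_early  = [:started, :ready, :resumed, :exited, :suspended]
```

With these inputs, `first_order_violation(started_late)` flags the very first event, and `first_order_violation(exited_early)` flags the `:exited` that arrives before `:suspended`.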
Aaron mentioned something like that. Happy to dig into it and try to fix it if you have some repros. |
Sounds great - I'll provide some today |
@casperisfine I'll open a Ruby issue as well, but here's a draft PR with a spec I'd been working on to verify the order: ruby/ruby#9019 |
Nice! I'll try to make it pass tomorrow (Well except MaNy cause I don't know much about it). Thank you very much. |
ruby/ruby#9029 should fix this (not sure about M:N though). |
Fixed by ruby/ruby#9073 and I added a CI build in #49 |
Probably not a huge surprise, but vernier segfaults if run with MaNy threads turned on. I've compiled Ruby source locally and the following test.rb can trigger it:
Without RUBY_MN_THREADS=1, it runs fine. With it, I just get a generic Segmentation fault. Run with gdb-ruby I get:
Happy to provide more info and steps if there's interest. But I can also understand if there's no interest in supporting such a nascent feature.