Invalid memory access when running multiple Ruby script calls at once #27

Open
chendo opened this issue Mar 4, 2023 · 4 comments
Labels
bug Something isn't working

Comments

chendo commented Mar 4, 2023

Under high load, a simple program running on my M1X MacBook Pro occasionally crashes with an Invalid memory access (signal 11) error.

Reproduction repo: https://github.com/chendo/crystal-anyolite-crash-repro

Tested with Crystal 1.7.2, anyolite main (efe3337).

Interestingly enough, I could not reproduce the issue inside an amd64 Docker container (Rosetta).

Full crash log:

Invalid memory access (signal 11) at address 0x0
[0x1042f4d80] *Exception::CallStack::print_backtrace:Nil +104 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x1042a93c8] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil)@/opt/homebrew/Cellar/crystal/1.7.2/share/crystal/src/signal.cr:127 +320 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x1a8f2c2a4] _sigtramp +56 in /usr/lib/system/libsystem_platform.dylib
[0x104481068] mrb_vm_exec +10496 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x10447e608] mrb_vm_run +148 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x1044d5ea4] mrb_load_exec +880 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x10446b3e4] execute_script_line +80 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x104413658] *Anyolite::RbInterpreter#execute_script_line<String>:struct.Anyolite::RbCore::RbValue +124 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x1042dfa10] ~procProc(HTTP::Server::Context, Nil)@src/bouncer.cr:20 +60 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x10442bba4] *HTTP::Server::RequestProcessor#process<IO+, IO+>:Nil +880 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x10442a700] *HTTP::Server#handle_client<IO+>:Nil +1756 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x1042e0364] ~procProc(Nil)@/opt/homebrew/Cellar/crystal/1.7.2/share/crystal/src/http/server.cr:468 +32 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x104354d3c] *Fiber#run:(IO::FileDescriptor | Nil) +84 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp
[0x1042a8ec4] ~proc2Proc(Fiber, (IO::FileDescriptor | Nil))@/opt/homebrew/Cellar/crystal/1.7.2/share/crystal/src/fiber.cr:98 +12 in /Users/chendo/.cache/crystal/crystal-run-bouncer.tmp

Hadeweka commented Mar 4, 2023

The program from the linked repository runs perfectly fine on WSL on my Surface Pro X (which is also ARM64), as well as on my AMD64 machine under both Windows (using hey) and WSL.

How often do you encounter this error?

Maybe there is some problem with mruby or Crystal on Mac that triggers this, but I don't really have any way to test it.
Based on the error message, the crash seems to happen somewhere in the execution of the Ruby program, but the backtrace doesn't go any further than that.

Perhaps running the Crystal program with valgrind could provide a bit more insight on where exactly the invalid memory access happens.

chendo commented Mar 5, 2023

I was getting it every second or third run, although I did just have one session where it didn't crash until the fifth run. To me, it reads as if the crash itself is printing the backtrace, although I could be wrong there.

If I enable preview_mt with crystal run -D preview_mt repro.cr and then run wrk, I sometimes get the same crash almost immediately, although I also often see other types of crashes, as Anyolite doesn't currently support multithreading.

Unfortunately, it seems valgrind is Linux-only, at least according to the Homebrew cask. I only have arm64 on my Mac at the moment. I'll try to come up with a more reliable repro.

chendo commented Mar 5, 2023

Alright, I've improved the repro so it's 100% reliable for me now, even on amd64. I've added a sleep 0.1 within the Crystal code that is being called from mruby. It looks like the HTTP::Server spawns multiple threads even without -D preview_mt, so I assume there's some shenanigans there. I do want to use Anyolite for a high-throughput, high-concurrency scenario though, so I'm not really sure how to proceed.

Hadeweka commented Mar 5, 2023

Yep, now I can reproduce it as well.

The problem is apparently that two script calls interfere with each other, since they both use the same interpreter.
In that case, putting a mutex lock around the script line call should fix the problem. It might limit the number of requests per second a bit, depending on the performance of the Ruby scripts, although I see no significant difference between the two cases when I use an empty handler function.

Anyway, I could add some trivial guards to Anyolite to make it thread-safe for now. The limitation here is that mruby itself isn't thread-safe (and will probably never be).
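
The interference between two script calls can be illustrated without Anyolite. The following is a minimal Crystal sketch, not code from the repro repository: `FakeInterpreter` and `run_requests` are hypothetical names standing in for a shared, non-reentrant interpreter and for concurrent request handlers. It assumes the default single-threaded runtime, where a `sleep` inside a call yields to other fibers, so a second call can enter the interpreter before the first one has finished. Wrapping the call in `Mutex#synchronize` from Crystal's standard library serializes the calls.

```crystal
# Hypothetical stand-in for a shared, non-reentrant interpreter.
# This is NOT Anyolite's API; it only counts overlapping calls.
class FakeInterpreter
  getter overlaps = 0
  @busy = false

  def execute(script : String) : Nil
    @overlaps += 1 if @busy # another call is still in progress
    @busy = true
    sleep 10.milliseconds # yields the fiber, similar to the sleep added to the repro
    @busy = false
  end
end

# Runs four concurrent "requests" against one interpreter and returns
# how many calls started while another call was still running.
def run_requests(guarded : Bool) : Int32
  interp = FakeInterpreter.new
  lock = Mutex.new
  done = Channel(Nil).new

  4.times do
    spawn do
      if guarded
        # Only one fiber at a time may enter the interpreter.
        lock.synchronize { interp.execute("1 + 1") }
      else
        interp.execute("1 + 1")
      end
      done.send(nil)
    end
  end

  4.times { done.receive } # wait for all fibers to finish
  interp.overlaps
end

unguarded = run_requests(false)
guarded = run_requests(true)
puts "overlapping calls without lock: #{unguarded}"
puts "overlapping calls with lock:    #{guarded}"
```

With the lock, the four calls run one after another, so the total time grows to roughly the sum of the individual call times. That is the throughput cost mentioned above, and it only matters when the scripts themselves take noticeable time.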

Another solution would of course be support for multiple interpreters in parallel, but I will open a separate discussion thread for that (#28).

Hadeweka added the "bug" label Mar 5, 2023
Hadeweka changed the title from "Invalid memory access within mrb_vm_exec on arm64" to "Invalid memory access when running multiple Ruby script calls at once" Mar 5, 2023