Segmentation fault on closing julia #511

Closed
schlichtanders opened this issue Nov 21, 2023 · 7 comments

@schlichtanders

Hello,
I just want to report that I run into a segmentation fault when closing Julia (after using RCall).

[1847] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /usr/local/julia/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
unknown function (ip: 0x7f95e9ce41c9)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 32734049 (Pool: 32701761; Big: 32288); GC: 42

I have no minimal example, but I guess it could have something to do with me using an R function inside an async Julia task, which somehow is not correctly finalized or prevents some other part from finalizing.

The above error occurs in a Docker container built on top of julia:1.9. When I run the same code on my local laptop, it does not throw an error but hangs indefinitely.
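
For readers unfamiliar with the frames in the backtrace: `_atexit` is the Julia-side routine that runs functions registered with `atexit` when the process shuts down. The following is only a minimal sketch of that mechanism in plain Julia, run in a child process so the hooks actually fire. It does not involve RCall or JuliaCall, and it is not a claim about which hook crashes in this report.

```julia
# Run a tiny script in a fresh Julia process so that its exit hooks fire.
script = """
atexit(() -> println("hook registered first"))
atexit(() -> println("hook registered second"))
println("main body done")
"""

# Base.julia_cmd() returns the command for the currently running Julia binary.
cmd = `$(Base.julia_cmd()) --startup-file=no -e $script`
output = read(cmd, String)
print(output)
```

The hooks run after the main body, in last-in-first-out order, which is the phase in which the segmentation fault above is raised.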

@schlichtanders
Author

A similar segmentation fault was already reported to JuliaLang: JuliaLang/julia#43556

That issue is about task switching and calling Base.iolock_end() not working well together, but I couldn't find iolock_end. Maybe some related function is called nevertheless, or some similar unexpected task switching happens.

@schlichtanders
Author

schlichtanders commented Nov 21, 2023

I was able to replicate the segmentation fault. It takes a combination of three components:

  • a Julia object
  • an R function which might return this Julia object
  • an async task which calls this R function (EDIT: this is actually not needed)

Everything is fine until the Julia session is closed; then the same segmentation fault is thrown.

julia> struct SingletonType end

julia> Singleton=SingletonType()
SingletonType()

julia> using RCall

R> library(JuliaCall)

R> r_singleton = julia_eval("Singleton")
┌ Warning: RCall.jl: Julia version 1.9.3 at location /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/bin will be used.
│ Loading setup script for JuliaCall...
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172
┌ Warning: RCall.jl: Finish loading setup script for JuliaCall.
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172

julia> rf = reval("""function(){
               if (runif(1) > 0.9){
                       r_singleton
               } else {
                       rnorm(1)
               }
       }""")
RObject{ClosSxp}
function () 
{
    if (runif(1) > 0.9) {
        r_singleton
    }
    else {
        rnorm(1)
    }
}
julia> rf()
RObject{RealSxp}
[1] 1.055938


julia> 

[2389910] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 8519504 (Pool: 8511578; Big: 7926); GC: 13

[2389910] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 8519504 (Pool: 8511578; Big: 7926); GC: 13
[1]    2389910 segmentation fault (core dumped)  julia --project

@palday
Collaborator

palday commented Nov 21, 2023

Does this also happen when you start R directly and not via RCall? IIRC JuliaCall works by creating a latent Julia session and then opening RCall within that nested session. I don't know what happens when that JuliaCall session is already nested in an RCall session...

@schlichtanders
Author

I will test soon whether I can circumvent this by starting it directly via R.

I further simplified the failing example - it is only about getting some julia value to R. Boom.

julia> using RCall

R> library(JuliaCall)

R> ftype = julia_eval("Function")
┌ Warning: RCall.jl: Julia version 1.9.3 at location /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/bin will be used.
│ Loading setup script for JuliaCall...
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172
┌ Warning: RCall.jl: Finish loading setup script for JuliaCall.
└ @ RCall ~/.julia/packages/RCall/gOwEW/src/io.jl:172

julia> 

[2406346] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 7218572 (Pool: 7211588; Big: 6984); GC: 10

[2406346] signal (11.1): Segmentation fault
in expression starting at none:0
ijl_eh_restore_state at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/rtutils.c:265
_atexit at ./initdefs.jl:416
jfptr__atexit_46096.clone_1 at /nix/store/n2mf5wwcjasd5wlxinrz36y0g6l0w7q8-julia-bin-1.9.3/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
ijl_atexit_hook at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/init.c:280
jl_repl_entrypoint at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:718
main at julia (unknown line)
__libc_start_call_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
__libc_start_main at /nix/store/aw2fw9ag10wr9pf0qk4nk5sxi0q0bn56-glibc-2.37-8/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 7218572 (Pool: 7211588; Big: 6984); GC: 10
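
To check for the crash without an interactive REPL, one option is to run the reproduction in a child Julia process and inspect how that process ended. This is a hedged sketch: `child_status` is a hypothetical helper name, and the commented-out reproduction is an untested translation of the REPL session above that assumes RCall's `R"..."` string macro and an installed JuliaCall.

```julia
# Run a snippet in a fresh Julia process and report how that process ended.
# `child_status` is a hypothetical helper, not part of RCall or JuliaCall.
function child_status(code::AbstractString)
    cmd = `$(Base.julia_cmd()) --startup-file=no -e $code`
    proc = run(ignorestatus(cmd))  # ignorestatus: do not throw on a non-zero exit
    return (exitcode = proc.exitcode, termsignal = proc.termsignal)
end

# Harmless sanity check that needs no extra packages.
status = child_status("exit(3)")
println(status)

# Assumed, untested translation of the session above
# (requires RCall in Julia and JuliaCall in R):
# repro = """
# using RCall
# R"library(JuliaCall)"
# R"ftype <- julia_eval('Function')"
# """
# println(child_status(repro))
```

A clean run reports an exit code of 0 and no terminating signal; a crash at exit should show up as something else in one of the two fields.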

@schlichtanders
Author

schlichtanders commented Nov 22, 2023

> Does this also happen when you start R directly and not via RCall? IIRC JuliaCall works by creating a latent Julia session and then opening RCall within that nested session. I don't know what happens when that JuliaCall session is already nested in an RCall session...

My first attempt failed because I could not find out how to use a specific Julia environment via JuliaCall.
When Julia is started first and RCall is used from there, JuliaCall picks up the same Julia session; for standalone use I couldn't find any documentation about it.

EDIT: I found it. You need to set the environment variable JULIA_PROJECT="..."
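
As a small illustration of what `JULIA_PROJECT` does, the sketch below starts a fresh Julia process with the variable set and asks it which project it activated. The temporary directory is only a placeholder for a real project path. That an R session using JuliaCall honors the variable in the same way is taken from the comment above and is not verified here.

```julia
# Placeholder project directory; substitute the path of a real project.
project_dir = mktempdir()

# Start a fresh Julia process with JULIA_PROJECT set and print its active project.
cmd = addenv(
    `$(Base.julia_cmd()) --startup-file=no -e 'print(Base.active_project())'`,
    "JULIA_PROJECT" => project_dir,
)
active = read(cmd, String)
println(active)
```

Setting the same variable in the shell before launching R would be the equivalent for the standalone JuliaCall case described above.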

@schlichtanders
Author

schlichtanders commented Nov 22, 2023

I have now tested the examples, and they seem to work without a segfault when started directly via R.
That looks like a good workaround for me.

Still, it is natural to expect JuliaCall to work inside RCall. In the Python world, PythonCall and JuliaCall also work together.
It would be great if this segfault could be fixed. Only the final exit of Julia fails; everything else already works.

@palday
Collaborator

palday commented Nov 22, 2023

I know this has been my mantra lately ... but I'm wondering if JuliaCall needs to check to see whether there's an existing RCall session before creating a new one. (Why do I think it's JuliaCall's responsibility and not RCall's? Because JuliaCall depends on RCall but not vice versa. If there were a straightforward change we could make in RCall to make this easier, I would support it, but big changes in RCall tend to get stuck by very limited maintainer bandwidth.)
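
Nothing in this thread says how such a check would be implemented, and the check itself would live in JuliaCall's R code. Purely as a hypothetical ingredient, the Julia side could report whether RCall is already loaded in the running session. The helper name `is_package_loaded` is an assumption, and `Base.loaded_modules` is an internal table rather than public API.

```julia
# Report whether a package with the given name is loaded in this session.
# NOTE: Base.loaded_modules is internal to Julia and may change between versions.
function is_package_loaded(name::AbstractString)
    return any(pkg -> pkg.name == name, keys(Base.loaded_modules))
end

import Dates  # a standard library, loaded here only to demonstrate a positive result
println(is_package_loaded("Dates"))
println(is_package_loaded("RCall"))  # true only if RCall was loaded beforehand
```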
