add specialized codegen for jl_get_current_task #32812

Merged: 1 commit, Mar 10, 2020

JeffBezanson
Member

The code_llvm for this seems to be nastier than I'd expect:

julia> @code_llvm current_task()

;  @ task.jl:95 within `current_task'
define nonnull %jl_value_t addrspace(10)* @jsys1_current_task_15548(%jl_value_t addrspace(10)*, %jl_value_t addrspace(10)**, i32) #0 {
top:
  %3 = alloca %jl_value_t addrspace(10)**, align 8
  store volatile %jl_value_t addrspace(10)** %1, %jl_value_t addrspace(10)*** %3, align 8
  %thread_ptr = call i8* asm "movq %fs:0, $0", "=r"()
  %ptls_i8 = getelementptr i8, i8* %thread_ptr, i64 -15712
  %ptls = bitcast i8* %ptls_i8 to %jl_value_t***
  %4 = getelementptr %jl_value_t**, %jl_value_t*** %ptls, i64 826
  %5 = bitcast %jl_value_t*** %4 to %jl_value_t addrspace(10)**
  %6 = load %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %5, align 8
  ret %jl_value_t addrspace(10)* %6
}

Why is this using a less efficient calling convention, and what's with the volatile store?

JeffBezanson added the compiler:codegen label Aug 6, 2019
@yuyichao
Contributor

yuyichao commented Aug 6, 2019

The signature comes from #11306. (IIRC the logic looks for non-jl_value_t* argument types and isn't triggered on an empty argument list.) The logic there was meant to reduce allocation, and it did not take jlcall frame allocation into account.

I believe the volatile store is there to help the debugger work by making sure the jlcall frame address is always available.
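
For readers comparing the two signatures, here is a minimal C sketch of the difference. The three-parameter form (function object, argument array, argument count) mirrors the signature in the first listing, and the zero-parameter form mirrors a specialized one. The type `value_t`, the `fake_task` object, and both function names are hypothetical stand-ins for illustration, not Julia runtime definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a boxed runtime value. */
typedef struct value { int tag; } value_t;

/* Stand-in for the object the per-thread "current task" slot points at. */
static value_t fake_task = { 42 };

/* Generic convention: function object, argument array, argument count.
   A zero-argument function still receives (and ignores) all three. */
static value_t *current_task_generic(value_t *f, value_t **args, uint32_t nargs)
{
    /* Mirrors the `store volatile` of the args pointer in the first listing:
       the pointer is spilled to a stack slot the compiler may not elide. */
    value_t **volatile saved_args = args;
    (void)saved_args;
    (void)f;
    assert(nargs == 0);
    return &fake_task;
}

/* Specialized convention: no parameters, so nothing to pass or spill. */
static value_t *current_task_specialized(void)
{
    return &fake_task;
}
```

Both functions return the same pointer; the only difference is the extra parameters and the forced stack store that the generic form carries.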

JeffBezanson added the multithreading and performance labels Aug 10, 2019
@chethega
Contributor

Bump. FWIW, my patch in #34852 generates

julia> @code_llvm current_task()

;  @ task.jl:120 within `current_task'
define nonnull %jl_value_t addrspace(10)* @julia_current_task_16641() {
top:
  %thread_ptr = call i8* asm "movq %fs:0, $0", "=r"()
  %ptls_i8 = getelementptr i8, i8* %thread_ptr, i64 -15712
  %ptls = bitcast i8* %ptls_i8 to %jl_value_t***
  %0 = getelementptr %jl_value_t**, %jl_value_t*** %ptls, i64 826
  %1 = bitcast %jl_value_t*** %0 to %jl_value_t addrspace(10)**
  %2 = load %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %1, align 8
  ret %jl_value_t addrspace(10)* %2
}

I have no idea why, since your patch looks almost the same as mine. Maybe something else got rid of the store volatile?
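
The address computation shared by both listings can be sketched in C as follows. The offsets -15712 and 826 are copied from the IR above and are specific to that build; the names `PTLS_OFFSET`, `TASK_SLOT`, and `load_current_task` are hypothetical. The sketch takes the thread pointer as an ordinary argument instead of reading `%fs:0`, so it can be exercised against a plain buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offsets taken from the IR listing; they vary between builds. */
enum { PTLS_OFFSET = -15712, TASK_SLOT = 826 };

/* Given a thread pointer, step back to the thread-local state block,
   index TASK_SLOT pointer-sized slots into it, and load that slot. */
static void *load_current_task(const unsigned char *thread_ptr)
{
    assert(thread_ptr != NULL);
    const unsigned char *ptls = thread_ptr + PTLS_OFFSET;
    void *task;
    /* memcpy avoids alignment and aliasing assumptions about the buffer. */
    memcpy(&task, ptls + (size_t)TASK_SLOT * sizeof(void *), sizeof task);
    return task;
}
```

With 8-byte pointers the slot sits 826 * 8 = 6608 bytes past the start of the thread-local state block, so the whole function reduces to one thread-pointer read and one load.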

@JeffBezanson
Member Author

Rebasing on master fixes it, so something else must have changed it.

Labels
compiler:codegen (Generation of LLVM IR and native code), multithreading (Base.Threads and related functionality), performance (Must go faster)
4 participants