Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inline keyword? #1106

Closed
timholy opened this issue Aug 2, 2012 · 19 comments
Closed

inline keyword? #1106

timholy opened this issue Aug 2, 2012 · 19 comments
Milestone

Comments

@timholy
Copy link
Member

timholy commented Aug 2, 2012

What are the thoughts on having an "inline" keyword? I've found a situation that could benefit from one:

abstract BoundaryCondition
type BCnil <: BoundaryCondition; end # ignore edges (error when invalid)
type BCnan <: BoundaryCondition; end  # Return NaN when over the edge
type BCna <: BoundaryCondition; end # Use just the good values, when possible
type BCreflect <: BoundaryCondition; end # Reflecting boundary conditions
type BCperiodic <: BoundaryCondition; end # Periodic boundary conditions
type BCnearest <: BoundaryCondition; end # Return closest edge element
wrap(::Type{BCnil}, pos::Int, len::Int) = pos
wrap(::Type{BCperiodic}, pos::Int, len::Int) = mod(pos-1, len) + 1
wrap(::Type{BCnearest}, pos::Int, len::Int) = pos < 1 ? 1 : (pos > len ? len : pos)
#wrap(::Type{BCreflect}, pos::Int, len::Int) = (tmp = fld(pos-1,len); iseven(tmp) ? pos-tmp*len : (tmp+1)*len-pos+1)
#wrap(::Type{BCreflect}, pos::Int, len::Int) = iseven(fld(pos-1,len)) ? mod(pos-1, len)+1 : len-mod(pos-1,len)
wraprefl(posrem::Int, len::Int) = posrem < len ? posrem+1 : 2*len-posrem
wrap(::Type{BCreflect}, pos::Int, len::Int) = wraprefl(mod(pos-1, 2*len), len)

Reflect is the hardest case to make efficient, because I can't get it to fully inline. If I manually inline, it's almost 3-fold faster than the best of these, which is the one not commented out.

@JeffBezanson
Copy link
Member

It's ugly, but this is something every compiler needs eventually. In this particular case the functions are all small so we can probably do better without a declaration, but there will always be big functions you want inlined even though the compiler would never do it on its own.

@diegozea
Copy link
Contributor

The other day I was thinking would be great a macro @inline but, it's out of my knowledge even for try to sketch up it.

@StefanKarpinski
Copy link
Member

When we've discusses this the notion of using online at the call site came up. I think both a function modifier and a call site hint would be good to be able to do.

@timholy
Copy link
Member Author

timholy commented Mar 7, 2013

Bump. In playing around with iteration via immutables, I noticed just how crucial it is that next and done be inlineable. Would be worth using for the first dimension of Grid.Counter, too. Without an inline keyword or similar, there's no way to do it because you have to increment state[1] and then return the whole state object.

With iteration, since next and done are not usually explicitly written out by the programmer, any macros/call site manipulations would have to be inserted by the parser.

@timholy
Copy link
Member Author

timholy commented Mar 7, 2013

Case in point: https://gist.github.com/timholy/5108548. This uses the brand-new iterator.jl in Images:
https://github.com/timholy/Images.jl/blob/master/src/iterator.jl
You'll either need to remove the @sprofile manually or do "require("Profile"); using SProfile" before running.

In contrast to the 1D case I tested here:
d87efc5#commitcomment-2750827
this 2D case is nowhere near fast enough to be useful. SProfile shows that next is not being inlined despite my attempt to write it so that it would be.

I even turned off gc, which led to a disturbingly-large improvement for something that really should be doing no gc whatsoever (and with it enabled, SProfile captures lots of backtraces inside gc).

@timholy
Copy link
Member Author

timholy commented Mar 7, 2013

Actually there's something really fishy going on: in that gist, if I just comment out the whole try/finally block---skipping the Iterator stuff altogether---then the first part runs 30-fold faster. Compiler bug?

@JeffBezanson
Copy link
Member

I believe the try/finally is forcing s out of a register, to guarantee that the variable has the right value if a catch occurs. This is probably overly pessimistic -- it's supposed to do that only if the variable is used in a catch block, but that doesn't seem to be happening.

@timholy
Copy link
Member Author

timholy commented Mar 7, 2013

That's not it; I can comment out just the try/finally, and it still happens.

If I had to guess, it's having a hard time inferring the output type of ref(A, iter::IteratorState).

@JeffBezanson
Copy link
Member

That's a relief; if that had been the problem it would have been a pain to fix :)

We can try finfer(test,(Int,Int,Int)).

@timholy
Copy link
Member Author

timholy commented Mar 8, 2013

Looks like my guess was wrong too; finfer checked out fine. I'm not sure what was happening there, but I do think there's a clear inlining issue here. I've distilled it down to a simpler file, which I added to that gist (called "testsimple.jl"). If I use the version with Iterator2D, it's 40-fold slower than Iterator1D. There's a little more that an Iterator2D has to do, but it doesn't seem that it should be a factor of 40.

The more telling aspect is the sprofile trace. With the Iterator1D you never see a backtrace inside next(), indicating that this function is inlined. With the Iterator2D the large majority of backtraces are in next():

julia> sprofile_tree(true)
6 /tmp/testsimple.jl; next; line: 42
752 julia; ???; offset: 0x00
752 /lib/x86_64-linux-gnu/libc.so.6; __libc_start_main; offset: 0xed
752 julia; ???; offset: 0x00
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; julia_trampoline; offset: 0x83
752 julia; ???; offset: 0x00
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
752 ???; ???; offset: 0x00
752 client.jl; _start; line: 341
752 client.jl; run_repl; line: 141
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
752 ???; ???; offset: 0x00
752 client.jl; eval_user_input; line: 68
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
752 loading.jl; include_from_node1; line: 89
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; jl_load; offset: 0x6b
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
752 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
732 no file; anonymous; line: 292
732 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
732 ???; ???; offset: 0x00
4 /tmp/testsimple.jl; next; line: 42
1 /tmp/testsimple.jl; test; line: 38
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 string.jl; println; line: 5
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 string.jl; print; line: 4
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 ???; ???; offset: 0x00
1 string.jl; print; line: 3
1 grisu.jl; _show; line: 86
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
4 /tmp/testsimple.jl; test; line: 63
5 /tmp/testsimple.jl; test; line: 64
7 /tmp/testsimple.jl; test; line: 64
654 /tmp/testsimple.jl; test; line: 71
42 /tmp/testsimple.jl; next; line: 42
8 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
3 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
3 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
3 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
4 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
4 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
4 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
7 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
2 /tmp/testsimple.jl; next; line: 42
7 /tmp/testsimple.jl; next; line: 42
464 /tmp/testsimple.jl; next; line: 42
122 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
122 /lib/x86_64-linux-gnu/libc.so.6; __libc_malloc; offset: 0x75
1 /lib/x86_64-linux-gnu/libc.so.6; ???; offset: 0x00
1 /lib/x86_64-linux-gnu/libc.so.6; ???; offset: 0x00
1 /lib/x86_64-linux-gnu/libc.so.6; __default_morecore; offset: 0x09
1 /lib/x86_64-linux-gnu/libc.so.6; __sbrk; offset: 0x45
1 /lib/x86_64-linux-gnu/libc.so.6; brk; offset: 0x0a
1 /lib/x86_64-linux-gnu/libc.so.6; ???; offset: 0x00
1 /lib/x86_64-linux-gnu/libc.so.6; ???; offset: 0x00
1 /lib/x86_64-linux-gnu/libc.so.6; ???; offset: 0x00
107 /lib/x86_64-linux-gnu/libc.so.6; ???; offset: 0x00
10 /lib/x86_64-linux-gnu/libc.so.6; ???; offset: 0x00
7 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
5 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
6 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
32 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
5 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
2 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
4 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
255 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
8 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
11 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /tmp/testsimple.jl; next; line: 42
4 /tmp/testsimple.jl; next; line: 42
16 /tmp/testsimple.jl; next; line: 42
44 /tmp/testsimple.jl; next; line: 42
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
5 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
5 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
4 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
9 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
3 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
5 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
4 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
2 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
5 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
1 /tmp/testsimple.jl; next; line: 42
1 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
3 /tmp/testsimple.jl; next; line: 42
4 /tmp/testsimple.jl; next; line: 42
16 /tmp/testsimple.jl; next; line: 42
2 /tmp/testsimple.jl; next; line: 42
4 /tmp/testsimple.jl; next; line: 42
7 /tmp/testsimple.jl; next; line: 42
4 /tmp/testsimple.jl; next; line: 42
6 /tmp/testsimple.jl; next; line: 42
1 /tmp/testsimple.jl; next; line: 42
2 /tmp/testsimple.jl; next; line: 42
7 /tmp/testsimple.jl; next; line: 42
5 /tmp/testsimple.jl; next; line: 42
1 /tmp/testsimple.jl; next; line: 42
5 /tmp/testsimple.jl; next; line: 42
7 /tmp/testsimple.jl; test; line: 71
5 /tmp/testsimple.jl; test; line: 71
3 /tmp/testsimple.jl; test; line: 71
3 /tmp/testsimple.jl; test; line: 71
1 /tmp/testsimple.jl; test; line: 71
3 /tmp/testsimple.jl; test; line: 73
21 /tmp/testsimple.jl; test; line: 73
1 /tmp/testsimple.jl; test; line: 73
8 /tmp/testsimple.jl; test; line: 73
5 ???; ???; offset: 0x00
20 no file; anonymous; line: 292
20 /home/tim/src/julia/usr/bin/../lib/libjulia-release.so; ???; offset: 0x00
20 librandom.jl; randmtzig_fill_randn!; line: 170
20 /home/tim/src/julia/usr/bin/../lib/librandom.so; randmtzig_fill_randn; offset: 0x1d
2 /home/tim/src/julia/usr/bin/../lib/librandom.so; randmtzig_randn; offset: 0x12
2 /home/tim/src/julia/usr/bin/../lib/librandom.so; randmtzig_randn; offset: 0x33
3 /home/tim/src/julia/usr/bin/../lib/librandom.so; randmtzig_randn; offset: 0x3b
2 /home/tim/src/julia/usr/bin/../lib/librandom.so; randmtzig_randn; offset: 0x44
3 /home/tim/src/julia/usr/bin/../lib/librandom.so; randmtzig_randn; offset: 0xc4
1 /home/tim/src/julia/usr/bin/../lib/librandom.so; dsfmt_genrand_close1_open2; offset: 0x06
1 /home/tim/src/julia/usr/bin/../lib/librandom.so; dsfmt_genrand_close1_open2; offset: 0x0f
1 /home/tim/src/julia/usr/bin/../lib/librandom.so; dsfmt_genrand_close1_open2; offset: 0x2d
1 /home/tim/src/julia/usr/bin/../lib/librandom.so; dsfmt_gen_rand_all; offset: 0x5f

@timholy
Copy link
Member Author

timholy commented Mar 8, 2013

Shoot, with the lack of spacing that's uninterpretable. Here's the flat profile:

Count File Function Line/offset
8084 ...bjulia-release.so ??? 0
752 ...bjulia-release.so jl_load 6b
752 ...bjulia-release.so julia_trampoline 83
1 .../lib/librandom.so dsfmt_gen_rand_all 5f
1 .../lib/librandom.so dsfmt_genrand_close1_open2 6
1 .../lib/librandom.so dsfmt_genrand_close1_open2 f
1 .../lib/librandom.so dsfmt_genrand_close1_open2 2d
20 .../lib/librandom.so randmtzig_fill_randn 1d
2 .../lib/librandom.so randmtzig_randn 12
2 .../lib/librandom.so randmtzig_randn 33
3 .../lib/librandom.so randmtzig_randn 3b
2 .../lib/librandom.so randmtzig_randn 44
3 .../lib/librandom.so randmtzig_randn c4
122 ...nux-gnu/libc.so.6 ??? 0
1 ...nux-gnu/libc.so.6 __default_morecore 9
122 ...nux-gnu/libc.so.6 __libc_malloc 75
752 ...nux-gnu/libc.so.6 __libc_start_main ed
1 ...nux-gnu/libc.so.6 __sbrk 45
1 ...nux-gnu/libc.so.6 brk a
658 /tmp/testsimple.jl next 42
1 /tmp/testsimple.jl test 38
4 /tmp/testsimple.jl test 63
12 /tmp/testsimple.jl test 64
673 /tmp/testsimple.jl test 71
33 /tmp/testsimple.jl test 73
2242 ??? ??? 0
752 client.jl _start 341
752 client.jl eval_user_input 68
752 client.jl run_repl 141
1 grisu.jl _show 86
2256 julia ??? 0
20 librandom.jl randmtzig_fill_randn! 170
752 loading.jl include_from_node1 89
752 no file anonymous 292
1 string.jl print 3
1 string.jl print 4
1 string.jl println 5

@ihnorton
Copy link
Member

ihnorton commented Oct 2, 2014

So can this be closed in light of 2eaf48d?

@StefanKarpinski
Copy link
Member

Did that get merged?

@quinnj
Copy link
Member

quinnj commented Oct 2, 2014

I think that just covered function definition usage and not call site usage, right?

@ihnorton
Copy link
Member

ihnorton commented Oct 2, 2014

yes:

julia/base/expr.jl

Lines 66 to 68 in 2ef8d31

macro inline(ex)
esc(_inline(ex))
end

@quinnj I believe it changes the behavior of inline_worthy which should be call site... [edit: I see, this was about specific call sites, not just blanket inlining]

@quinnj
Copy link
Member

quinnj commented Oct 2, 2014

Yeah, I'm thinking along the lines of

const SHIFTEDMONTHDAYS = [306,337,0,31,61,92,122,153,184,214,245,275]
function totaldays(y,m,d)
    # If we're in Jan/Feb, shift the given year back one
    z = m < 3 ? y - 1 : y
    mdays = SHIFTEDMONTHDAYS[m]
    # days + month_days + year_days
    return d + mdays + 365z + fld(z,4) - fld(z,100) + fld(z,400) - 306
end

function Date(y,m,d)
    return totaldays(y,m,d)
end
function Date1(y,m,d)
    return @inline totaldays(y,m,d)
end

With results:

In  [4]: @code_llvm Date(2013,1,1)

define i64 @"julia_Date;38842"(i64, i64, i64) {
top:
  %3 = call i64 @"julia_totaldays;38843"(i64 %0, i64 %1, i64 %2), !dbg !1615
  ret i64 %3, !dbg !1615
}

In  [5]: @code_llvm Date1(2013,1,1)

define i64 @"julia_Date1;38844"(i64, i64, i64) {
top:
  %3 = call i64 @"julia_totaldays;38843"(i64 %0, i64 %1, i64 %2), !dbg !1618
  ret i64 %3, !dbg !1618
}

@timholy
Copy link
Member Author

timholy commented Oct 2, 2014

Right, currently this only affects function declaration.

@ViralBShah
Copy link
Member

Closing this since we do have @inline now.

@timholy
Copy link
Member Author

timholy commented Apr 16, 2015

I think this was being left open because the 2nd type of inlining, call-site inlining, remains to be implemented.

jrevels referenced this issue in JuliaDiff/ForwardDiff.jl Sep 2, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants