Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Policy regarding @inbounds annotations #20469

Closed
nalimilan opened this issue Feb 6, 2017 · 9 comments
Closed

Policy regarding @inbounds annotations #20469

nalimilan opened this issue Feb 6, 2017 · 9 comments
Labels
arrays [a, r, r, a, y, s] performance Must go faster

Comments

@nalimilan
Copy link
Member

This issue was raised at #20421 (comment). We should define clear rules regarding when it's OK to use @inbounds in Base code. Currently the code seems to be inconsistent.

@JeffBezanson mentioned this: "only use @inbounds when you can be certain, from local information, that all accesses are in bounds." In a strict interpretation, that would mean @inbounds for i in eachindex(a) is not correct when a::AbstractArray, since an incorrect array implementation could return invalid indices, which would crash Julia.

This strict interpretation is problematic since (as @stevengj noted) often a generic AbstractArray method is also used for Array: if we cannot use @inbounds there, performance suffers in the common case just to avoid possible crashes in rare cases. A mechanism to enable @inbounds only for trusted types could help, but it would still be too bad that custom array types wouldn't benefit from bounds checking removal even when they implemented everything correctly: that would defeat the goal of making user-defined type as efficient as Base types. So I would say we need to trust custom array types, or at least allow them to opt out of bounds checking by stating that they are safe. Cf. #15291.

@mbauman
Copy link
Member

mbauman commented Feb 6, 2017

The only problem here is that @inbounds has two implementations right now:

  • There's the new swanky generic version. This version only affects AbstractArray implementations that have explicitly opted into bounds check removal with @boundscheck blocks. It only goes down one level of inlining. This is perfect — it allows for us to safely elide bounds checks in generic code and it'll only affect arrays that expect it.
  • But… Array still uses the old version of bounds check removal, implemented in codegen. This version affects all levels of inlining. It's leaky.

This means that custom array types are safe… unless they depend upon Array's bounds checking. The solution here isn't to rip out all support for fast and generic arrays, but rather to fix Array's bounds check elision.

julia> immutable BadVector{T} <: AbstractVector{Int}
           data::T
       end
       Base.size(X::BadVector) = size(X.data)
       Base.getindex(X::BadVector, i::Int) = X.data[i-1]

julia> BadVector(1:3)[:] # Nonscalar indexing is implemented with `@inbounds` and this is safe!
ERROR: BoundsError: attempt to access 3-element UnitRange{Int64} at index [0]
Stacktrace:
 [1] throw_boundserror(::UnitRange{Int64}, ::Int64) at ./abstractarray.jl:365
 [2] getindex at ./range.jl:512 [inlined]
 [3] getindex(::BadVector{UnitRange{Int64}}, ::Int64) at ./REPL[12]:5
 [4] macro expansion at ./multidimensional.jl:414 [inlined]
 [5] macro expansion at ./cartesian.jl:62 [inlined]
 [6] macro expansion at ./multidimensional.jl:411 [inlined]
 [7] _unsafe_getindex! at ./multidimensional.jl:406 [inlined]
 [8] macro expansion at ./multidimensional.jl:400 [inlined]
 [9] _unsafe_getindex(::Base.LinearSlow, ::BadVector{UnitRange{Int64}}, ::Base.Slice{Base.OneTo{Int64}}) at ./multidimensional.jl:393
 [10] macro expansion at ./multidimensional.jl:382 [inlined]
 [11] _getindex at ./multidimensional.jl:378 [inlined]
 [12] getindex(::BadVector{UnitRange{Int64}}, ::Colon) at ./abstractarray.jl:793

julia> BadVector([1,2,3])[:] # But this isn't because it propagates too far
3-element Array{Int64,1}:
 7955998441852384046
                   1
                   2

@nalimilan
Copy link
Member Author

Thanks. So at least we know where we're heading to.

Then I'd say policy should anticipate on this, and we should use @inbounds with any type as long as the indices are built in such a way that they will be valid if the AbstractArray implementation is correct (e.g. when using eachindex). Crashes will only happen if 1) the implementation is incorrect and 2) its author annotated @boundscheck sections. Sounds like a strict enough contract to me.

@andyferris
Copy link
Member

I've been meaning to ask, since I've been writing some LinAlg code lately, exactly how far to push this as well.

Sometimes I'd like to be able to remove multiple checks of array sizes. For example, * might check that two input matrices have compatible sizes and allocate an outpu matrix, but the implementation of A_mul_B! will check the sizes again. I could use @inbounds and @boundscheck to make this happen just once - but would that be an abuse?

(And yes, for e.g. 3x3 matrix multiplication, the overhead of unnecessary things in Base can be significant.)

@nalimilan
Copy link
Member Author

A sub-question (raised in #21402) is whether @inbounds is OK when iterating over any type, or only for AbstractArray. I would advocate for the former, which avoids duplicating methods just to add @inbounds as in #21402.

@mbauman
Copy link
Member

mbauman commented Aug 30, 2017

So it looks like this just needs #20485 and maybe a review of the verbiage in the @inbounds and @boundscheck macros.

mbauman added a commit that referenced this issue Aug 30, 2017
mbauman added a commit that referenced this issue Aug 30, 2017
mbauman added a commit that referenced this issue Aug 31, 2017
mbauman added a commit that referenced this issue Aug 31, 2017
@nalimilan
Copy link
Member Author

@mbauman So do you think it's OK to add @inbounds in Base methods which apply to any iterable, or only to for working with AbstractArray? Cf. #21402.

@mbauman
Copy link
Member

mbauman commented Sep 7, 2017

Ah, I missed that this issue arose from was also connected to a non-AbstractArray use-case. I didn't explicitly answer that in the docstring, but yes, I think that we should be able to trust that an implementation is correct if they've explicitly added their own @boundscheck annotations.

@nalimilan
Copy link
Member Author

OK, thanks. So that means for #21402 I only need to add @inbounds annotations to the general method rather than duplicating it for AbstractArray.

@haampie
Copy link
Contributor

haampie commented Jun 10, 2018

(edit, just realized this issue was closed already!)

In #27384 I looked at the example of extrema, which accepts a generic iterator. To me it seemed rather uncomfortable to add @inbounds to a loop iterating over any iterator.

For iterators in general, my guess is that almost always iteration can be guaranteed to go inbounds by the iterator itself*, at least for concrete implementations. That way we avoid placing awkward @inbounds annotations in functions that have no idea what iterators they're dealing with -- rather have them in the iterator itself.

* there's obviously no guarantee for AbstractArray, for various reasons: (1) the user can provide the iterator an invalid state, (2) the array is mutated during the iteration and (3) the array has a faulty, custom implementation. But:

(1) The iteration state of v::AbstractArray is a tuple of the eachindex(v) iterator and its state, so it takes some work to provide an invalid iteration state.
(2) This is no worse than #25743 where indexing of a view is known to be unsafe and a good solution that retains performance is just not in sight -- we just seem to live with that.
(3) This was countered above :)

My hope is that with the above policy, we could implement the AbstractArray iterator as

function iterate(A::AbstractArray, state=(eachindex(A),))
    y = iterate(state...)
    y === nothing && return nothing
    (@inbounds A[y[1]]), (state[1], tail(y)...)
end

so that we could limit the @inbounds annotations when iterating over generic arrays as much as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants