Don't ask the call site to provide `@inbounds` when iterating over an array #27384

haampie · 2018-06-02T12:39:22Z

This issue came about when I discovered that sort! on a small vector of integers was slower than usual; further inspection shows that extrema is now 65% slower for a vector of 500 integers, and this is a result of bounds checks in each iteration.

I would not dare to fix this by adding @inbounds to the loop of extrema, since the function is defined for iterators in general (why is it extrema's responsibility to signal that the internals of the iterator will do inbounds things?). I'd say an iterator over an array should guarantee to stay inbounds and not require its call site to provide @inbounds.

But the issue is quite subtle. The function `iterate(A::Array, ::Int)` does something that comes close to bounds checking:

function iterate(A::Array, i=1)
    @_propagate_inbounds_meta # why require the parent function to provide inbounds
    i >= length(A) + 1 ? nothing : (A[i], i+1) # when we (kind of) assert here that we are inbounds?
end

but it's not entirely bounds checking, because in principle i could overflow, so to be safe an actual check is required to guarantee i > 0.
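A minimal sketch of that gap, assuming a hypothetical `naive_iterate` that copies the condition from the snippet above (it is not the `Base` method): the upper bound is caught by the condition, but a non-positive state slips through to the indexing expression, so the bounds check on `A[i]` is still doing real work.

```julia
# Hypothetical stand-in for the method quoted above; not Base's definition.
function naive_iterate(A::Array, i::Int=1)
    i >= length(A) + 1 ? nothing : (A[i], i + 1)
end

A = [10, 20, 30]

first_step = naive_iterate(A, 1)   # (10, 2)
past_end = naive_iterate(A, 4)     # nothing: the condition catches the upper bound

# Int arithmetic wraps, so a state can in principle become negative.
wraps = typemax(Int) + 1 == typemin(Int)

# A state of 0 passes the condition; only the bounds check inside A[i] stops it.
threw = try
    naive_iterate(A, 0)
    false
catch err
    err isa BoundsError
end
```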

~~The issue is a result of #27079 (more robust iteration over vectors), but in fact it cheats a bit by requiring the user to always provide @inbounds.~~ Edit (sorry!): without `@inbounds`, `for x in xs` would do bounds checking already.

Possible solutions:

1. Create a wrapper such as `for x in Mutable(xs)` that allows modifying `xs` on the go; `for x in xs` is then immutable iteration by convention and can be fast even without `@inbounds`.
2. Create a separate implementation of `extrema(::AbstractArray)` that uses `@inbounds` (meh).
3. Just use `@inbounds` in `extrema` and document that it does that.
4. Improve the speed of `checkbounds(Bool, ::AbstractArray, ::Int)` and use that rather than the `i >= length(A) + 1` condition, and somehow tell the compiler `i` cannot overflow, so that the `i > 0` check can be optimized away.
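The last option could look roughly like the sketch below. `checked_iterate` is a hypothetical name, restricted to `Array` so that starting at index 1 is valid; `checkbounds(Bool, A, i)` is the existing `Base` function. Whether this is fast enough is exactly the open question, so this only shows the shape of the idea.

```julia
# Hypothetical helper illustrating the checkbounds-based option; not a Base method.
function checked_iterate(A::Array, i::Int=1)
    # checkbounds(Bool, A, i) is true only for a valid linear index,
    # so it covers the lower and the upper bound at once.
    if checkbounds(Bool, A, i)
        x = @inbounds A[i]   # safe: the index was just validated
        return (x, i + 1)
    end
    return nothing
end

A = [10, 20, 30]
last_step = checked_iterate(A, 3)   # (30, 4)
past_end = checked_iterate(A, 4)    # nothing
below = checked_iterate(A, 0)       # nothing, instead of a BoundsError
```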

Edit: it seems the new iteration protocol is enough to fix things.


mbauman · 2018-06-02T15:18:40Z

This is quite closely related to #21402, but indeed things are different now that we only have one method — and don't need to assume that the caller did the start/next/done dance correctly.

JeffBezanson · 2018-06-02T21:54:39Z

> extrema is now 65% slower for a vector of 500 integers

Slower than v0.6.x?

haampie · 2018-06-02T22:13:13Z

65% slower on 0.7 when comparing bounds checking enabled vs disabled. Full report:

`@benchmark extrema(v) setup = (v = rand(Int, 1000))`

- 884.1 ns on 0.6.2
- 1.222 μs on 0.7.0-alpha.0
- 422.8 ns with the proposed fix of #27386 (i.e. no bounds checking)

andyferris · 2018-06-11T01:10:46Z

I have to agree with @haampie's general point here... generic code should be able to be just as fast as concretely-typed code, and the interfaces available for use by functions like `extrema(::Any)` should be clear, fast and safe. (IMO we should avoid, as much as possible, the situation where a concrete version of a perfectly valid generic method is added with extra `@inbounds` just for speed.)

Have we discussed having an unsafe_iterate (used when lowering for loops) with a default fallback to iterate, which was something already suggested for unsafe_next in #15291 (and before)? The alternative seems to be asserting that all implementations of @inbounds iterate(...) shouldn't be able to crash if the user doesn't mess with the iteration state (and have lowering do this for us?). Tricky, tricky...
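To make the `unsafe_iterate` idea concrete, here is a rough sketch under stated assumptions: `unsafe_iterate` does not exist in `Base`, and `sum_lowered` is a hypothetical function that spells out by hand the loop a `for` statement would be lowered to if lowering targeted `unsafe_iterate`.

```julia
# Hypothetical API: unsafe_iterate is not defined in Base.
# Default fallback: anything that defines iterate works unchanged.
unsafe_iterate(itr, state...) = iterate(itr, state...)

# An array method could skip the bounds check, trusting that the state
# came from an earlier unsafe_iterate call on the same array.
function unsafe_iterate(A::Array, i::Int=1)
    i > length(A) && return nothing
    x = @inbounds A[i]
    return (x, i + 1)
end

# Hand-written version of the loop that lowering would produce.
function sum_lowered(itr)
    total = zero(eltype(itr))
    next = unsafe_iterate(itr)
    while next !== nothing
        x, state = next
        total += x
        next = unsafe_iterate(itr, state)
    end
    return total
end

array_sum = sum_lowered([1, 2, 3])   # uses the Array method
range_sum = sum_lowered(1:4)         # uses the fallback to iterate
```

The array method is only safe for states it produced itself, which is the trade-off being discussed.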

mbauman · 2018-06-11T15:11:22Z

I think we can do this without an unsafe_iterate. What would that gain us beyond Harmen's proposal in #20469 (comment)?

nalimilan · 2018-06-11T18:02:02Z

I guess the idea of unsafe_iterate is that for x in X would be lowered to it, which would allow getting rid of bounds checks? But maybe we can achieve the same result by lowering for x in X to @inbounds iterate(...)?

mbauman · 2018-06-11T18:20:36Z

But doesn't the new iteration protocol give the iteration implementors enough information that they can safely add the `@inbounds` themselves without any additional context from the caller? No longer do we have to trust that the caller has called `done`; `iterate` itself does that check. The only thing it doesn't protect us from is someone manually constructing and passing an invalid state that happens to pass the check but is still out of bounds.

The PR over at #27386 demonstrates (or hopefully will demonstrate once Nanosoldier is feeling better) that it's possible to check the entire domain of the state without incurring a performance penalty.
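One way to check the entire domain of an integer state with a single comparison is to compare as unsigned integers. The sketch below illustrates that technique with a hypothetical `domain_checked_iterate`; it is an assumption for illustration and is not claimed to be the code in #27386.

```julia
# Hypothetical helper; illustrates a whole-domain check, not the PR's code.
function domain_checked_iterate(A::Array, i::Int=1)
    # Reinterpreting i as unsigned maps every i <= 0 to a huge value once 1
    # is subtracted, so one comparison enforces 1 <= i <= length(A).
    if (i % UInt) - 1 < (length(A) % UInt)
        x = @inbounds A[i]   # safe: i is known to be a valid index here
        return (x, i + 1)
    end
    return nothing
end

A = [10, 20, 30]
first_step = domain_checked_iterate(A, 1)   # (10, 2)

# Every state outside 1:length(A) ends iteration instead of reading memory.
bad_states = (0, -1, 4, typemin(Int), typemax(Int))
all_rejected = all(i -> domain_checked_iterate(A, i) === nothing, bad_states)
```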

Now, it's harder (impossible?) to check the entire domain with abstract iterator implementations, but #20469 (comment) is arguing that the iterator state is sufficiently complicated that it would require the caller to specifically craft an invalid state such that it passes the "done" check but still yields an out of bounds access. I was initially inclined to agree, but as I was writing this it dawned on me that the easiest way to mess this up would be by iterating over two arrays of different lengths at the same time and inadvertently mixing up the states. Is there another approach we could take here?
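A small sketch of the mix-up, assuming a hypothetical `Elements` iterator whose "done" check only recognizes the exact end state. Indexing is deliberately left bounds-checked here, so the mistake surfaces as a `BoundsError`; with `@inbounds` on that access the same call would read out of bounds.

```julia
# Hypothetical iterator; its end check does not cover the whole state domain.
struct Elements{T}
    data::Vector{T}
end

function Base.iterate(e::Elements, i::Int=1)
    i == length(e.data) + 1 && return nothing   # only the exact end state
    return (e.data[i], i + 1)                   # bounds-checked on purpose
end

# Run n iteration steps and return the resulting state.
function state_after(itr, n)
    next = iterate(itr)
    for _ in 2:n
        next = iterate(itr, next[2])
    end
    return next[2]
end

long = Elements([1, 2, 3, 4, 5])
short = Elements([1, 2, 3])

stale = state_after(long, 4)   # 5, a valid state for `long`

# Passing `long`'s state to `short` slips past the end check.
caught = try
    iterate(short, stale)
    false
catch err
    err isa BoundsError
end
```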

I'd really love it if we could avoid either kludge (that is, either unsafe_iterate or @inbounds iterate).

haampie · 2018-06-11T18:39:32Z

> inadvertently mixing up the states

I did not think of that, and it looks pretty hard to avoid unfortunately.

andyferris · 2018-06-12T00:24:29Z

> The only thing it doesn't protect us from is someone manually constructing and passing an invalid state that happens to pass the check but is still out of bounds.

Exactly. AFAICT the previous policy is that if no-one calls something called unsafe_xxx, uses ccall, or adds any @inbounds, Julia will basically never crash (i.e. segfault).

I am however happy to agree to ditch that policy - but it's worth discussing.

andyferris · 2018-06-12T00:27:20Z

(To be honest the fact that done and next are now together is only a minor mitigation of the fact that users can mess iteration up, since as mentioned they can still pass whatever state that they want).

andyferris · 2018-06-12T00:31:34Z

> I'd really love it if we could avoid either kludge (that is, either unsafe_iterate or @inbounds iterate).

Rather silly idea: if we simply renamed iterate to unsafe_iterate then we are basically done (ugly, I know, but brutally honest).

haampie mentioned this issue Jun 2, 2018

Guarantee inbounds iteration over Array{T} #27386

Merged

nalimilan added the labels performance (Must go faster), arrays ([a, r, r, a, y, s]), and iteration (Involves iteration or the iteration protocol) Jun 2, 2018

haampie mentioned this issue Jun 10, 2018

Policy regarding @inbounds annotations #20469

Closed

haampie mentioned this issue Jun 12, 2018

Use propagate_inbounds in AbstractArray iterator #27535

Closed

nalimilan mentioned this issue Jun 20, 2018

Making sum(skipmissing(x)) faster #27679

Closed

mbauman closed this as completed in #27386 Jun 27, 2018

mbauman mentioned this issue Apr 8, 2021

Annotate iterate(::AbstractArray) with inbounds #40397

Closed
