Check that axes start with 1 for AbstractRange operations #30950

timholy · 2019-02-04T02:58:03Z

Now that we have Base.IdentityUnitRange and Base.Slice, we need to be careful about fallbacks that just use first, step, stop-style properties. First noticed in JuliaArrays/IdentityRanges.jl#8.

The only surprising part about this is that I needed to introduce a trait: ranges like typemin(Int):typemax(Int) trigger an OverflowError if you call axes on them.

Now that we have Base.IdentityUnitRange and Base.Slice, we need to be careful about fallbacks that just use `first`, `step`, `stop`-style properties.

timholy

OK, this ended up being a bigger deal than expected. Easiest to review ignoring whitespace changes due to introducing a @testset in test/offsetarray.jl.

timholy · 2019-02-04T19:10:58Z

base/broadcast.jl

+_axistype(::AxesStart1,   ::AxesStart1,   a, b) = OneTo{Int}(a)
+_axistype(::AxesStartAny, ::AxesStart1,   a, b) = a   # if we get here, b has length 1
+_axistype(::AxesStart1,   ::AxesStartAny, a, b) = error("mismatched axes, or specialize `Broadcast.axistype` for ", typeof(a), " and ", typeof(b), " axes")
+_axistype(::AxesStartAny, ::AxesStartAny, a, b) = error("mismatched axes, or specialize `Broadcast.axistype` for ", typeof(a), " and ", typeof(b), " axes")


Unsure about these erroring methods. Here's an interesting and troubling case:

julia> using OffsetArrays, Test julia> x = [1] 1-element Array{Int64,1}: 1 julia> y = OffsetArray([1], -1:-1) 1-element OffsetArray(::Array{Int64,1}, -1:-1) with eltype Int64 with indices -1:-1: 1 julia> @inferred(x.*y) 1-element OffsetArray(::Array{Int64,1}, -1:-1) with eltype Int64 with indices -1:-1: 1 julia> @inferred(y.*x) ERROR: mismatched axes, or specialize `Broadcast.axistype` for Base.OneTo{Int64} and Base.IdentityUnitRange{UnitRange{Int64}} axes Stacktrace: [1] error(::String, ::Type, ::String, ::Type, ::String) at ./error.jl:42 [2] _axistype(::Base.AxesStart1, ::Base.AxesStartAny, ::Base.OneTo{Int64}, ::Base.IdentityUnitRange{UnitRange{Int64}}) at ./broadcast.jl:497 [3] axistype(::Base.OneTo{Int64}, ::Base.IdentityUnitRange{UnitRange{Int64}}) at ./broadcast.jl:494 [4] _bcs1 at ./broadcast.jl:485 [inlined] [5] _bcs at ./broadcast.jl:479 [inlined] [6] broadcast_shape at ./broadcast.jl:473 [inlined] [7] combine_axes at ./broadcast.jl:468 [inlined] [8] instantiate at ./broadcast.jl:256 [inlined] [9] materialize(::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(*),Tuple{OffsetArray{Int64,1,Array{Int64,1}},Array{Int64,1}}}) at ./broadcast.jl:802 [10] _materialize_broadcasted(::Function, ::OffsetArray{Int64,1,Array{Int64,1}}, ::Vararg{Any,N} where N) at /home/tim/src/julia-branch/usr/share/julia/stdlib/v1.2/Test/src/Test.jl:1279 [11] top-level scope at REPL[5]:1

Fascinating. So the larger question is: which axis should be used as the "basis" for the output and which axis is being "extruded" over that basis?

If we have any non-singleton dimension then it's a no-brainer: that's our output axis and all other singleton axes are discarded as they're being "extruded". We demand that the axes values match, so multiple non-singleton dimensions are by definition identical or an error. (We do promote the axis type, though, such that an OffsetArray that happens to be 1-based will still return a similarly non-offset OffsetArray for type stability.)

But when we have multiple singleton dimensions it's not clear. On master, we just use the last-provided axis for the given dimension, which isn't great. It's not really a one-based vs. offset thing — the same happens for combinations of offset arrays:

julia> axes(OffsetArray([0], 1) .+ OffsetArray([0], 2)) (Base.IdentityUnitRange(3:3),) julia> axes(OffsetArray([0], 2) .+ OffsetArray([0], 1)) (Base.IdentityUnitRange(2:2),)

I think the one-based vs. offset axis thing is a red herring — we don't really know which is going to matter in the output. So I say we just find some consistent (symmetric) rule that can generalize to multiple singleton offset axes, too. Some possibilities:

Always use a one-based output for any combination of singleton axes. I don't like this because then OffsetArray(rand(1), -1) .+ OffsetArray(rand(1), -1) would no longer be offset. So let's ax this candidate.

Always demand that combinations of singleton axes match. At first glance I'd think this would be massively breaking. It does help, though, that we completely ignore the pseudo-axes beyond the dimensionality of the array (e.g., we don't care what axes(fill(0), 2) returns, so scalars and zero-dimensional arrays could still broadcast with anything). It may be interesting to try out, but it could make for a really annoying trap where an expression mostly works until you end up with a one-vector instead of a two-vector.

Use some symmetric value-based rule that picks the result based on the axes values themselves. Possible candidates include:

Just pick the axis with the minimum or maximum element

Pick the axis with the element closest to 1 (biasing the comparison to favor either axes below or above 1 so there aren't ties). Perhaps the arc of the Julian universe bends towards 1-based arrays.

Pick the axis with the element farthest from 1 (again, biased to favor either above/below). The reasoning here would be that someone chose to have that offset; they should be allowed to keep it.

Pick the axis that matches the most other axes in the expression. In case of ties (which would be common for binary broadcasts), use one of the other rules.

timholy · 2019-02-04T19:12:42Z

base/range.jl

@@ -140,7 +140,20 @@ abstract type AbstractRange{T} <: AbstractArray{T,1} end
 RangeStepStyle(::Type{<:AbstractRange}) = RangeStepIrregular()
 RangeStepStyle(::Type{<:AbstractRange{<:Integer}}) = RangeStepRegular()

-convert(::Type{T}, r::AbstractRange) where {T<:AbstractRange} = r isa T ? r : T(r)
+AxesStartStyle(::Type{<:AbstractRange}) = AxesStart1()  # opt-out of AxesStart1 for "weird" range types


In some ways I'd prefer "opt in" on this trait, but I think it's too breaking. So I think it's better to mark AbstractRange types that violate the expectation of having unit indexing.

timholy · 2019-02-04T19:13:25Z

base/range.jl

@@ -157,6 +170,8 @@ type can represent values smaller than `oneunit(Float64)`.
 """
 abstract type OrdinalRange{T,S} <: AbstractRange{T} end

+convert(::Type{OrdinalRange{T,S}}, r::OrdinalRange{T,S}) where {T,S} = r


Not strictly necessary for this PR, I think, but I noticed their lack and thought it might be good for disambiguating.

timholy · 2019-02-04T19:14:31Z

base/reducedim.jl

@@ -43,33 +44,24 @@ function reduced_indices0(inds::Indices{N}, d::Int) where N
    end
 end

-function reduced_indices(inds::Indices{N}, region) where N


I initially started modifying this code to fix a problem but then transitioned to the changes above. Still, this seemed like old code in need of some cleanup.

timholy · 2020-06-16T10:37:19Z

Just added a 2.0 tag here so we don't forget to do this. Ideally we'll get this done during the 1.x series, but this particular PR felt like it was careening off in a bad direction. If I or someone gets back to it, I think the best strategy is to look at this to refresh memory on where it found problems, then forget about it and start from scratch, slowly and carefully seeing whether this can be fixed in a backwards-compatible way.

This is definitely one of those bugfixes that is likely to break someone's code, and the oft-linked https://xkcd.com/1172/ is not a perfect metaphor because the old broken behavior is not nearly so crazy. This has gone unfixed for so long in part because, for a long time, being an AbstractRange meant that you had 1-based indexing, and the change had deeper implications than I appreciated at the time.

timholy requested a review from mbauman February 4, 2019 02:58

timholy added 2 commits February 4, 2019 12:41

Check that axes start with 1 for AbstractRange operations

31cb847

Now that we have Base.IdentityUnitRange and Base.Slice, we need to be careful about fallbacks that just use `first`, `step`, `stop`-style properties.

Introduce testset in test/offsetarray.jl

f6034b8

timholy force-pushed the teh/bcast_range branch from a5c4c15 to f6034b8 Compare February 4, 2019 19:02

timholy commented Feb 4, 2019

View reviewed changes

timholy mentioned this pull request Feb 3, 2020

Type-piracy in similar with OffsetAxis JuliaArrays/OffsetArrays.jl#87

Open

timholy mentioned this pull request Mar 23, 2020

Broadcasting with IdOffsetRange seems wrong JuliaArrays/OffsetArrays.jl#104

Closed

timholy added this to the 2.0 milestone Jun 16, 2020

This was referenced Sep 19, 2020

Clarify ==and === for axes tests JuliaArrays/OffsetArrays.jl#120

Open

Indexing is not idempotent using Base.IdentityUnitRange JuliaArrays/OffsetArrays.jl#155

Closed

brenhinkeller added the ranges Everything AbstractRange label Nov 21, 2022

mbauman mentioned this pull request Feb 8, 2024

Restrict vector indexing of views to 1-based ranges #53249

Merged

jishnub mentioned this pull request Jun 13, 2024

bugs involving IdentityUnitRange #54783

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Check that axes start with 1 for AbstractRange operations #30950

Check that axes start with 1 for AbstractRange operations #30950

timholy commented Feb 4, 2019 •

edited

Loading

timholy left a comment

timholy Feb 4, 2019

mbauman Feb 4, 2019

timholy Feb 4, 2019

timholy Feb 4, 2019

timholy Feb 4, 2019

timholy commented Jun 16, 2020

Check that axes start with 1 for AbstractRange operations #30950

Are you sure you want to change the base?

Check that axes start with 1 for AbstractRange operations #30950

Conversation

timholy commented Feb 4, 2019 • edited Loading

timholy left a comment

Choose a reason for hiding this comment

timholy Feb 4, 2019

Choose a reason for hiding this comment

mbauman Feb 4, 2019

Choose a reason for hiding this comment

timholy Feb 4, 2019

Choose a reason for hiding this comment

timholy Feb 4, 2019

Choose a reason for hiding this comment

timholy Feb 4, 2019

Choose a reason for hiding this comment

timholy commented Jun 16, 2020

timholy commented Feb 4, 2019 •

edited

Loading