Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rework OffsetArrays constructor #148

Merged
merged 9 commits into from
Sep 23, 2020
Merged

rework OffsetArrays constructor #148

merged 9 commits into from
Sep 23, 2020

Conversation

johnnychen94
Copy link
Member

@johnnychen94 johnnychen94 commented Sep 18, 2020

Unfortunately, the current version introduces a deeply hidden ambiguity and it blocks JuliaLang/julia#37629

julia> using Test, OffsetArrays

julia> detect_ambiguities(OffsetArrays)
Tuple{Method,Method}[]

julia> detect_ambiguities(Core, Base, OffsetArrays)
1-element Array{Tuple{Method,Method},1}:
 ((::Type{OffsetArray})(A::AbstractArray, inds::Tuple{Vararg{Union{Colon, AbstractUnitRange, CartesianIndices},N} where N}) in OffsetArrays at /Users/jc/.julia/packages/OffsetArrays/ikTNJ/src/OffsetArrays.jl:135, (::Type{OffsetArray})(A::OffsetArray, offsets::Tuple{Vararg{Int64,N}}) where N in OffsetArrays at /Users/jc/.julia/packages/OffsetArrays/ikTNJ/src/OffsetArrays.jl:155)

This PR simplifies the dispatching routes in constructors to fix this ambiguity, and also add missing supports to OffsetVector/OffsetMatrix.

This PR also renames some of the internal low level utilities with a _ prefix and moves them to a new file utils.jl.

Unsure how this works out, more tests and benchmarks are needed. invalidation test is also needed.

@codecov
Copy link

codecov bot commented Sep 18, 2020

Codecov Report

Merging #148 into master will decrease coverage by 0.41%.
The diff coverage is 98.27%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #148      +/-   ##
==========================================
- Coverage   94.86%   94.44%   -0.42%     
==========================================
  Files           2        3       +1     
  Lines         253      234      -19     
==========================================
- Hits          240      221      -19     
  Misses         13       13              
Impacted Files Coverage Δ
src/utils.jl 93.75% <93.75%> (ø)
src/OffsetArrays.jl 95.40% <100.00%> (-0.30%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f50ebe9...6469a8c. Read the comment docs.

src/OffsetArrays.jl Outdated Show resolved Hide resolved
$FT(A, indsN)
end
@eval function $FT(A::AbstractArray{T}, inds::Vararg{OffsetAxis,N}) where {T, N}
$FT(A, _uncolonindices(A, _stripCartesianIndices(_splitCartesianIndices(inds))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should just call $FT(A, inds)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you simplifying the dispatch route on purpose, but I wonder if this might make debugging a bit more complicated in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch.

I suppose you simplifying the dispatch route on purpose, but I wonder if this might make debugging a bit more complicated in the future.

Do you mean the general layered routes, or just this method call?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean are you making all the constructors where the axes are specified call OffsetArrays(::AbstractArray{T,N}, ::NTuple{N,Int}) within them?

Copy link
Member Author

@johnnychen94 johnnychen94 Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as I've commented, this is the only way out to the inner constructor.

The idea is to make axis=>offsets conversion more generic with the following ordered steps:

  1. if there are CartesianIndices, convert to NTuple{N, Int} as offsets
  2. if there are Colons, convert to ranges
  3. if the input array is OffsetArray, accumulate its offset
  4. if there are ranges, convert to NTuple{N, Int}

This way we don't need to add more OffsetArrays methods to achieve axis=>offsets conversion.

This can be fragile, though. For example, if there are uncaught cases, a Stackoverflow could happen. I believe by adding more tests this can be avoided.

@jishnub
Copy link
Member

jishnub commented Sep 18, 2020

I'm surprised that the ambiguity doesn't show up in detect_ambiguities(OffsetArrays)

@jishnub
Copy link
Member

jishnub commented Sep 18, 2020

This seems to catch the ambiguity in master, maybe we should switch to this?

julia> cons = collect(methods(OffsetArray));

julia> any(Base.isambiguous(c1, c2) for c1 in cons, c2 in cons)
true

@johnnychen94
Copy link
Member Author

maybe we should switch to this?

It's okay, we could wait for Aqua v0.5 to catch this: JuliaTesting/Aqua.jl#24

src/OffsetArrays.jl Outdated Show resolved Hide resolved
Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a much-needed cleanup 👍

src/OffsetArrays.jl Outdated Show resolved Hide resolved
src/OffsetArrays.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
src/OffsetArrays.jl Outdated Show resolved Hide resolved
@johnnychen94 johnnychen94 changed the title [WIP] rework OffsetArrays constructor rework OffsetArrays constructor Sep 20, 2020
@johnnychen94
Copy link
Member Author

johnnychen94 commented Sep 20, 2020

A lot of new tests are added. Unfortunately, this still not ready because of the performance issue.

julia> @btime OffsetArray($a, -2:2, -2:2, -2:2, -2:2); # this PR
  225.305 ns (2 allocations: 128 bytes)

julia> @btime OffsetArray($a, -2:2, -2:2, -2:2, -2:2); # master
   16.556 ns (0 allocations: 0 bytes)

I'll need a few more days to investigate it. (Probably this Sunday)

@jishnub
Copy link
Member

jishnub commented Sep 21, 2020

I realized that _uncolonindices does something similar to Base.to_indices. I wonder if we might be able to rewrite using this?

julia> inds1 = OffsetArrays._uncolonindices(zeros(3,4), (2:4, :))
(2:4, Base.OneTo(4))

julia> inds2 = to_indices(zeros(3,4), (2:4, :))
(2:4, Base.Slice(Base.OneTo(4)))

julia> OffsetArray(zeros(3,4), inds1)
3×4 OffsetArray(::Array{Float64,2}, 2:4, 1:4) with eltype Float64 with indices 2:4×1:4:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

julia> OffsetArray(zeros(3,4), inds2)
3×4 OffsetArray(::Array{Float64,2}, 2:4, 1:4) with eltype Float64 with indices 2:4×1:4:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

The advantage is that this will support any type that extends Base.to_indices, not just colons. For example Ellipsis

Edit: This behavior has apparently changed since v1.0.

On 1.0:

julia> to_indices(ones(5), (CartesianIndices((-2:2,)),))
(CartesianIndex{1}[CartesianIndex(-2,), CartesianIndex(-1,), CartesianIndex(0,), CartesianIndex(1,), CartesianIndex(2,)],)

On 1.5

julia> to_indices(ones(5), (CartesianIndices((-2:2,)),))
(-2:2,)

@johnnychen94
Copy link
Member Author

Probably, if you can work on it I could rebase it on top of your changes; otherwise, I'll check it when I get some time.

@jishnub
Copy link
Member

jishnub commented Sep 21, 2020

This seems a bit more complicated than I had anticipated. Will look into this after this PR is merged.

indsN = _uncolonindices(A, _expandCartesianIndices(inds))
$FT(A, indsN)
end
@eval $FT(A::AbstractArray{T}, inds...) where {T, N} = $FT(A, inds)
Copy link
Member Author

@johnnychen94 johnnychen94 Sep 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By reverting b660d86 the performance becomes normal again:

-     @eval $FT(A::AbstractArray{T}, inds...) where {T, N} = $FT(A, inds)
+     @eval $FT(A::AbstractArray{T}, inds::Vararg{OffsetAxis,N}) where {T, N} = $FT(A, inds)
julia> @btime OffsetArray($a, -2:2, -2:2, -2:2, -2:2);
  18.002 ns (0 allocations: 0 bytes) # revert b660d86c # the same as mater
julia> @btime OffsetArray($a, -2:2, -2:2, -2:2, -2:2);
  184.321 ns (2 allocations: 128 bytes) # with b660d86c

Yet I don't know what's the reason here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it's because of the extra type-parameter N that is not being used

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😮 that's a large overhead!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnnychen94
Copy link
Member Author

johnnychen94 commented Sep 21, 2020

The benchmark looks good now:

using BenchmarkTools, OffsetArrays

a = zeros(5, 5);
@btime OffsetArray{Float64, 2}(undef, -2:2, -2:2);
@btime OffsetArray{Float64, 2}(undef, CartesianIndex(-2, -2):CartesianIndex(2, 2));
@btime OffsetArray($a, -3, -3);
@btime OffsetArray($a, -2:2, :);
@btime OffsetArray($a, CartesianIndex(-2, -2):CartesianIndex(2, 2));

# This PR
# 37.469 ns (1 allocation: 288 bytes)
# 36.476 ns (1 allocation: 288 bytes)
# 1.696 ns (0 allocations: 0 bytes)
# 4.308 ns (0 allocations: 0 bytes)
# 13.447 ns (0 allocations: 0 bytes)

# master
# 31.802 ns (1 allocation: 288 bytes)
# 40.283 ns (1 allocation: 288 bytes)
# 1.421 ns (0 allocations: 0 bytes)
# 4.313 ns (0 allocations: 0 bytes)
# 13.080 ns (0 allocations: 0 bytes)

This PR also captures some other oversights:

OffsetArray{Float64, 2}(undef, -2:2, -2:2, -2:2)
# StackOverflow in master
# DimensionMismatch in this PR

# OffsetMatrix(rand(4, 3), (-1:2, 0:2)...) works
# OffsetArray(rand(4, 3), (-1:2, 0:2)) works
OffsetMatrix(rand(4, 3), (-1:2, 0:2)) 
# MethodError in master
# works in this PR

@johnnychen94
Copy link
Member Author

johnnychen94 commented Sep 21, 2020

Most of the previous test cases are rewritten more compactly. I've tested it locally that the previous test also passes.

I think this one is ready. In case there are further comments, I'll leave it one day or two before I merge this. After that we could continue #157 and #147

Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noted a couple of things that might need to change but this looks basically mergeable. Thanks for rationalizing all this!

function OffsetArray{T, N}(init::ArrayInitializer, inds::NTuple{NT, Union{OffsetAxisKnownLength, CartesianIndices}}) where {T, N, NT}
# NT is probably not the actual dimension of the array; CartesianIndices might contain multiple dimensions
indsN = _expandCartesianIndices(inds)
length(indsN) == N || throw(DimensionMismatch("The number of offsets $(length(indsN)) should equal ndims(A) = $N"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference,

julia> Array{Float32,3}(undef, (2, 3))
ERROR: MethodError: no method matching Array{Float32, 3}(::UndefInitializer, ::Tuple{Int64, Int64})
Closest candidates are:
  Array{T, N}(::UndefInitializer, ::Tuple{Vararg{Int64, N}}) where {T, N} at boot.jl:445
  Array{T, N}(::UndefInitializer, ::Tuple{Vararg{Integer, N}}) where {T, N} at baseext.jl:25
  Array{T, 3}(::UndefInitializer, ::Int64, ::Int64, ::Int64) where T at boot.jl:437
  ...
Stacktrace:
 [1] top-level scope
   @ REPL[1]:1

Personally I like the DimensionMismatch better. If this were to get fixed in Julia I'd suggest deferring to that error, but since it's not good now and won't be fixable on earlier releases, this seem like the right choice.

_indexlength(i::Colon) = Colon()

_offset(axparent::AbstractUnitRange, ax::AbstractUnitRange) = first(ax) - first(axparent)
_offset(axparent::AbstractUnitRange, ax::CartesianIndices) = _offset(axparent, first(ax.indices))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the justification here, and the fact that it's flagged by codecov is interesting.

Copy link
Member Author

@johnnychen94 johnnychen94 Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that _indexlength doesn't support CartesianIndices, this single method seems not possible to be called from anywhere.

I still leave it here for potential future usage.

test/runtests.jl Outdated Show resolved Hide resolved
@johnnychen94 johnnychen94 merged commit a78ee17 into master Sep 23, 2020
@johnnychen94 johnnychen94 deleted the jc/constructor branch September 23, 2020 03:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants