Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework replace and replace! #26206

Merged
merged 4 commits into from
Apr 13, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
121 changes: 85 additions & 36 deletions base/set.jl
Original file line number Diff line number Diff line change
Expand Up @@ -574,8 +574,14 @@ _copy_oftype(x::AbstractArray{T}, ::Type{T}) where {T} = copy(x)
_copy_oftype(x::AbstractDict{K,V}, ::Type{Pair{K,V}}) where {K,V} = copy(x)
_copy_oftype(x::AbstractSet{T}, ::Type{T}) where {T} = copy(x)

_similar_or_copy(x::Any) = similar(x)
_similar_or_copy(x::Any, ::Type{T}) where {T} = similar(x, T)
# Make a copy on construction since it is faster than inserting elements separately
_similar_or_copy(x::Union{AbstractDict,AbstractSet}) = copy(x)
_similar_or_copy(x::Union{AbstractDict,AbstractSet}, ::Type{T}) where {T} = _copy_oftype(x, T)

# to make replace/replace! work for a new container type Cont, only
# replace!(new::Callable, A::Cont; count::Integer=typemax(Int))
# replace!(new::Callable, res::Cont, A::Cont; count::Integer=typemax(Int))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method of replace! is a nice addition to the API, but you didn't document it... I understand that it is felt that this function has already a too large API, so we could leave it undocumented for now. But I'm uncomfortable making it a method of replace! then, as people will discover it and start to use it, so I'd rather have it be named _replace! to make it clear that it's an internal method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss this above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you should update the type and default value for count.

# has to be implemented

"""
Expand All @@ -600,16 +606,17 @@ julia> replace!(Set([1, 2, 3]), 1=>0)
Set([0, 2, 3])
```
"""
replace!(A, old_new::Pair...; count::Integer=typemax(Int)) = _replace!(A, count, old_new)
replace!(A, old_new::Pair...; count::Union{Integer,Nothing}=nothing) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not clear on why you need to introduce nothing here: it seems to make the code slightly more complex, and it's not part of the API (at least you didn't document it)... why not keep typemax(Int) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage of nothing is that it has a different type, which allows the compiler to optimize out instructions which are not needed (like the n < count check, which introduces a branch an prevents SIMD). Also, for replace, it allows subtracting singleton types when count=nothing while keeping the function type-stable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks. I'm a little sad to see this added complexity, but I guess losing type stabity in particular would be a deal breaker.

replace!(A, A, count, old_new)

function _replace!(A, count::Integer, old_new::Tuple{Vararg{Pair}})
function replace!(res, A, count::Union{Integer,Nothing}, old_new::Tuple{Vararg{Pair}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not part of the API, so (as in the the other comment) I prefer it to not be a method of replace! (example it also clobbers the output of mehods).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the problem is that it's not great to tell people to override _replace! to implement the replace/replace! methods. I actually wonder whether we should make this replace! method officially public, or at least document it under an # Implementation part in the docstring.

An alternative is to find a better name than _replace!.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be to make the method public. But the one above (replace!(res, A, count::Union{Integer,Nothing}, old_new::Tuple{Vararg{Pair}})) is really an internal method which doesn't have to be overriden, so _replace! is a better name for it I think.

@inline function new(x)
for o_n in old_new
isequal(first(o_n), x) && return last(o_n)
end
return x # no replace
end
replace!(new, A, count=count)
replace!(new, res, A, count)
end

"""
Expand All @@ -630,7 +637,7 @@ julia> replace!(isodd, A, 0, count=2)
1
```
"""
replace!(pred::Callable, A, new; count::Integer=typemax(Int)) =
replace!(pred::Callable, A, new; count::Union{Integer,Nothing}=nothing) =
replace!(x -> ifelse(pred(x), new, x), A, count=count)

"""
Expand Down Expand Up @@ -661,9 +668,14 @@ Set([6, 12])
```
"""
function replace!(new::Callable, A::Union{AbstractArray,AbstractDict,AbstractSet};
count::Integer=typemax(Int))
count < 0 && throw(DomainError(count, "`count` must not be negative"))
count != 0 && _replace!(new, A, min(count, typemax(Int)) % Int)
count::Union{Integer,Nothing}=nothing)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are using the "new API" to implement the functionality for certain types, so it's fine to use replace!, but then again I think we should keep calling _replace! the specific implementation methods, which have the signature _replace!(new, res, A, count), which really is an implementation detail and is not part of any API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with using _replace! is that custom types would have either to override _replace! (which is unexported and doesn't have a very appealing name), or to reimplement all replace and replace!methods directly. Here that implies duplicating some boilerplate (with some tricks, like checking count === nothing and min(count, typemax(Int)) % Int), which one can easily get wrong). In other places, it even implies duplicating the call to subtract_singletontype, which is an unexported function and which people would better not have to bother with at all.

So I think it's much better to recommend package authors to implement a single low-level method, be it replace! or another unexported method (ideally with a better name than _replace!).

if count === nothing
replace!(new, A, A, nothing)
elseif count < 0
throw(DomainError(count, "`count` must not be negative"))
elseif count != 0
replace!(new, A, A, min(count, typemax(Int)) % Int)
end
A
end

Expand All @@ -686,16 +698,33 @@ julia> replace([1, 2, 1, 3], 1=>0, 2=>4, count=2)
3
```
"""
function replace(A, old_new::Pair...; count::Integer=typemax(Int))
function replace(A, old_new::Pair...; count::Union{Integer,Nothing}=nothing)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of Nothing here for type stability is quite subtle; if you re-push to this branch for another change, may be worth a small comment to explain this.

V = promote_valuetype(old_new...)
T = promote_type(eltype(A), V)
_replace!(_copy_oftype(A, T), count, old_new)
if count isa Nothing
T = promote_type(subtract_singletontype(eltype(A), old_new...), V)
replace!(_similar_or_copy(A, T), A, nothing, old_new)
else
U = promote_type(eltype(A), V)
replace!(_similar_or_copy(A, U), A, min(count, typemax(Int)) % Int, old_new)
end
end

promote_valuetype(x::Pair{K, V}) where {K, V} = V
promote_valuetype(x::Pair{K, V}, y::Pair...) where {K, V} =
promote_type(V, promote_valuetype(y...))

# Subtract singleton types which are going to be replaced
@pure issingletontype(::Type{T}) where {T} = isdefined(T, :instance)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not Jameson so I can't comment on the use of @pure here! ;-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @pure is OK here, although the definition itself is wrong:

@pure issingletontype(T::DataType) = isdefined(T, :instance)
issingletontype(::Type) = false

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay, I used @pure correctly! Thanks for the pointer about DataType.

function subtract_singletontype(::Type{T}, x::Pair{K}) where {T, K}
if issingletontype(K) # singleton type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment may seem slightly redundant :)

Core.Compiler.typesubtract(T, K)
else
T
end
end
subtract_singletontype(::Type{T}, x::Pair{K}, y::Pair...) where {T, K} =
subtract_singletontype(subtract_singletontype(T, y...), x)

"""
replace(pred::Function, A, new; [count::Integer])

Expand All @@ -713,9 +742,10 @@ julia> replace(isodd, [1, 2, 3, 1], 0, count=2)
1
```
"""
function replace(pred::Callable, A, new; count::Integer=typemax(Int))
function replace(pred::Callable, A, new; count::Union{Integer,Nothing}=nothing)
T = promote_type(eltype(A), typeof(new))
replace!(pred, _copy_oftype(A, T), new, count=count)
replace!(x -> ifelse(pred(x), new, x), _similar_or_copy(A, T), A,
count === nothing ? nothing : min(count, typemax(Int)) % Int)
end

"""
Expand All @@ -742,7 +772,9 @@ Dict{Int64,Int64} with 2 entries:
1 => 3
```
"""
replace(new::Callable, A; count::Integer=typemax(Int)) = replace!(new, copy(A), count=count)
replace(new::Callable, A; count::Union{Integer,Nothing}=nothing) =
replace!(new, _similar_or_copy(A), A,
count === nothing ? nothing : min(count, typemax(Int)) % Int)

# Handle ambiguities
replace!(a::Callable, b::Pair; count::Integer=-1) = throw(MethodError(replace!, (a, b)))
Expand All @@ -757,36 +789,53 @@ replace(a::AbstractString, b::Pair, c::Pair) = throw(MethodError(replace, (a, b,
askey(k, ::AbstractDict) = k.first
askey(k, ::AbstractSet) = k

function _replace!(new::Callable, A::Union{AbstractDict,AbstractSet}, count::Int)
repl = Pair{eltype(A),eltype(A)}[]
function replace!(new::Callable, res::T, A::T,
count::Union{Int,Nothing}) where T<:Union{AbstractDict,AbstractSet}
c = 0
for x in A
y = new(x)
if x !== y
push!(repl, x => y)
c += 1
if res === A
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth a small comment mentioning the optimization that can't be done in general when res !== x.

repl = Pair{eltype(A),eltype(A)}[]
for x in A
y = new(x)
if x !== y
push!(repl, x => y)
c += 1
end
c == count && break
end
for oldnew in repl
pop!(res, askey(first(oldnew), res))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor comment: as res === A, working with only one of A or res may make it slightly easier to read.

end
for oldnew in repl
push!(res, last(oldnew))
end
else
for x in A
y = new(x)
if x !== y
pop!(res, askey(x, res))
push!(res, y)
c += 1
end
c == count && break
end
c == count && break
end
for oldnew in repl
pop!(A, askey(first(oldnew), A))
end
for oldnew in repl
push!(A, last(oldnew))
end
res
end

### AbstractArray
### replace! for AbstractArray

function _replace!(new::Callable, A::AbstractArray, count::Int)
function replace!(new::Callable, res::AbstractArray, A::AbstractArray,
count::Union{Int,Nothing})
c = 0
for i in eachindex(A)
@inbounds Ai = A[i]
y = new(Ai)
if Ai !== y
@inbounds A[i] = y
c += 1
if count === nothing || c < count
y = new(Ai)
@inbounds res[i] = y
c += (Ai !== y)
else
@inbounds res[i] = Ai
end
c == count && break
end
end
res
end
9 changes: 9 additions & 0 deletions test/sets.jl
Original file line number Diff line number Diff line change
Expand Up @@ -548,6 +548,15 @@ end
x = @inferred replace(x -> x > 1, [1, 2], missing)
@test isequal(x, [1, missing]) && x isa Vector{Union{Int, Missing}}

x = @inferred replace([1, missing], missing=>2)
@test x == [1, 2] && x isa Vector{Int}
x = @inferred replace([1, missing], missing=>2, count=1)
@test x == [1, 2] && x isa Vector{Union{Int, Missing}}
x = @inferred replace([1, missing], missing=>missing)
@test isequal(x, [1, missing]) && x isa Vector{Union{Int, Missing}}
x = @inferred replace([1, missing], missing=>2, 1=>missing)
@test isequal(x, [missing, 2]) && x isa Vector{Union{Int, Missing}}

# test that isequal is used
@test replace([NaN, 1.0], NaN=>0.0) == [0.0, 1.0]
@test replace([1, missing], missing=>0) == [1, 0]
Expand Down