Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Use of Random API #196

Closed
kimikage opened this issue Jul 8, 2020 · 10 comments · Fixed by #206
Closed

[RFC] Use of Random API #196

kimikage opened this issue Jul 8, 2020 · 10 comments · Fixed by #206
Labels
breaking Major breaking change

Comments

@kimikage
Copy link
Collaborator

kimikage commented Jul 8, 2020

Currently, the random number array of the FixedPoint is generated by reinterpreting the random number array of its rawtype.
The generation is very fast, but the use of ReinterpretArray brings poor performance. (cf. #125)

Julia v1.0 and later, the sampler-based Random API is provided. By using the Random API, we can generate random numbers more flexibly. The disadvantage is that the generation of random arrays is much slower.

Current:

julia> @btime rand(N0f8, 1000, 1000); # ReinterpretArray
  139.100 μs (3 allocations: 976.73 KiB)

julia> @btime collect(rand(N0f8, 1000, 1000));
  2.479 ms (5 allocations: 1.91 MiB)

julia> @btime replace!(_->rand(N0f8), Array{N0f8}(undef, 1000, 1000));
  2.744 ms (2 allocations: 976.70 KiB)

Using the newer Random API:

julia> @btime rand(rng, N0f8, 1000, 1000) setup=(rng=Random.MersenneTwister(1234)); # we can specify the RNG
  1.231 ms (2 allocations: 976.70 KiB)

Do you have any thoughts?

@kimikage kimikage added the breaking Major breaking change label Jul 8, 2020
@kimikage
Copy link
Collaborator Author

kimikage commented Jul 8, 2020

Interestingly, the following is not so slow.

julia> @btime reinterpret.(N0f8, rand(UInt8, 1000, 1000));
  219.100 μs (4 allocations: 1.91 MiB)

@kimikage
Copy link
Collaborator Author

kimikage commented Jul 9, 2020

Of course, the following unsafe method is possible, but I don't want to use it.

function Random.rand!(r::Random.MersenneTwister, A::Array{X}, ::Random.SamplerType{X}) where {X <: FixedPoint}
    sp = Random.SamplerType{rawtype(X)}()
    GC.@preserve A Random.rand!(r, Random.UnsafeView{rawtype(X)}(pointer(A), length(A)), sp)
    A
end

(cf. https://github.com/JuliaLang/julia/blob/0e74b3e872f6133704c4aed51fa99a6959ad17b1/stdlib/Random/src/RNGs.jl#L588-L605)

@johnnychen94
Copy link
Collaborator

johnnychen94 commented Jul 9, 2020

Regardless of the performance tradeoff which I absolutely have no idea what's happening under the hood:

👍 for adapting the new API; actually I don't think we should hesitate to keep API updated to wrt Base -- the only reason we didn't do so IMO is that we don't have enough people around.

👍 👍 if this made it possible to use https://github.com/JuliaStats/Distributions.jl

@kimikage
Copy link
Collaborator Author

The Distributions.jl also directly overloads rand and rand!, so the change in FixedPointNumbers should not affect it.

FYI, random FixedPoint numbers according to a specific distribution can be obtained via (implicit) convert.

julia> d = truncated(Normal(0.5, 0.1), typemin(N0f8), typemax(N0f8))
Truncated(Normal{Float64}=0.5, σ=0.1), range=(0.0, 1.0))

julia> rand!(d, Array{N0f8}(undef, 10, 10))
10×10 Array{N0f8,2} with eltype Normed{UInt8,8}:
 0.541  0.576  0.533  0.604  0.376  0.341  0.439  0.518  0.616  0.561
 0.545  0.698  0.38   0.722  0.388  0.624  0.51   0.443  0.549  0.545
 0.369  0.553  0.396  0.604  0.416  0.408  0.639  0.49   0.576  0.494
 0.486  0.373  0.435  0.478  0.4    0.282  0.482  0.471  0.643  0.471
 0.427  0.541  0.643  0.486  0.451  0.573  0.494  0.557  0.533  0.643
 0.608  0.682  0.392  0.424  0.471  0.502  0.522  0.6    0.569  0.376
 0.529  0.647  0.459  0.471  0.612  0.282  0.361  0.502  0.42   0.639
 0.51   0.545  0.565  0.42   0.576  0.467  0.482  0.773  0.553  0.459
 0.459  0.541  0.38   0.439  0.467  0.671  0.549  0.486  0.525  0.529
 0.557  0.416  0.455  0.325  0.486  0.514  0.604  0.471  0.42   0.404

Currently Distributions.jl depends on Float64, and types like N0f8 are not suitable for intermediate calculations. Therefore, the conversion-on-write is the best method for now.

@timholy
Copy link
Member

timholy commented Jul 11, 2020

Really we just need to fix the performance of ReinterpretArrays.

To prevent usage from being slow, though, perhaps it would be better to create as an Array{X,N} where X<:FixedPoint, reinterpret that to the rawtype, call rand! on that, and return the array. This shields the user from receiving a ReinterpretArray.

@kimikage
Copy link
Collaborator Author

kimikage commented Jul 11, 2020

create as an Array{X,N} where X<:FixedPoint, reinterpret that to the rawtype, call rand! on that, and return the array.

Unfortunately, that doesn't work well. Only when the collection is the Array (not AbstractArray), the fast generation is available. (cf. #196 (comment))

@timholy
Copy link
Member

timholy commented Jul 12, 2020

Base.unsafe_wrap should let you circumvent it. Irritating, since the fast path for random number generation is going to use pointers anyway.

@kimikage
Copy link
Collaborator Author

In conclusion, do you think FixedPointNumbers should provide such a fast random number generation?

Given that the problem with ReinterpretArray has been left untouched for a long time, I think that few users really need the fast random number generation.

@timholy
Copy link
Member

timholy commented Jul 13, 2020

There are few people with the knowledge to fix the ReinterpretArray performance, if indeed it can be fixed.

But as you point out, the lack of complaints may indicate that it is not a huge deal.

@kimikage
Copy link
Collaborator Author

Even if the random number generation itself is slow, I think the total performance will not be degraded in practical use.

Performance improvements without API changes are always welcome. Fortunately, the new ColorTypes.jl doesn't have to depend on FixedPointNumbers.jl v0.9 for now (I'm going to add the compatibility measures for the printing FPNs into ColorTypes.jl), so we still have time.

I think the decision we need to make now is whether to stop returning the ReinterpretArray (more precisely, leave the return type to the Random module implementation).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Major breaking change
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants