What is scaledual? #130

Closed
kimikage opened this issue Oct 23, 2019 · 2 comments · Fixed by #132

@kimikage
Collaborator

kimikage commented Oct 23, 2019

I am working on improving the accuracy of the conversions from Normed to Float (#129), and I am interested in scaledual, which seems to be related to those conversions.

scaledual was introduced in d1087f7.

The introduction of scaledual is a bit speculative, but it can essentially double the speed of certain operations. It has the following property:

bd, ad = scaledual(b, a)
b*a == bd*ad

but the RHS might be faster (particularly for floating-point b and an array a of fixed-point numbers).

Originally posted by @timholy in #2 (comment)
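To make the quoted property concrete, here is a minimal sketch in plain Julia. It assumes only that an 8-bit normalized value with raw byte `i` represents `i/255`; `normed_value` and `scaledual_sketch` are hypothetical names for illustration and are not the FixedPointNumbers implementation. The idea is to fold the `1/255` scale into the multiplier once, so that each element needs only one multiply. The two sides agree up to floating-point rounding, not bit for bit.

```julia
# Hypothetical helpers for illustration; not the FixedPointNumbers API.
# A raw byte `i` stands for the 8-bit normalized value i/255.
normed_value(i::UInt8) = i / 255f0   # UInt8 / Float32 gives Float32

# Fold the 1/255 scale into the multiplier once and hand back the raw bytes.
scaledual_sketch(b::Float32, raw::AbstractArray{UInt8}) = (b / 255f0, raw)

raw = UInt8[0x00, 0x33, 0x80, 0xff]
b = 0.5f0

lhs = b .* normed_value.(raw)        # convert each element, then multiply
bd, ad = scaledual_sketch(b, raw)
rhs = bd .* ad                       # multiply raw bytes by the pre-scaled factor

# Equal up to rounding error.
@assert all(isapprox.(lhs, rhs; rtol = 1f-6))
```

Whether the right-hand side is actually faster depends on the element types and on how well the compiler vectorizes each form, so it is worth benchmarking before relying on it.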

However, I think scaledual in the current codebase does not have the property above, as its own test shows: the test compares bd*eld against b times the raw value a[1], not b times the N0f8 value af8[1] (a[1] != af8[1]).

a = rand(UInt8, 10)
rfloat = similar(a, Float32)
rfixed = similar(rfloat)
af8 = reinterpret(N0f8, a)
b = 0.5
bd, eld = scaledual(b, af8[1])
@assert b*a[1] == bd*eld
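The size of the mismatch can be seen with plain numbers. This is only an illustrative sketch: it uses no FixedPointNumbers types and assumes that an N0f8 with raw byte `i` represents `i/255`; the variable names are made up for the example.

```julia
# Plain-number illustration; no FixedPointNumbers types are used here.
raw = 0x80            # a raw byte, playing the role of `a[1]`
val = raw / 255       # the value an N0f8 with that raw byte represents, like `af8[1]`
b = 0.5

@assert b * raw == 64.0                  # what the test above compares against
@assert isapprox(b * val, 64.0 / 255)    # what b*a == bd*ad would require for a = af8[1]
@assert b * raw != b * val               # the two differ by the factor 255
```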

I will do my best on #129, but some slowdown is inevitable. If scaledual can serve as a workaround for people who prefer speed over accuracy, I would be relieved.

@timholy, am I misunderstanding something?

@timholy
Member

timholy commented Oct 31, 2019

Dang, you're right. Obviously scaledual has not gotten much/any use. (I haven't used it myself even though I should sometimes.)

I will submit a fix, but it will be a breaking change so we'll need to do a minor-version-number bump when we release it.

@kimikage
Collaborator Author

kimikage commented Nov 1, 2019

Apart from its appearance 😅, the new scaledual (#132) performs on par with the current floating-point conversion! 🎉

Benchmark

function Vec3{T}(v::Vec3{U}, _) where {T, U}
    x::T = *(scaledual(T, v.x)...)
    y::T = *(scaledual(T, v.y)...)
    z::T = *(scaledual(T, v.z)...)
    Vec3{T}(x, y, z)
end
function Vec4{T}(v::Vec4{U}, _) where {T, U}
    x::T = *(scaledual(T, v.x)...)
    y::T = *(scaledual(T, v.y)...)
    z::T = *(scaledual(T, v.z)...)
    w::T = *(scaledual(T, v.w)...)
    Vec4{T}(x, y, z, w)
end

julia> vec3_n0f8 = rand(Vec3{N0f8}, 64, 64);

julia> @btime Vec3{Float32}.(mat)    setup=(mat=vec3_n0f8);
  2.763 μs (2 allocations: 48.08 KiB)

julia> @btime Vec3{Float32}.(mat, 0) setup=(mat=vec3_n0f8); # scaledual
  2.788 μs (2 allocations: 48.08 KiB)

julia> @btime Vec3{Float64}.(mat)    setup=(mat=vec3_n0f8);
  4.600 μs (2 allocations: 96.08 KiB)

julia> @btime Vec3{Float64}.(mat, 0) setup=(mat=vec3_n0f8); # scaledual
  4.620 μs (2 allocations: 96.08 KiB)

julia> vec4_n0f8 = rand(Vec4{N0f8}, 64, 64);

julia> @btime Vec4{Float32}.(mat)    setup=(mat=vec4_n0f8);
  2.657 μs (2 allocations: 64.08 KiB)

julia> @btime Vec4{Float32}.(mat, 0) setup=(mat=vec4_n0f8); # scaledual
  2.857 μs (2 allocations: 64.08 KiB)

julia> @btime Vec4{Float64}.(mat)    setup=(mat=vec4_n0f8);
  4.200 μs (2 allocations: 128.08 KiB)

julia> @btime Vec4{Float64}.(mat, 0) setup=(mat=vec4_n0f8); # scaledual
  4.166 μs (2 allocations: 128.08 KiB)

Of course, scaledual also works well with user-defined types that are subtypes of AbstractArray, where the appearance is much less of a problem, if at all.
