-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrong result with @avx and vector transpose #117
Comments
Ah, thanks. It isn't respecting the transpose and thinks it is still a 1d vector. |
Locally, I now have: julia> using LoopVectorization
[ Info: Precompiling LoopVectorization [bdcacae8-1622-11e9-2a5c-532679323890]
julia> function avx_problem(x::AbstractVector{T}, m) where T
N = length(x)
n = N-m+1
ent = zeros(T, N)
x = x'
@avx for i = 1:N
for j = 1:size(x,1)
ent[i] += x[j,i]*log(x[j,i])
end
end
ent
end
avx_problem (generic function with 1 method)
julia> function no_avx_problem(x::AbstractVector{T}, m) where T
N = length(x)
n = N-m+1
ent = zeros(T, N)
x = x'
for i = 1:N
for j = 1:size(x,1)
ent[i] += x[j,i]*log(x[j,i])
end
end
ent
end
no_avx_problem (generic function with 1 method)
julia> x = rand(12);
julia> e1 = avx_problem(x, 4);
julia> e2 = no_avx_problem(x, 4);
julia> e1'
1×12 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
-0.331849 -0.367094 -0.359068 -0.244346 -0.0461393 -0.323758 -0.297252 -0.329479 -0.309549 -0.367815 -0.307403 -0.275631
julia> e2'
1×12 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
-0.331849 -0.367094 -0.359068 -0.244346 -0.0461393 -0.323758 -0.297252 -0.329479 -0.309549 -0.367815 -0.307403 -0.275631 I'll try and get this branch merged and make new releases within the next few hours. The tricky bit was that transposed vectors have unit stride in two dimensions: julia> e2'[1], e2'[2], e2'[3]
(-0.3318487689615286, -0.3670936384601229, -0.35906795855837714)
julia> e2'[1,1], e2'[1,2], e2'[1,3]
(-0.3318487689615286, -0.3670936384601229, -0.35906795855837714) I originally only supported the former, treating it like a 1-d vector, ignoring higher indexes. I still need to add proper support for this in LoopVectorization. The cost modeling isn't aware of two axes potentially being contiguous. |
Now fixed on master and tested. |
I think
@avx
gets confused by the following code, and produces the wrong result.The text was updated successfully, but these errors were encountered: