Error dealing with Bools #60

Closed
MasonProtter opened this issue Feb 25, 2020 · 1 comment

Comments

@MasonProtter
Contributor

MasonProtter commented Feb 25, 2020

I can't get the following code to work on the latest version of LoopVectorization

julia> using LoopVectorization

julia> function foo(x::Vector, y::Vector)
           out = similar(x)
           @avx for i ∈ eachindex(x)
               out[i] = (x[i]*x[i] + y[i]*y[i]) < 1
           end
           out
       end
foo (generic function with 1 method)

julia> foo(rand(10), rand(10))
ERROR: MethodError: no method matching vstore!(::Ptr{Float64}, ::UInt8, ::VectorizationBase._MM{4})
Closest candidates are:
  vstore!(::Ptr{T}, ::Number, ::Integer) where T<:Number at /Users/mason/.julia/packages/VectorizationBase/WulOg/src/vectorizable.jl:202
  vstore!(::Ptr{T}, ::Tuple{Vararg{VecElement{T},N}}, ::I, ::Val{Aligned}, ::Val{Nontemporal}) where {N, T, I, Aligned, Nontemporal} at /Users/mason/.julia/packages/SIMDPirates/r9awO/src/memory.jl:349
  vstore!(::Ptr{T}, ::Union{Tuple{Vararg{VecElement{T},W}}, VectorizationBase.AbstractStructVec{W,T}}, ::VectorizationBase._MM{W}) where {W, T} at /Users/mason/.julia/packages/SIMDPirates/r9awO/src/memory.jl:521
  ...
Stacktrace:
 [1] vstore! at /Users/mason/.julia/packages/VectorizationBase/WulOg/src/vectorizable.jl:200 [inlined]
 [2] foo(::Array{Float64,1}, ::Array{Float64,1}) at ./REPL[5]:3
 [3] top-level scope at REPL[6]:1

julia> versioninfo()
Julia Version 1.4.0-rc1.0
Commit b0c33b0cf5 (2020-01-23 17:23 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.2.0)
  CPU: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 2
@chriselrod
Member

chriselrod commented Feb 26, 2020

Thanks for the issue. After commit 51a75c9:

julia> using LoopVectorization

julia> function fooavx(x::Vector, y::Vector)
                  out = similar(x)
                  @avx for i ∈ eachindex(x)
                      out[i] = (x[i]*x[i] + y[i]*y[i]) < 1
                  end
                  out
              end
fooavx (generic function with 1 method)

julia> function foo(x::Vector, y::Vector)
                  out = similar(x)
                  for i ∈ eachindex(x)
                      out[i] = (x[i]*x[i] + y[i]*y[i]) < 1
                  end
                  out
              end
foo (generic function with 1 method)

julia> x, y = rand(10), rand(10); foo(x, y) == fooavx(x, y)
true

I switched from using unsigned integers to represent bit-masks to using a specialized mask type.
In this particular case I could have figured out the correct width, since it is given by the _MM{4} in vstore!(::Ptr{Float64}, ::UInt8, ::VectorizationBase._MM{4}). That isn't possible for functions like all, where only the mask itself is available, so I'd been meaning to make this change anyway.
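
To illustrate the ambiguity, here is a minimal sketch in plain Julia. It does not use any LoopVectorization internals, and `lane_mask` and `all_lanes` are hypothetical helpers made up for this example. The same `UInt8` bit pattern answers "are all lanes true?" differently depending on the vector width, and the integer alone does not carry that width.

```julia
# Hypothetical helpers for illustration only; not part of any package.

# Bit pattern with the low W bits set (W <= 8 so it fits in a UInt8).
lane_mask(W::Int) = UInt8((1 << W) - 1)

# Treat the low W bits of `m` as W boolean lanes and check that all are set.
all_lanes(m::UInt8, W::Int) = (m & lane_mask(W)) == lane_mask(W)

m = 0x0f          # bits 0..3 set

all_lanes(m, 4)   # true:  every one of the 4 lanes is set
all_lanes(m, 8)   # false: lanes 5..8 are unset
```

With a bare unsigned integer, the caller has to supply `W` separately every time, which is exactly what a width-carrying mask type avoids.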

This also gives me the freedom to add specialized methods to it, so I can make ::Bool & ::Mask behave correctly.
The Mask{W} type acts like a vector of 1-bit integers of length W.
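
As a rough sketch of the idea (not the actual implementation in SIMDPirates or VectorizationBase), a toy mask type in plain Julia might look like the following. `ToyMask` and `lanes` are hypothetical names chosen for this example, and the toy is limited to widths of at most 8.

```julia
# Toy stand-in for illustration; NOT the real Mask type from the packages.
struct ToyMask{W}
    bits::UInt8   # lane i lives in bit i-1; this toy supports W <= 8
    # Keep only the low W bits so unused bits never affect comparisons.
    ToyMask{W}(bits::Integer) where {W} = new{W}(UInt8(bits) & UInt8((1 << W) - 1))
end

# The width is a type parameter, so reductions need no extra argument.
Base.all(m::ToyMask{W}) where {W} = m.bits == UInt8((1 << W) - 1)
Base.any(m::ToyMask) = m.bits != 0x00

# Mask & mask is lane-wise; Bool & mask applies the scalar to every lane.
Base.:&(a::ToyMask{W}, b::ToyMask{W}) where {W} = ToyMask{W}(a.bits & b.bits)
Base.:&(a::Bool, b::ToyMask{W}) where {W} = a ? b : ToyMask{W}(0x00)
Base.:&(a::ToyMask, b::Bool) = b & a

# Expand to a tuple of Bools for inspection.
lanes(m::ToyMask{W}) where {W} = ntuple(i -> ((m.bits >> (i - 1)) & 0x01) == 0x01, Val(W))

m = ToyMask{4}(0b1001)
lanes(m)                 # (true, false, false, true)
lanes(true & m)          # (true, false, false, true)
lanes(false & m)         # (false, false, false, false)
all(ToyMask{4}(0x0f))    # true
```

For comparison, with a bare integer mask `true & 0x09` promotes `true` to `0x01` and yields `0x01`, which silently drops every lane but the first. Dispatching on a dedicated type is what makes the `Bool` case expressible.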

julia> using SIMDPirates

julia> x = rand(8); y = rand(8);

julia> (x .> y)'
1×8 Adjoint{Bool,BitArray{1}}:
 1  0  0  1  0  0  0  1

julia> vload(stridedpointer(x), (_MM{8}(0),)) > vload(stridedpointer(y), (_MM{8}(0),))
Mask{8,Bool}<1, 0, 0, 1, 0, 0, 0, 1>

A version with this fix will be registered in a few minutes.

I've added your example to the testsuite. The linked file has a few examples using conditional statements.
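
For readers who want to check this locally, a test along these lines could be sketched as below. This is an assumption about what such a test might look like, not the test that was actually added. `unit_disk_loop` is a hypothetical name, and the loop is left as plain Julia so the snippet runs without LoopVectorization; to exercise the fix, the `for` loop would carry the `@avx` macro as in the example above.

```julia
using Test

# Plain-Julia version of the loop from the issue (hypothetical name).
# Put `@avx` in front of the `for` to test the vectorized code path.
function unit_disk_loop(x::Vector, y::Vector)
    out = similar(x)
    for i in eachindex(x)
        out[i] = (x[i] * x[i] + y[i] * y[i]) < 1
    end
    out
end

@testset "comparison result stored into a Float64 array" begin
    # Lengths that are not multiples of a typical vector width are included
    # so that a vectorized version would also hit its remainder handling.
    for n in (1, 3, 4, 7, 8, 10, 33)
        x, y = rand(n), rand(n)
        reference = Float64.((x .* x .+ y .* y) .< 1)
        @test unit_disk_loop(x, y) == reference
    end
end
```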

Please file more issues if/when you run into more problems.
