
Ability to turn on and off scalar indexing for sparse matrices? #189

Closed
ChrisRackauckas opened this issue Jul 17, 2022 · 25 comments · Fixed by #200


@ChrisRackauckas

A[i,j] = ... is very slow if A isa SparseMatrixCSC. In the wild, there's another case like this with CUDA.jl: that package lets you set CUDA.allowscalar(false) so that any scalar A[i,j] operation (getindex or setindex!) throws an error. This is helpful for finding unoptimized fallbacks which would otherwise "work" given the AbstractArray interface but would be detrimental to performance. Without this, finding out which CUDA codes are hitting unwanted fallbacks can only be done by profiling.

It would be nice to have similar functionality for SparseMatrixCSC. I am not sure I like the idea of making it global, but if it's not global you need rules for how to carry it forward through broadcast etc., which might be hard to do in general, so the global is the easy option.

@rayegun
Member

rayegun commented Jul 17, 2022

Is the idea to use this only for testing? I can't imagine it would be a good idea to do this in a package like SciML, disabling it for everyone else, or to disable it in scripts as with CUDA.

@ChrisRackauckas
Author

Yes, I'd only use it for testing. The main thing would be to ensure we don't hit any fallbacks that loop scalar-wise.

@ViralBShah
Member

If you are using it with CUDA, you probably want to disable it, right? Packages can opt in to do that, I suppose.

@ChrisRackauckas
Author

Yes, with CUDA it's generally the case that you say "if using CUDA, then put CUDA.allowscalar(false) in your script, because that will catch common performance pitfalls and turn them into errors". The same is true for sparse matrices: if you're doing A[i,j] operations all over the place you'll get bad performance, but there isn't a similar way with SparseMatrixCSC to turn performance pitfalls into errors.

Of course it can be used incorrectly; the CUDA.jl one could be misused as well if a package sets it "for the user" and suddenly non-performance-sensitive parts of the code become overly cautious. But I think over the years people have found CUDA.allowscalar(false) useful enough for testing and scripting safety that it would be nice to have a similar feature for sparse matrices.

@rayegun
Member

rayegun commented Jul 17, 2022

I definitely prefer the StaticSparseMatrixCSC <: AbstractSparseMatrixCSC route. But it is a little more onerous to implement.
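The proposed type could be sketched roughly like this. This is a minimal, hypothetical illustration, not an actual SparseArrays type; a real version would subtype AbstractSparseMatrixCSC and forward the structural accessors (rowvals, nonzeros, getcolptr) to the parent:

```julia
using SparseArrays

# Hypothetical sketch: a wrapper that forbids scalar indexing.
# A real implementation would subtype AbstractSparseMatrixCSC
# and forward rowvals/nonzeros/getcolptr to `parent`.
struct StaticSparseMatrixCSC{Tv,Ti<:Integer} <: AbstractMatrix{Tv}
    parent::SparseMatrixCSC{Tv,Ti}
end

Base.size(A::StaticSparseMatrixCSC) = size(A.parent)
Base.getindex(::StaticSparseMatrixCSC, ::Integer, ::Integer) =
    error("scalar getindex is disabled for StaticSparseMatrixCSC")
Base.setindex!(::StaticSparseMatrixCSC, _, ::Integer, ::Integer) =
    error("scalar setindex! is disabled for StaticSparseMatrixCSC")
```

Any code path that falls back to scalar getindex or setindex! on the wrapper then errors out immediately instead of silently running slow.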

@ChrisRackauckas
Author

What would you do with A::StaticSparseMatrixCSC .+ B::SparseMatrixCSC?

@rayegun
Member

rayegun commented Jul 17, 2022

Would that not just be the union of the patterns?

@ChrisRackauckas
Author

Promote to static?

@rayegun
Member

rayegun commented Jul 17, 2022

That seems reasonable. Especially if this is mostly for testing.

@wnoise

wnoise commented Jul 18, 2022

I would have assumed non-static for mixed things like that -- much like a diagonal matrix plus a general matrix gives a general matrix.

@SobhanMP
Member

SobhanMP commented Jul 18, 2022

Would it make sense to just disable it, i.e. by redefining Base.setindex!(::SparseMatrixCSC, ::Any, ::Integer, ::Integer) = error()? Especially since it's only for testing. Adding any sort of runtime check seems not ideal, especially for applications that use scalar indexing a lot.

@ViralBShah
Member

I suppose that would work, but it should be done through an API that can disable and re-enable.
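Such a disable/re-enable API could look something like the following, modeled on CUDA.allowscalar. The names `allowscalar`, `ALLOW_SCALAR`, and `checked_setindex!` are illustrative only, not actual SparseArrays API:

```julia
# Hypothetical sketch of a disable/re-enable toggle, modeled on
# CUDA.allowscalar. Names here are illustrative, not SparseArrays API.
const ALLOW_SCALAR = Ref(true)

allowscalar(flag::Bool) = (ALLOW_SCALAR[] = flag; nothing)

# What a guarded scalar setindex! could look like: error when the
# toggle is off, otherwise fall through to the normal assignment.
function checked_setindex!(A::AbstractMatrix, v, i::Integer, j::Integer)
    ALLOW_SCALAR[] ||
        error("scalar setindex! is disabled; operate on the sparse structure directly")
    A[i, j] = v
end
```

A test suite would then call allowscalar(false) at the top and let any accidental scalar access surface as an error.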

@ChrisRackauckas
Author

ChrisRackauckas commented Jul 18, 2022

Especially since it's only for testing. Adding any sort of runtime check seems not ideal, especially for applications that use scalar indexing a lot.

It already has to assume indirect access with scalar indexing, so it already has a few branches that block SIMD and other branches that will fail branch prediction. This branch would be the most predictable (almost always true), so it would clearly be the lowest-overhead one of the bunch. What would it cost, maybe a 2% performance difference in a code path already deemed too slow to use in cases where performance is required? Code that cares about performance already avoids scalar indexing and generates code against A.nzval directly for this reason. I don't see how runtime performance could be the holdup.

using SparseArrays, BenchmarkTools
const glob = Ref(true)
A = sprand(1000,1000,0.01)

function f(A,i1,i2,i4,i5,x)
    A[i1,i2] = x
    nothing
end

julia> @btime f(A,500,800,300,450,0.1)
  29.920 ns (0 allocations: 0 bytes)

julia> @btime f(A,500,800,300,450,0.1)
  29.087 ns (0 allocations: 0 bytes)

function f(A,i1,i2,i4,i5,x)
    if glob[]
        A[i1,i2] = x
    end
    nothing
end

julia> @btime f(A,500,800,300,450,0.1)
  30.191 ns (0 allocations: 0 bytes)

julia> @btime f(A,500,800,300,450,0.1)
  30.221 ns (0 allocations: 0 bytes)

@SobhanMP
Member

SobhanMP commented Jul 18, 2022

That's going to make the three-argument dot of Symmetric{SparseMatrixCSC} a whopping 2% slower.

The real question is: what would stop you from using a custom AbstractSparseMatrixCSC subtype that doesn't have scalar indexing? A lot of code has too-narrow type definitions and uses "illegal" accessors, i.e. x.nzval and such. Seems like two birds with one stone to me.

From a design perspective, having global state seems less than ideal.

@ChrisRackauckas
Author

ChrisRackauckas commented Jul 18, 2022

A lot of code has too-narrow type definitions and uses "illegal" accessors, i.e. x.nzval and such. Seems like two birds with one stone to me.

Of course it wouldn't touch nzval access, because that's supposed to be fast. You want to find out which codes are accidentally doing A[i,j] instead of direct nzval accesses.

That's going to make the three-argument dot of Symmetric{SparseMatrixCSC} a whopping 2% slower.

If that's falling back to A[i,j], then this just identified a performance bug that should get fixed. The dot, mul!, etc. implementations don't use A[i,j] for a good reason, and if they do, we should have a way of knowing they are doing so accidentally so we can make them like 10x faster! The purpose of this is to make it easy to turn those accidental fallback cases into errors so they are easier to identify.

The real question is: what would stop you from using a custom AbstractSparseMatrixCSC subtype that doesn't have scalar indexing?

You'd have to create an entire mirror type and would have no good way of guaranteeing that it hits exactly the same dispatches as the original path. If someone writes a SparseMatrixCSC overload that fixes a bad A[i,j] path, a separate type would not hit it, so it would be a spurious failure. I would prefer to make it easy for people to write good, correct sparse code than to protect tiny bits of performance in already slow code.

@SobhanMP
Member

SobhanMP commented Jul 18, 2022

Of course it wouldn't touch nzval access, because that's supposed to be fast. You want to find out which codes are accidentally doing A[i,j] instead of direct nzval accesses.

You're missing the point: if code didn't have narrow types or do illegal accesses, it would be pretty easy to cook up an AbstractSparseMatrixCSC variant that does what you want, and it would be free of global state.

@ChrisRackauckas
Author

It would be pretty easy to cook up an AbstractSparseMatrixCSC variant that does what you want, and it would be free of global state.

No. Counterexample: define a dot overload only on SparseMatrixCSC that avoids A[i,j] indexing. Now your test for whether a slow path is hit gives the wrong answer, since it only exercises AbstractSparseMatrixCSC dispatches, unless you do some fancy AbstractInterpreter thing to automatically expand all SparseMatrixCSC dispatches or something.
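The dispatch gap can be illustrated with a toy overload (the function name `slowpath` is hypothetical):

```julia
using SparseArrays

# An overload defined only for the concrete SparseMatrixCSC type is
# never hit by a separate mirror type: the mirror dispatches to the
# generic fallback and would falsely flag a slow path that the real
# type actually avoids.
slowpath(A::AbstractMatrix) = :generic_fallback
slowpath(A::SparseMatrixCSC) = :fast_sparse_path

A = sprand(4, 4, 0.5)
slowpath(A)           # the concrete-type overload is selected
slowpath(Matrix(A))   # any other type falls back to the generic method
```

This is why testing through a separate AbstractSparseMatrixCSC subtype cannot guarantee it exercises the same methods as the real SparseMatrixCSC.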

@SobhanMP
Member

I see, it's unavoidable if you use structures, since you'll need concrete types, but can it still be generic? It still comes back to the fact that having functions that explicitly take SparseMatrixCSC is a bad idea (and fairly common).

@SobhanMP
Member

SobhanMP commented Aug 7, 2022

@ChrisRackauckas opinions? Should it also block indexing like A[2, :]?

@ChrisRackauckas
Author

I would think those are fine. In the GPU case, those are allowed and allowscalar still catches the overwhelming majority of bad fallbacks.

@SobhanMP
Member

SobhanMP commented Aug 7, 2022

Do note that A[2, :] is equivalent to n scalar indexings in this package.

@ChrisRackauckas
Author

Oh, then that's bad 😅. I guess that would error at the fallback.

@ViralBShah
Member

Well, it would do what you want: disable the use of scalar indexing. You should not try to extract a row out of a CSC matrix.

@ChrisRackauckas
Author

Agreed.

@ChrisRackauckas
Author

🎉
