Autograd.Sparse type causes regression #114
Comments
Ekin: sparse gradients are a big boost to some models, e.g. ones that use word embeddings with large vocabularies. Without sparse gradients, the gradient would have to be the same size as the whole embedding matrix even though you only want to update a few columns. However, I cannot support all possible array operations for this Sparse type without significant effort. I supported the ones I use internally for update! etc.; we can add others as needed (your + seems to be an easy one; we have to decide whether you want the result to be sparse or dense). In the meantime you can simply use full(grad(x,y)) to get a regular array.
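For illustration, a rough sketch of the embedding case described above (the names emb and ids are made up for this example, and whether grad actually returns a Sparse here depends on the AutoGrad version):
using AutoGrad
emb = Param(randn(Float32, 50, 10000))    # 50-dim embeddings, vocabulary of 10000
ids = [3, 17, 42]                         # only a few columns are touched in this step
J = @diff sum(emb[:, ids])
g = grad(J, emb)                          # may come back as an AutoGrad.Sparse
gdense = g isa AutoGrad.Sparse ? full(g) : g   # densify to a regular 50x10000 array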
Let's say your GPU cannot handle a task with batchsize = 32, but you want to simulate the same training. One way to accomplish this is to use batchsize = 8 and average the gradients over 4 iterations. This is where I got the error. I hope this helps.
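For context, a minimal sketch of that accumulation pattern, assuming hypothetical model, loss, and minibatches names (the gradient is densified with full, per the suggestion in the previous comment, so the sums always work):
# Simulate batchsize = 32 by averaging the gradients of 4 minibatches of size 8.
function accumulate_and_update!(model, minibatches, opt)
    acc, n = nothing, 0
    for (x, y) in minibatches                     # e.g. 4 minibatches of size 8
        J = @diff loss(model, x, y)
        g = grad(J, model.w)
        g = g isa AutoGrad.Sparse ? full(g) : g   # densify so acc + g always works
        acc = acc === nothing ? g : acc + g
        n += 1
    end
    update!(value(model.w), acc ./ n, opt)        # one update with the averaged gradient
end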
From: denizyuret, Wednesday, October 2, 2019 (Re: [denizyuret/AutoGrad.jl] Autograd.Sparse type causes regression #114):
I understand adding gradients to parameters, but why would you add two gradients together?
On Tue, Oct 1, 2019 at 9:17 PM Ekin Akyürek wrote:
Earlier, I could accumulate my gradients across iterations. However, recent changes in AutoGrad break this, because I can't sum two gradients now. There can be other issues with this type which I didn't test. In general, I believe one should get a gradient that is capable of everything the corresponding parameter type can do.
julia> function foo(w)
return w[1][1]+w[2][1]
end
foo (generic function with 1 method)
julia> w = [param(3,3),param(3,3)]
2-element Array{Param{KnetArray{Float32,2}},1}:
P(KnetArray{Float32,2}(3,3))
P(KnetArray{Float32,2}(3,3))
julia> J = @diff foo(w)
T(-0.32367945)
julia> grad(J,w[1])
Sparse(KnetArray{Float32,2}(3,3)())
julia> grad(J,w[1]) + grad(J,w[2])
ERROR: MethodError: +(::AutoGrad.Sparse{Float32,2}, ::AutoGrad.Sparse{Float32,2}) is ambiguous. Candidates:
+(a::AbstractArray, s::AutoGrad.Sparse) in AutoGrad at /home/gridsan/eakyurek/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:73
+(s::AutoGrad.Sparse, a::AbstractArray) in AutoGrad at /home/gridsan/eakyurek/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:74
Possible fix, define
+(::AutoGrad.Sparse, ::AutoGrad.Sparse)
Stacktrace:
[1] top-level scope at REPL[28]:1
julia> grad(J,w[1]) + grad(J,w[1])
ERROR: MethodError: +(::AutoGrad.Sparse{Float32,2}, ::AutoGrad.Sparse{Float32,2}) is ambiguous. Candidates:
+(a::AbstractArray, s::AutoGrad.Sparse) in AutoGrad at /home/gridsan/eakyurek/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:73
+(s::AutoGrad.Sparse, a::AbstractArray) in AutoGrad at /home/gridsan/eakyurek/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:74
Possible fix, define
+(::AutoGrad.Sparse, ::AutoGrad.Sparse)
Stacktrace:
[1] top-level scope at REPL[29]:1
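For what it's worth, a hypothetical workaround for the missing method the error message suggests, simply densifying both operands with full (a sketch only, not necessarily how AutoGrad resolves it; whether the result should stay sparse or become dense is the open question from the first comment):
import Base: +
+(a::AutoGrad.Sparse, b::AutoGrad.Sparse) = full(a) + full(b)   # densify both, add as regular arrays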
yeah, full works for me! Though, the problematic thing about this interface is that you don't know what your gradient type will be in advance.
I will make + work as well.
I realize that it has also broken Knet: when you have the Adam optimizer with gclip and you get a Sparse gradient, gclip fails.
I can't replicate it; the following works fine. Please provide a minimal example.
The dy/sparsebugs branch has implemented + for two Sparse values, please test.
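One way to try that branch before it is merged (standard Pkg usage, not quoted from the thread):
using Pkg
Pkg.add(PackageSpec(name="AutoGrad", rev="dy/sparsebugs"))   # check out the branch
Pkg.test("AutoGrad")                                         # run its test suite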
Although I didn't run your example, I believe you didn't get the error because your gradients don't exceed the gclip value. Here is a simpler example you can replicate without downloading anything.
julia> using Knet
julia> function foo(w)
s = 0.0
for i=1:length(w); s+=w[i]; end
return s
end
foo (generic function with 1 method)
julia> w = Param(randn(2,2))
2×2 Param{Array{Float64,2}}:
0.427868 0.657678
-0.332868 -1.50003
julia> J = @diff foo(w)
T(-0.7473544438700652)
julia> update!(value(w), grad(J,w), Adam(gclip=0.1))
ERROR: MethodError: lmul!(::Float64, ::AutoGrad.Sparse{Float64,2}) is ambiguous. Candidates:
lmul!(a, x::AutoGrad.Sparse{T,N}) where {T, N} in AutoGrad at /kuacc/users/eakyurek13/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:51
lmul!(s::Number, X::AbstractArray) in LinearAlgebra at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/LinearAlgebra/src/generic.jl:100
Possible fix, define
lmul!(::Number, ::AutoGrad.Sparse{T,N})
Stacktrace:
[1] gclip!(::AutoGrad.Sparse{Float64,2}, ::Float64) at /kuacc/users/eakyurek13/.julia/packages/Knet/IIjk8/src/update.jl:613
[2] update!(::Array{Float64,2}, ::AutoGrad.Sparse{Float64,2}, ::Adam) at /kuacc/users/eakyurek13/.julia/packages/Knet/IIjk8/src/update.jl:537
[3] top-level scope at REPL[6]:1
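For reference, a simplified sketch of why gclip ends up in lmul! (hypothetical code, not Knet's actual gclip! from the stacktrace): gradient clipping rescales the gradient in place when its norm exceeds the threshold, so the gradient type must support an unambiguous lmul!.
using LinearAlgebra                       # norm and lmul!
function gclip_sketch!(g, gclip)
    gnorm = norm(g)
    if gnorm > gclip
        lmul!(gclip / gnorm, g)           # the call that is ambiguous for AutoGrad.Sparse
    end
    return g
end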
You are right, it was an ambiguity issue. I will create a PR now.
Fixed in current master.
Earlier, I could accumulate my gradients across iterations. However, recent changes in AutoGrad break this, because I can't sum two gradient arrays when they are AutoGrad.Sparse. There may be other issues with this type which I haven't tested yet. In general, I believe one should get a gradient that is capable of everything the corresponding parameter type can do.