Autograd.Sparse type causes regression #114

Closed
ekinakyurek opened this issue Oct 1, 2019 · 10 comments

ekinakyurek commented Oct 1, 2019

Earlier, I could accumulate my gradients across iterations. However, recent changes in AutoGrad break this, because I can't sum two gradient arrays when they are AutoGrad.Sparse. There may be other issues with this type that I haven't tested yet. In general, I believe the gradient you get should be capable of everything the corresponding parameter type can do.

julia> function foo(w)
           return w[1][1]+w[2][1]
       end
foo (generic function with 1 method)

julia> w = [param(3,3),param(3,3)]
2-element Array{Param{KnetArray{Float32,2}},1}:
 P(KnetArray{Float32,2}(3,3))
 P(KnetArray{Float32,2}(3,3))

julia> J = @diff foo(w)
T(-0.32367945)

julia> grad(J,w[1])
Sparse(KnetArray{Float32,2}(3,3)())

julia> grad(J,w[1]) + grad(J,w[2])
ERROR: MethodError: +(::AutoGrad.Sparse{Float32,2}, ::AutoGrad.Sparse{Float32,2}) is ambiguous. Candidates:
  +(a::AbstractArray, s::AutoGrad.Sparse) in AutoGrad at /home/gridsan/eakyurek/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:73
  +(s::AutoGrad.Sparse, a::AbstractArray) in AutoGrad at /home/gridsan/eakyurek/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:74
Possible fix, define
  +(::AutoGrad.Sparse, ::AutoGrad.Sparse)
Stacktrace:
 [1] top-level scope at REPL[28]:1

julia> grad(J,w[1]) + grad(J,w[1])
ERROR: MethodError: +(::AutoGrad.Sparse{Float32,2}, ::AutoGrad.Sparse{Float32,2}) is ambiguous. Candidates:
  +(a::AbstractArray, s::AutoGrad.Sparse) in AutoGrad at /home/gridsan/eakyurek/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:73
  +(s::AutoGrad.Sparse, a::AbstractArray) in AutoGrad at /home/gridsan/eakyurek/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:74
Possible fix, define
  +(::AutoGrad.Sparse, ::AutoGrad.Sparse)
Stacktrace:
 [1] top-level scope at REPL[29]:1

ekinakyurek changed the title from "Autograd.Sparse type regression" to "Autograd.Sparse type causes regression" on Oct 1, 2019

denizyuret commented Oct 2, 2019 via email

ekinakyurek commented Oct 2, 2019 via email

@ekinakyurek

Yeah, full works for me! Though the problematic thing about this interface is that you don't know in advance what your gradient type will be.
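
For example, this is the kind of accumulation I mean (an untested sketch, assuming full densifies a Sparse gradient back to a regular array; foo, w, and J are from the first comment):

# Untested sketch: densify the Sparse gradients first, then accumulate.
g1 = full(grad(J, w[1]))
g2 = full(grad(J, w[2]))
acc = g1 + g2               # dense + dense, no Sparse + Sparse method error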

@denizyuret
Copy link
Owner

denizyuret commented Oct 4, 2019 via email

@ekinakyurek

I realized that this has also broken Knet: when you use the Adam optimizer with gclip and get a Sparse gradient, gclip fails.

@denizyuret

I can't replicate it; the following works fine. Please provide a minimal example.

using Knet

# Load data (mnistdata basically replicates mnist.ipynb)                                                                      
include(Knet.dir("data","mnist.jl"))
dtrn,dtst = mnistdata(xsize=(784,:),xtype=Array)

struct Foo; w; end

model = Foo(param(10,784))

# We turn Foo instances into callable objects for prediction:
(m::Foo)(x) = (I = (a->a[1]).(vec(argmax(x,dims=1))); m.w[:,I])

# model(x) gives predictions, let model(x,y) give the loss                                                                    
(m::Foo)(x, y) = nll(m(x), y)

@info "training..."
@time Knet.minimize!(model, dtst, Adam(lr=0.0001,gclip=0.1))

@denizyuret

The dy/sparsebugs branch has implemented + for two Sparse values; please test.
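
Something along these lines (reusing foo and w from the first comment) should now run without the MethodError on that branch:

# On dy/sparsebugs, adding two Sparse gradients should no longer be ambiguous.
J = @diff foo(w)
grad(J, w[1]) + grad(J, w[2])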

ekinakyurek commented Oct 13, 2019

Although I didn't run your example, I believe you didn't get the error because your gradients don't exceed the gclip value. Here is a simpler example you can replicate without downloading anything.

julia> using Knet

julia> function foo(w)
           s = 0.0
           for i=1:length(w); s+=w[i]; end
           return s
       end

foo (generic function with 1 method)

julia> w = Param(randn(2,2))
2×2 Param{Array{Float64,2}}:
  0.427868   0.657678
 -0.332868  -1.50003

julia> J = @diff foo(w)
T(-0.7473544438700652)

julia> update!(value(w), grad(J,w), Adam(gclip=0.1))
ERROR: MethodError: lmul!(::Float64, ::AutoGrad.Sparse{Float64,2}) is ambiguous. Candidates:
  lmul!(a, x::AutoGrad.Sparse{T,N}) where {T, N} in AutoGrad at /kuacc/users/eakyurek13/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:51
  lmul!(s::Number, X::AbstractArray) in LinearAlgebra at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/LinearAlgebra/src/generic.jl:100
Possible fix, define
  lmul!(::Number, ::AutoGrad.Sparse{T,N})
Stacktrace:
 [1] gclip!(::AutoGrad.Sparse{Float64,2}, ::Float64) at /kuacc/users/eakyurek13/.julia/packages/Knet/IIjk8/src/update.jl:613
 [2] update!(::Array{Float64,2}, ::AutoGrad.Sparse{Float64,2}, ::Adam) at /kuacc/users/eakyurek13/.julia/packages/Knet/IIjk8/src/update.jl:537
 [3] top-level scope at REPL[6]:1
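
Until this is fixed, a workaround sketch along the lines of the earlier full suggestion (assuming full densifies a Sparse gradient) should avoid the error:

# Untested sketch: densify the gradient before handing it to the optimizer.
g = grad(J, w)
g isa AutoGrad.Sparse && (g = full(g))
update!(value(w), g, Adam(gclip=0.1))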

@denizyuret

You are right, it was an ambiguity issue. I will create a PR now.
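
For anyone following along, the ambiguity is that neither candidate is more specific in both argument positions. A toy sketch with hypothetical names (not AutoGrad's code) reproduces the same kind of MethodError:

# Hypothetical stand-ins, for illustration only.
struct MySparse <: AbstractArray{Float64,2} end
scale!(a, x::MySparse) = x                 # specific in the 2nd argument, like AutoGrad's lmul!
scale!(a::Number, x::AbstractArray) = x    # specific in the 1st argument, like LinearAlgebra's lmul!
scale!(2.0, MySparse())                    # MethodError: ambiguous, as in the reports above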

denizyuret added a commit that referenced this issue Oct 13, 2019
denizyuret added a commit that referenced this issue Oct 17, 2019
@denizyuret

Fixed in current master.
