LSTM example fails #197

Open
xiuliren opened this issue Jan 30, 2017 · 18 comments

xiuliren commented Jan 30, 2017

FullyConnected does not work in the char-lstm example:

WARNING: symbol is deprecated, use Symbol instead.
 in depwarn(::String, ::Symbol) at ./deprecated.jl:64
 in symbol(::Symbol, ::Vararg{Any,N}) at ./deprecated.jl:30
 in #LSTM#2(::Int64, ::Symbol, ::Bool, ::Function, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at /usr/people/jingpeng/.julia/v0.5/MXNet/examples/char-lstm/lstm.jl:74
 in (::#kw##LSTM)(::Array{Any,1}, ::#LSTM, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at ./<missing>:0
 in include_from_node1(::String) at ./loading.jl:488
 in process_options(::Base.JLOptions) at ./client.jl:262
 in _start() at ./client.jl:318
while loading /usr/people/jingpeng/.julia/v0.5/MXNet/examples/char-lstm/train.jl, in expression starting on line 11
ERROR: LoadError: AssertionError: FullyConnected only accepts SymbolicNode either as positional or keyword arguments, not both.
 in #FullyConnected#4042(::Array{Any,1}, ::Function, ::Type{MXNet.mx.SymbolicNode}, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at /usr/people/jingpeng/.julia/v0.5/MXNet/src/symbolic-node.jl:654
 in (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, ::Type{MXNet.mx.SymbolicNode}, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at ./<missing>:0
 in #FullyConnected#4046(::Array{Any,1}, ::Function, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at /usr/people/jingpeng/.julia/v0.5/MXNet/src/symbolic-node.jl:696
 in (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, ::MXNet.mx.SymbolicNode) at ./<missing>:0
 in #LSTM#2(::Int64, ::Symbol, ::Bool, ::Function, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at /usr/people/jingpeng/.julia/v0.5/MXNet/examples/char-lstm/lstm.jl:74
 in (::#kw##LSTM)(::Array{Any,1}, ::#LSTM, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at ./<missing>:0
 in include_from_node1(::String) at ./loading.jl:488
 in process_options(::Base.JLOptions) at ./client.jl:262
 in _start() at ./client.jl:318
while loading /usr/people/jingpeng/.julia/v0.5/MXNet/examples/char-lstm/train.jl, in expression starting on line 11

The assertion that fails is on this line:
https://github.com/dmlc/MXNet.jl/blob/master/src/symbolic-node.jl#L654

If I comment that assertion out, the symbolic graph is constructed, but it cannot be executed:

ERROR: LoadError: AssertionError: Duplicated names in arguments: Symbol[:ptb_data_1,:ptb_embed_1_weight,:ptb_lstm_1_i2h_weight,:ptb_lstm_1_i2h_bias,:ptb_l1_init_h,:ptb_lstm_1_h2h_weight,:ptb_lstm_1_h2h_bias,:ptb_l1_init_c,:ptb_lstm_1_i2h_weight,:ptb_lstm_1_i2h_bias,:ptb_l2_init_h,:ptb_lstm_1_h2h_weight,:ptb_lstm_1_h2h_bias,:ptb_l2_init_c,:ptb_pred_1_weight,:ptb_pred_1_bias,:ptb_label_1,:ptb_data_2,:ptb_embed_2_weight,:ptb_lstm_2_i2h_weight,:ptb_lstm_2_i2h_bias,:

I am using the master branch with updated submodules.

julia> versioninfo()

Julia Version 0.5.0
Commit 3c9d753 (2016-09-19 18:14 UTC)
Platform Info:
  System: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, haswell)

@TravisA9

Same problem here!

@pluskid
Member

pluskid commented Feb 1, 2017

Could you try changing every call in lstm.jl of the form mx.FullyConnected(data, ... to mx.FullyConnected(data=data, ... and see if it works?

@TravisA9

TravisA9 commented Feb 1, 2017

Thank you for your reply Pluskid. I tried the changes you recommended and got the following error:

ERROR: LoadError: MethodError: MXNet.mx.#FullyConnected(::Array{Any,1}, ::MXNet.mx.#FullyConnected) is ambiguous. Candidates:
  (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, args::MXNet.mx.SymbolicNode...)
  (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, args::MXNet.mx.NDArray...)
 in #LSTM#2(::Int64, ::Symbol, ::Bool, ::Function, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at .../.julia/v0.5/MXNet/examples/char-lstm/lstm.jl:73
 in (::#kw##LSTM)(::Array{Any,1}, ::#LSTM, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at ./<missing>:0
 in include_from_node1(::String) at ./loading.jl:488
while loading .../.julia/v0.5/MXNet/examples/char-lstm/train.jl, in expression starting on line 11

On another note, there is a deprecation warning:

WARNING: deprecated syntax "[a=>b for (a,b) in c]".
Use "Dict(a=>b for (a,b) in c)" instead.

I haven't yet found where it's occurring but I will look into it.
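
For reference, the warning is about the old dictionary-comprehension syntax. Below is a minimal sketch of the replacement, written for current Julia (1.x) rather than 0.5; the variable names are made up for illustration and are not taken from MXNet.jl.

```julia
# Hypothetical input: a collection of (key, value) tuples.
pairs_in = [(:a, 1), (:b, 2), (:c, 3)]

# Deprecated form from the warning. In current Julia this expression
# no longer builds a Dict; it builds a Vector of Pair objects.
as_vector = [k => v for (k, v) in pairs_in]

# Recommended replacement: pass a generator to the Dict constructor.
as_dict = Dict(k => v for (k, v) in pairs_in)
```

Searching the package sources for a bracketed comprehension containing `=>` is probably the quickest way to locate the offending line.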

@Arkoniak
Contributor

Arkoniak commented Feb 2, 2017

Well, it looks like the problem can be fixed by changing FullyConnected(data, .... to FullyConnected(mx.SymbolicNode, data=data, ...

I can make a PR with the fixes, but this solution looks somewhat inconvenient. I think it would be better to change _define_atomic_symbol_creator accordingly.

P.S.: there is also a bug in optimizer.jl: clip(grad, -opts.grad_clip, opts.grad_clip) should be changed to clip(grad, a_min=-opts.grad_clip, a_max=opts.grad_clip)

@TravisA9

TravisA9 commented Feb 2, 2017

Thanks Arkoniak, that definitely got me further than before. Now I'm getting another error:

LoadError: UnicodeError: invalid character index
 in schedule_and_wait(::Task, ::Void) at ./event.jl:110
 in consume(::Task) at ./task.jl:269
 in done at ./task.jl:274 [inlined]
 in #fit#6636(::Array{Any,1}, ::Function, ::MXNet.mx.FeedForward, ::MXNet.mx.ADAM, ::CharSeqProvider) at .../.julia/v0.5/MXNet/src/model.jl:464
 in (::MXNet.mx.#kw##fit)(::Array{Any,1}, ::MXNet.mx.#fit, ::MXNet.mx.FeedForward, ::MXNet.mx.ADAM, ::CharSeqProvider) at ./<missing>:0
...
.../v0.5/MXNet/examples/char-lstm/train.jl, in expression starting on line 39

EDIT: It looks like I might have introduced a typo somewhere because I reinstalled MXNet and the char-LSTM example appears to be working fine now. Thank you! Now I can get back to figuring out the original problem of modifying the example to do things like translation.

@pluskid
Member

pluskid commented Feb 3, 2017

@Arkoniak The ambiguity comes from the fact that when all arguments are passed as keyword arguments, the positional signature is empty, and the method dispatcher does not know which method (symbolic or NDArray) to call.

@TravisA9 It seems to be due to an encoding / decoding error when reading your text file. Are you using non-ASCII text for testing? If you are not using UTF-8 encoding, you may need to check the Julia documentation on how to properly decode a text file.
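
The dispatch ambiguity described above can be reproduced without MXNet. The sketch below uses two stand-in types (FakeSymbolic and FakeNDArray are hypothetical, not the real MXNet.jl types) and a hypothetical function named layer. It is written for current Julia (1.x) and only illustrates the mechanism, not how MXNet.jl defines its operators.

```julia
# Stand-in types; these are NOT the real MXNet.jl types.
struct FakeSymbolic end
struct FakeNDArray end

# Two varargs methods that differ only in the type of the positional arguments.
layer(args::FakeSymbolic...; kwargs...) = :symbolic
layer(args::FakeNDArray...; kwargs...) = :ndarray

# With one positional argument, exactly one method matches.
positional = layer(FakeSymbolic(); num_hidden = 10)

# With keyword arguments only, the positional signature is empty,
# so both methods match and Julia raises an ambiguity MethodError.
ambiguous = try
    layer(data = FakeSymbolic(), num_hidden = 10)
    false
catch err
    err isa MethodError
end

# Passing the type itself as a leading positional argument gives
# dispatch something to select on again.
layer(::Type{FakeSymbolic}; kwargs...) = :symbolic
layer(::Type{FakeNDArray}; kwargs...) = :ndarray
tagged = layer(FakeSymbolic; data = FakeSymbolic(), num_hidden = 10)
```

The last call has the same shape as the FullyConnected(mx.SymbolicNode, data=data, ... workaround suggested earlier in the thread.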

@Arkoniak
Contributor

Arkoniak commented Feb 3, 2017

@pluskid Just an idea: would it be wrong to inspect the arguments of the function and, if all of them are SymbolicNode, call the corresponding SymbolicNode method, and if all of them are NDArray, call the NDArray method? Something like

# get_number_of_* are placeholder helpers that count the keyword values of each type
function somefunction(; kwargs...)
  num_symbolic = get_number_of_symbolic_node_args(kwargs)
  num_ndarray = get_number_of_ndarray_args(kwargs)
  if (num_symbolic > 0 && num_ndarray > 0) || (num_symbolic == 0 && num_ndarray == 0)
    # mixed or missing node arguments: cannot decide which method to call
    error("Ambiguous arguments")
  elseif num_symbolic > 0
    somefunction(SymbolicNode; kwargs...)
  else
    somefunction(NDArray; kwargs...)
  end
end

It feels somewhat hacky, but it could work, I presume.
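
A self-contained version of this idea might look like the sketch below. It assumes current Julia (1.x); SymStub, ArrStub and fc_stub are hypothetical stand-ins rather than MXNet.jl names, and the counting is done inline instead of through the placeholder helpers.

```julia
# Stand-in types; these are NOT the real MXNet.jl types.
struct SymStub
    name::String
end
struct ArrStub
    data::Vector{Float64}
end

# Type-tagged methods, one per backend.
fc_stub(::Type{SymStub}; kwargs...) = :symbolic
fc_stub(::Type{ArrStub}; kwargs...) = :ndarray

# Keyword-only entry point: inspect the keyword values and forward
# to the matching type-tagged method.
function fc_stub(; kwargs...)
    vals = [v for (_, v) in kwargs]
    n_sym = count(v -> v isa SymStub, vals)
    n_arr = count(v -> v isa ArrStub, vals)
    if n_sym > 0 && n_arr == 0
        return fc_stub(SymStub; kwargs...)
    elseif n_arr > 0 && n_sym == 0
        return fc_stub(ArrStub; kwargs...)
    else
        # mixed or missing node arguments: refuse to guess
        throw(ArgumentError("cannot tell a symbolic call from an NDArray call"))
    end
end

sym_result = fc_stub(data = SymStub("data"), num_hidden = 10)
arr_result = fc_stub(data = ArrStub([1.0, 2.0]), num_hidden = 10)
```

Because the keyword-only method has no positional arguments at all, it does not collide with the type-tagged methods. The cost is a runtime scan of the keyword values on every call.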

@pluskid
Member

pluskid commented Feb 5, 2017

@Arkoniak Yes, I agree this could be an option. We will need to define such a wrapper for all the operators.

@TravisA9

TravisA9 commented Feb 8, 2017

Thank you @pluskid, you might well be right:

@TravisA9 It seems to be due to some encoding / decoding error reading your text file. Are you using non-ASCII text for testing? Maybe you need to check Julia document on how to properly decode a text file if you are not using UTF-8 encoding?

However, I suspect there may be more to this. As I mentioned above, I downloaded MXNet again and reinstalled it, and it worked fine. This led me to believe that I had somehow mistakenly introduced an error somewhere. But I have run the unmodified example a few times, and though it works most of the time, there are times when it suddenly produces that error again. In those cases I have deleted the generated files (vocab.dat and input.txt) and it runs fine again. Not a big deal though!

There is a second issue I have run into. In keeping with the scenario mentioned above, I made a text (*.txt) file with English->Nahuátl text (just as a starting point), swapping out input.txt for my language.txt file. I expected it to work because it is in fact UTF-8 text, which I verified with different applications. Nahuátl text does have accent marks, but somehow I don't think that is the problem. I can't find any important differences between this text and the original input.txt, and nothing else has been changed from the char-lstm example.

Once again the error is: UnicodeError: invalid character index
Any suggestions?

EDIT: I tried running char-lstm with language.txt after removing all accents and it still throws the error.
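
One common way to hit an invalid character index in Julia, even with valid UTF-8 input, is to index a string by an integer that falls in the middle of a multi-byte character. Whether that is what happens in the char-lstm data provider is an assumption, not something confirmed in this thread. The sketch below is written for current Julia (1.x), where the error is reported as StringIndexError rather than UnicodeError.

```julia
# "Nahuátl" with a precomposed 'á' (U+00E1), which takes two bytes in UTF-8.
s = "Nahu\u00e1tl"

n_chars = length(s)   # number of characters
n_bytes = sizeof(s)   # number of bytes

# String indices are byte offsets. Byte 5 starts 'á'; byte 6 is its
# continuation byte, so s[6] would throw an index error.
valid_5 = isvalid(s, 5)
valid_6 = isvalid(s, 6)

# Safe alternatives: use only valid indices, or slice a Vector{Char}.
valid_indices = collect(eachindex(s))
chars = collect(s)
first_five = String(chars[1:5])
```

If the text is cut into fixed-length chunks with integer ranges, converting it to a vector of characters first, as in the last lines, avoids the problem.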

@vchuravy
Collaborator

vchuravy commented Feb 9, 2017

@TravisA9 Can you open a separate issue for the encoding problem and post a small example, the error log, and if possible a reduced test case?

@TravisA9

TravisA9 commented Feb 9, 2017

Ok, @vchuravy that's not a bad idea. I'll do that.

@psilva07

I have the same problem when trying to run the regression example.

Even after following @Arkoniak's suggestion to use FullyConnected(mx.SymbolicNode, data=data, the error only changed to

MethodError: no method matching FullyConnected(::MXNet.mx.SymbolicNode, ::Type{MXNet.mx.SymbolicNode}; num_hidden=500)
Closest candidates are:
  FullyConnected(::MXNet.mx.SymbolicNode...; kwargs...) at /Users/Pedro/.julia/v0.5/MXNet/src/symbolic-node.jl:696

Any help is appreciated.

@Arkoniak
Contributor

Can you share a gist of your code? Since there is no num_hidden=500 in the original code, I suppose you've made some alterations, and maybe that is why the code is not working. I've tested it locally and there are two bugs in the original code.

This line https://github.com/dmlc/MXNet.jl/blob/master/examples/regression-example.jl#L33 should be changed to either

net  = @mx.chain    mx.FullyConnected(mx.SymbolicNode, data, num_hidden=10) =>

or

net  = @mx.chain    mx.Variable(:data) =>
                                mx.FullyConnected(num_hidden=10) =>

Secondly https://github.com/dmlc/MXNet.jl/blob/master/examples/regression-example.jl#L40 should be altered to

cost = mx.LinearRegressionOutput(mx.SymbolicNode, data = net, label=label)

With these changes the code works fine. I'll send a PR soon.

@psilva07
Copy link

Right on! I was running the tutorial side-by-side with another dataset and they both got the same error. I just pasted the error from my run, rather than the tutorial. Nevertheless, your suggestion solves it. I also realized that the "mx.SymbolicNode" addition is only needed for the first layer. Thank you very much!

@aaronc8

aaronc8 commented Mar 11, 2017

I still have the issue with the LSTM example. Changing FullyConnected(data, .... to FullyConnected(mx.SymbolicNode,data=data, .... still produces the same error as the original: that FullyConnected only accepts SymbolicNode either as positional or keyword arguments, not both.

When I change it to FullyConnected(data=data, .... the error does change to the ambiguity error, but the proposed fix doesn't help.

while loading C:\Users\xviol_000\Documents\Julia\225B Project\train.jl, in expression starting on line 12
 in #FullyConnected#3931(::Array{Any,1}, ::Function, ::Type{MXNet.mx.SymbolicNode}, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at symbolic-node.jl:654
 in (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, ::Type{MXNet.mx.SymbolicNode}, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at <missing>:0
 in #FullyConnected#3935(::Array{Any,1}, ::Function, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at symbolic-node.jl:696
 in (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, ::MXNet.mx.SymbolicNode) at <missing>:0
 in #lstm_cell#65(::Int64, ::Int64, ::Symbol, ::Function, ::MXNet.mx.SymbolicNode, ::LSTMState, ::LSTMParam) at lstm.jl:30
 in (::#kw##lstm_cell)(::Array{Any,1}, ::#lstm_cell, ::MXNet.mx.SymbolicNode, ::LSTMState, ::LSTMParam) at <missing>:0
 in #LSTM#66(::Int64, ::Symbol, ::Bool, ::Function, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at lstm.jl:81
 in (::#kw##LSTM)(::Array{Any,1}, ::#LSTM, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at <missing>:0
 in include_string(::String, ::String) at loading.jl:441
 in include_string(::Module, ::String, ::String) at eval.jl:32
 in (::Atom.##59#62{String,String})() at eval.jl:81
 in withpath(::Atom.##59#62{String,String}, ::String) at utils.jl:30
 in withpath(::Function, ::String) at eval.jl:46
 in macro expansion at eval.jl:79 [inlined]
 in (::Atom.##58#61{Dict{String,Any}})() at task.jl:60

@pluskid
Member

pluskid commented Mar 12, 2017

I did some testing and realized that this issue appears in the latest release. @aaronc8 Maybe you can try Pkg.checkout("MXNet") to test whether the latest development version works. It has been working smoothly for me locally, but I do run into the issues described above with the latest MXNet.jl release.

@vchuravy Maybe we should consider making another bugfix release?

@aaronc8

aaronc8 commented Mar 12, 2017

Thanks for the reply - doing checkout seems to have fixed the original error, but now I'm getting

LoadError: MXNet.mx.MXError("[15:59:51] D:\\Program Files (x86)\\Jenkins\\workspace\\mxnet\\mxnet\\src\\storage\\storage.cc:78: Compile with USE_CUDA=1 to enable GPU usage")
while loading C:\Users\xviol_000\Documents\Julia\225B Project\train.jl, in expression starting on line 39
 in macro expansion at base.jl:59 [inlined]
 in _ndarray_alloc(::Tuple{Int64,Int64}, ::MXNet.mx.Context, ::Bool) at ndarray.jl:42
 in empty at ndarray.jl:152 [inlined]
 in zeros(::Tuple{Int64,Int64}, ::MXNet.mx.Context) at ndarray.jl:199
 in copy!(::Array{MXNet.mx.NDArray,1}, ::Base.Generator{Array{Tuple,1},MXNet.mx.##6387#6391{MXNet.mx.Context}}) at abstractarray.jl:477
 in _collect(::Type{MXNet.mx.NDArray}, ::Base.Generator{Array{Tuple,1},MXNet.mx.##6387#6391{MXNet.mx.Context}}, ::Base.HasShape) at array.jl:251
 in #simple_bind#6386(::Dict{Symbol,MXNet.mx.GRAD_REQ}, ::Array{Any,1}, ::Function, ::MXNet.mx.SymbolicNode, ::MXNet.mx.Context) at executor.jl:133
 in (::MXNet.mx.#kw##simple_bind)(::Array{Any,1}, ::MXNet.mx.#simple_bind, ::MXNet.mx.SymbolicNode, ::MXNet.mx.Context) at <missing>:0
 in #fit#6516(::Array{Any,1}, ::Function, ::MXNet.mx.FeedForward, ::MXNet.mx.ADAM, ::CharSeqProvider) at model.jl:396
 in (::MXNet.mx.#kw##fit)(::Array{Any,1}, ::MXNet.mx.#fit, ::MXNet.mx.FeedForward, ::MXNet.mx.ADAM, ::CharSeqProvider) at <missing>:0
 in include_string(::String, ::String) at loading.jl:441
 in include_string(::Module, ::String, ::String) at eval.jl:32
 in (::Atom.##59#62{String,String})() at eval.jl:81
 in withpath(::Atom.##59#62{String,String}, ::String) at utils.jl:30
 in withpath(::Function, ::String) at eval.jl:46
 in macro expansion at eval.jl:79 [inlined]
 in (::Atom.##58#61{Dict{String,Any}})() at task.jl:60

So, in an attempt to compile with USE_CUDA=1, I changed lines 39-41 to

mx.fit(model, optimizer, data_tr, eval_data=data_val, n_epoch=N_EPOCH, USE_CUDA=1, initializer=mx.UniformInitializer(0.1), callbacks=[mx.speedometer(), mx.do_checkpoint(CKPOINT_PREFIX)], eval_metric=NLL())

and then it gives me

LoadError: MethodError: no method matching MXNet.mx.TrainingOptions(; eval_data = CharSeqProvider("\n\nGREMIO:\nGood morrow, neighbour Baptista.\n\nBAPTISTA:\nGood morrow, neighbour Gremio.\nGod save you, gentlemen!\n\nPETRUCHIO .....

where the whole text data is included in the error and at the end of the error it says

MXNet.mx.TrainingOptions(!Matched::MXNet.mx.AbstractInitializer, !Matched::Int64, !Matched::Union{MXNet.mx.AbstractDataProvider,Void}, !Matched::MXNet.mx.AbstractEvalMetric, !Matched::Union{MXNet.mx.KVStore,Symbol}, !Matched::Bool, !Matched::Array{MXNet.mx.AbstractCallback,1}, !Matched::Int64) at C:\Users\xviol_000\.julia\v0.5\MXNet\src\base.jl:271 got unsupported keyword arguments "eval_data", "n_epoch", "USE_CUDA", "initializer", "callbacks", "eval_metric"
MXNet.mx.TrainingOptions(!Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any) at C:\Users\xviol_000\.julia\v0.5\MXNet\src\base.jl:271 got unsupported keyword arguments "eval_data", "n_epoch", "USE_CUDA", "initializer", "callbacks", "eval_metric"

which I think means that I am not setting USE_CUDA=1 correctly at all...

Sorry, I'm not the most code savvy :(

If it's any help - I do have an nvidia card but if it's simpler I'd rather just avoid incorporating CUDA and just use the CPU to just get the example to work.
