dmlc / MXNet.jl

MXNet Julia Package - flexible and efficient deep learning in Julia

LSTM example fails

jingpengw opened this issue · comments

The "FullyConnected" call does not work.

WARNING: symbol is deprecated, use Symbol instead.
 in depwarn(::String, ::Symbol) at ./deprecated.jl:64
 in symbol(::Symbol, ::Vararg{Any,N}) at ./deprecated.jl:30
 in #LSTM#2(::Int64, ::Symbol, ::Bool, ::Function, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at /usr/people/jingpeng/.julia/v0.5/MXNet/examples/char-lstm/lstm.jl:74
 in (::#kw##LSTM)(::Array{Any,1}, ::#LSTM, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at ./<missing>:0
 in include_from_node1(::String) at ./loading.jl:488
 in process_options(::Base.JLOptions) at ./client.jl:262
 in _start() at ./client.jl:318
while loading /usr/people/jingpeng/.julia/v0.5/MXNet/examples/char-lstm/train.jl, in expression starting on line 11
ERROR: LoadError: AssertionError: FullyConnected only accepts SymbolicNode either as positional or keyword arguments, not both.
 in #FullyConnected#4042(::Array{Any,1}, ::Function, ::Type{MXNet.mx.SymbolicNode}, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at /usr/people/jingpeng/.julia/v0.5/MXNet/src/symbolic-node.jl:654
 in (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, ::Type{MXNet.mx.SymbolicNode}, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at ./<missing>:0
 in #FullyConnected#4046(::Array{Any,1}, ::Function, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at /usr/people/jingpeng/.julia/v0.5/MXNet/src/symbolic-node.jl:696
 in (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, ::MXNet.mx.SymbolicNode) at ./<missing>:0
 in #LSTM#2(::Int64, ::Symbol, ::Bool, ::Function, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at /usr/people/jingpeng/.julia/v0.5/MXNet/examples/char-lstm/lstm.jl:74
 in (::#kw##LSTM)(::Array{Any,1}, ::#LSTM, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at ./<missing>:0
 in include_from_node1(::String) at ./loading.jl:488
 in process_options(::Base.JLOptions) at ./client.jl:262
 in _start() at ./client.jl:318
while loading /usr/people/jingpeng/.julia/v0.5/MXNet/examples/char-lstm/train.jl, in expression starting on line 11

The offending line of code:
https://github.com/dmlc/MXNet.jl/blob/master/src/symbolic-node.jl#L654

If I comment that assertion out, the symbolic graph is constructed, but it cannot be executed:

ERROR: LoadError: AssertionError: Duplicated names in arguments: Symbol[:ptb_data_1,:ptb_embed_1_weight,:ptb_lstm_1_i2h_weight,:ptb_lstm_1_i2h_bias,:ptb_l1_init_h,:ptb_lstm_1_h2h_weight,:ptb_lstm_1_h2h_bias,:ptb_l1_init_c,:ptb_lstm_1_i2h_weight,:ptb_lstm_1_i2h_bias,:ptb_l2_init_h,:ptb_lstm_1_h2h_weight,:ptb_lstm_1_h2h_bias,:ptb_l2_init_c,:ptb_pred_1_weight,:ptb_pred_1_bias,:ptb_label_1,:ptb_data_2,:ptb_embed_2_weight,:ptb_lstm_2_i2h_weight,:ptb_lstm_2_i2h_bias,:

I am using the master branch with updated submodules.

julia> versioninfo()

Julia Version 0.5.0
Commit 3c9d753 (2016-09-19 18:14 UTC)
Platform Info:
  System: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, haswell)

Same problem here!

Could you try changing all the calls in lstm.jl like mx.FullyConnected(data, ... to mx.FullyConnected(data=data, ... and see if that works?
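For concreteness, the suggested change would look roughly like this (a sketch only; the surrounding argument names loosely follow the char-lstm example and are not verified here):

```julia
# Before: first argument passed positionally, the rest as keywords
# (this mix is what triggers the assertion)
i2h = mx.FullyConnected(data, num_hidden=4num_hidden)

# After: everything passed as keyword arguments
i2h = mx.FullyConnected(data=data, num_hidden=4num_hidden)
```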

Thank you for your reply, @pluskid. I tried the changes you recommended and got the following error:

ERROR: LoadError: MethodError: MXNet.mx.#FullyConnected(::Array{Any,1}, ::MXNet.mx.#FullyConnected) is ambiguous. Candidates:
  (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, args::MXNet.mx.SymbolicNode...)
  (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, args::MXNet.mx.NDArray...)
 in #LSTM#2(::Int64, ::Symbol, ::Bool, ::Function, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at .../.julia/v0.5/MXNet/examples/char-lstm/lstm.jl:73
 in (::#kw##LSTM)(::Array{Any,1}, ::#LSTM, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at ./<missing>:0
 in include_from_node1(::String) at ./loading.jl:488
while loading .../.julia/v0.5/MXNet/examples/char-lstm/train.jl, in expression starting on line 11

On another note, there is a deprecation warning:

WARNING: deprecated syntax "[a=>b for (a,b) in c]".
Use "Dict(a=>b for (a,b) in c)" instead.

I haven't yet found where it's occurring but I will look into it.

Well, it looks like the problem can be fixed by changing FullyConnected(data, .... to FullyConnected(mx.SymbolicNode, data=data, ....

I can make a PR with these fixes, but the solution looks somewhat inconvenient. I think it would be better to change _define_atomic_symbol_creator accordingly.

P.S.: there is also a bug in optimizer.jl: clip(grad, -opts.grad_clip, opts.grad_clip) should be changed to clip(grad, a_min=-opts.grad_clip, a_max=opts.grad_clip).
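Taken together, the two workarounds described above would look roughly like this (a sketch; variable names follow the char-lstm example and are assumptions, not verified code):

```julia
# 1. Resolve the positional/keyword assertion by naming the node type
#    explicitly and passing everything else as keywords:
i2h = mx.FullyConnected(mx.SymbolicNode, data=data, num_hidden=4num_hidden)

# 2. In optimizer.jl, pass the clipping bounds as keyword arguments:
grad = clip(grad, a_min=-opts.grad_clip, a_max=opts.grad_clip)
```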

Thanks @Arkoniak, that definitely got me further than before. Now I'm getting another error:

LoadError: UnicodeError: invalid character index
 in schedule_and_wait(::Task, ::Void) at ./event.jl:110
 in consume(::Task) at ./task.jl:269
 in done at ./task.jl:274 [inlined]
 in #fit#6636(::Array{Any,1}, ::Function, ::MXNet.mx.FeedForward, ::MXNet.mx.ADAM, ::CharSeqProvider) at .../.julia/v0.5/MXNet/src/model.jl:464
 in (::MXNet.mx.#kw##fit)(::Array{Any,1}, ::MXNet.mx.#fit, ::MXNet.mx.FeedForward, ::MXNet.mx.ADAM, ::CharSeqProvider) at ./<missing>:0
...
.../v0.5/MXNet/examples/char-lstm/train.jl, in expression starting on line 39

EDIT: It looks like I might have introduced a typo somewhere because I reinstalled MXNet and the char-LSTM example appears to be working fine now. Thank you! Now I can get back to figuring out the original problem of modifying the example to do things like translation.

@Arkoniak The ambiguity comes from the fact that when all arguments are passed via keywords, the positional signature is empty, and the method dispatcher cannot tell which variant (SymbolicNode or NDArray) to call.
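The dispatch problem can be reproduced in plain Julia without MXNet at all; with zero positional arguments, two varargs methods are equally specific:

```julia
# Minimal illustration of the ambiguity: two varargs methods that
# differ only in their positional element type.
struct SymbolicNode end
struct NDArray end

f(args::SymbolicNode...; kwargs...) = :symbolic
f(args::NDArray...;      kwargs...) = :ndarray

f(SymbolicNode())       # one positional arg: dispatches to :symbolic
f(data=SymbolicNode())  # zero positional args: both methods match,
                        # so Julia raises an ambiguity MethodError
```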

@TravisA9 It seems to be due to some encoding/decoding error when reading your text file. Are you using non-ASCII text for testing? You may need to check the Julia documentation on how to properly decode a text file if it is not UTF-8 encoded.
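One quick way to rule encoding in or out (a sketch in current Julia syntax; the thread itself is on Julia 0.5, where the API differs, and the file name is just the example's input file):

```julia
# Read the raw text and check that every character decoded as valid
# Unicode; invalid UTF-8 bytes surface as malformed Chars.
raw = read("input.txt", String)
if !all(isvalid, raw)
    error("input.txt contains invalid UTF-8; re-encode it before training")
end
```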

@pluskid Just an idea: would it be wrong to inspect the function's arguments and, if all of them are SymbolicNodes, call the corresponding SymbolicNode method, and if all of them are NDArrays, call the NDArray method? Something like

function somefunction(; kwargs...)
  num_symbolic = get_number_of_symbolic_node_args(kwargs)
  num_ndarray  = get_number_of_ndarray_args(kwargs)
  if (num_symbolic > 0 && num_ndarray > 0) || (num_symbolic == 0 && num_ndarray == 0)
    error("Ambiguous arguments")
  elseif num_symbolic > 0
    somefunction(SymbolicNode; kwargs...)
  else
    somefunction(NDArray; kwargs...)
  end
end

It feels somewhat hacky, but it could work, I presume.
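The helper functions the wrapper assumes could be sketched like this (names are hypothetical, invented for the sketch above; in Julia 0.5, kwargs arrives as a vector of (name, value) pairs):

```julia
# Count keyword-argument values that are instances of type T.
count_args_of(T, kwargs) = count(kv -> isa(kv[2], T), kwargs)

get_number_of_symbolic_node_args(kwargs) = count_args_of(mx.SymbolicNode, kwargs)
get_number_of_ndarray_args(kwargs)       = count_args_of(mx.NDArray, kwargs)
```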

@Arkoniak Yes, I agree this could be an option. We would need to define such a wrapper for all the operators.

Thank you @pluskid, you might well be right:

@TravisA9 It seems to be due to some encoding/decoding error when reading your text file. Are you using non-ASCII text for testing? You may need to check the Julia documentation on how to properly decode a text file if it is not UTF-8 encoded.

However, I suspect there may be more to this. As I mentioned above, I downloaded and reinstalled MXNet and it worked fine, which led me to believe I had somehow introduced an error somewhere. But I have run the unmodified example a few times, and though it works most of the time, it occasionally spits out that error again. In those cases, deleting the generated files (vocab.dat and input.txt) makes it run fine again. Not a big deal, though!

There is a second issue I have run into. In keeping with the scenario mentioned above, I made a text (*.txt) file with English->Nahuátl text (just as a starting point), swapping out input.txt for my language.txt file. I expected it to work because it is in fact UTF-8 text, which I verified with several applications. Nahuátl text does have accent marks, but I don't think that is the problem. I can't find any important differences between this text and the original input.txt, and nothing else has been changed from the char-lstm example.

Once again the error is: UnicodeError: invalid character index
Any suggestions?

EDIT: I tried running char-lstm with language.txt after removing all accents, and it still throws the error.

@TravisA9 Can you open a second issue for the encoding problem and post the error log and, if possible, a small reduced example?

Ok, @vchuravy that's not a bad idea. I'll do that.

I have the same problem when trying to execute the regression example

Even after following @Arkoniak's suggestion to use FullyConnected(mx.SymbolicNode, data=data, the error only changed to

MethodError: no method matching FullyConnected(::MXNet.mx.SymbolicNode, ::Type{MXNet.mx.SymbolicNode}; num_hidden=500)
Closest candidates are:
  FullyConnected(::MXNet.mx.SymbolicNode...; kwargs...) at /Users/Pedro/.julia/v0.5/MXNet/src/symbolic-node.jl:696

Any help is appreciated.

Can you share a gist of your code? Since there is no num_hidden=500 in the original code, I suppose you've made some alterations, and maybe that is why the code is not working. I've tested it locally, and there are two bugs in the original code.

This line https://github.com/dmlc/MXNet.jl/blob/master/examples/regression-example.jl#L33 should be changed to either

net = @mx.chain mx.FullyConnected(mx.SymbolicNode, data, num_hidden=10) =>

or

net = @mx.chain mx.Variable(:data) =>
                mx.FullyConnected(num_hidden=10) =>

Secondly https://github.com/dmlc/MXNet.jl/blob/master/examples/regression-example.jl#L40 should be altered to

cost = mx.LinearRegressionOutput(mx.SymbolicNode, data=net, label=label)

With these changes the code works fine. I'll send a PR shortly.
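Put together, the corrected network definition from the regression example would look roughly like this (a sketch; the hidden-layer sizes and activation are illustrative, only the two fixes above come from the thread):

```julia
label = mx.Variable(:label)

# The explicit mx.SymbolicNode is only needed where dispatch would
# otherwise be ambiguous; inside @mx.chain the input flows implicitly.
net  = @mx.chain mx.Variable(:data) =>
                 mx.FullyConnected(num_hidden=10) =>
                 mx.Activation(act_type=:relu) =>
                 mx.FullyConnected(num_hidden=1)

cost = mx.LinearRegressionOutput(mx.SymbolicNode, data=net, label=label)
```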

Right on! I was running the tutorial side-by-side with another dataset and they both got the same error. I just pasted the error from my run, rather than the tutorial. Nevertheless, your suggestion solves it. I also realized that the "mx.SymbolicNode" addition is only needed for the first layer. Thank you very much!

I still have the issue with the LSTM example. Changing FullyConnected(data, .... to FullyConnected(mx.SymbolicNode, data=data, .... still produces the same error as the original: that it only accepts positional or keyword arguments, not both.

When I change it to FullyConnected(data=data, ...., the error changes to the ambiguity error, but the proposed fix for that doesn't help either.

while loading C:\Users\xviol_000\Documents\Julia\225B Project\train.jl, in expression starting on line 12
 in #FullyConnected#3931(::Array{Any,1}, ::Function, ::Type{MXNet.mx.SymbolicNode}, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at symbolic-node.jl:654
 in (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, ::Type{MXNet.mx.SymbolicNode}, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at <missing>:0
 in #FullyConnected#3935(::Array{Any,1}, ::Function, ::MXNet.mx.SymbolicNode, ::Vararg{MXNet.mx.SymbolicNode,N}) at symbolic-node.jl:696
 in (::MXNet.mx.#kw##FullyConnected)(::Array{Any,1}, ::MXNet.mx.#FullyConnected, ::MXNet.mx.SymbolicNode) at <missing>:0
 in #lstm_cell#65(::Int64, ::Int64, ::Symbol, ::Function, ::MXNet.mx.SymbolicNode, ::LSTMState, ::LSTMParam) at lstm.jl:30
 in (::#kw##lstm_cell)(::Array{Any,1}, ::#lstm_cell, ::MXNet.mx.SymbolicNode, ::LSTMState, ::LSTMParam) at <missing>:0
 in #LSTM#66(::Int64, ::Symbol, ::Bool, ::Function, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at lstm.jl:81
 in (::#kw##LSTM)(::Array{Any,1}, ::#LSTM, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64) at <missing>:0
 in include_string(::String, ::String) at loading.jl:441
 in include_string(::Module, ::String, ::String) at eval.jl:32
 in (::Atom.##59#62{String,String})() at eval.jl:81
 in withpath(::Atom.##59#62{String,String}, ::String) at utils.jl:30
 in withpath(::Function, ::String) at eval.jl:46
 in macro expansion at eval.jl:79 [inlined]
 in (::Atom.##58#61{Dict{String,Any}})() at task.jl:60

I did some tests and realized this issue appears on the latest release. @aaronc8 Maybe you can try Pkg.checkout("MXNet") to test whether the master version works. It has been working smoothly for me locally, but I do run into the issues described above with the latest MXNet.jl release.

@vchuravy Maybe we should consider making another bugfix release?

Thanks for the reply - doing checkout seems to have fixed the original error, but now I'm getting

LoadError: MXNet.mx.MXError("[15:59:51] D:\\Program Files (x86)\\Jenkins\\workspace\\mxnet\\mxnet\\src\\storage\\storage.cc:78: Compile with USE_CUDA=1 to enable GPU usage")
while loading C:\Users\xviol_000\Documents\Julia\225B Project\train.jl, in expression starting on line 39
 in macro expansion at base.jl:59 [inlined]
 in _ndarray_alloc(::Tuple{Int64,Int64}, ::MXNet.mx.Context, ::Bool) at ndarray.jl:42
 in empty at ndarray.jl:152 [inlined]
 in zeros(::Tuple{Int64,Int64}, ::MXNet.mx.Context) at ndarray.jl:199
 in copy!(::Array{MXNet.mx.NDArray,1}, ::Base.Generator{Array{Tuple,1},MXNet.mx.##6387#6391{MXNet.mx.Context}}) at abstractarray.jl:477
 in _collect(::Type{MXNet.mx.NDArray}, ::Base.Generator{Array{Tuple,1},MXNet.mx.##6387#6391{MXNet.mx.Context}}, ::Base.HasShape) at array.jl:251
 in #simple_bind#6386(::Dict{Symbol,MXNet.mx.GRAD_REQ}, ::Array{Any,1}, ::Function, ::MXNet.mx.SymbolicNode, ::MXNet.mx.Context) at executor.jl:133
 in (::MXNet.mx.#kw##simple_bind)(::Array{Any,1}, ::MXNet.mx.#simple_bind, ::MXNet.mx.SymbolicNode, ::MXNet.mx.Context) at <missing>:0
 in #fit#6516(::Array{Any,1}, ::Function, ::MXNet.mx.FeedForward, ::MXNet.mx.ADAM, ::CharSeqProvider) at model.jl:396
 in (::MXNet.mx.#kw##fit)(::Array{Any,1}, ::MXNet.mx.#fit, ::MXNet.mx.FeedForward, ::MXNet.mx.ADAM, ::CharSeqProvider) at <missing>:0
 in include_string(::String, ::String) at loading.jl:441
 in include_string(::Module, ::String, ::String) at eval.jl:32
 in (::Atom.##59#62{String,String})() at eval.jl:81
 in withpath(::Atom.##59#62{String,String}, ::String) at utils.jl:30
 in withpath(::Function, ::String) at eval.jl:46
 in macro expansion at eval.jl:79 [inlined]
 in (::Atom.##58#61{Dict{String,Any}})() at task.jl:60

So, in lines 39-41, in an attempt to compile with USE_CUDA=1, I changed the call to

mx.fit(model, optimizer, data_tr, eval_data=data_val, n_epoch=N_EPOCH, USE_CUDA=1, initializer=mx.UniformInitializer(0.1), callbacks=[mx.speedometer(), mx.do_checkpoint(CKPOINT_PREFIX)], eval_metric=NLL())

and then it gives me

LoadError: MethodError: no method matching MXNet.mx.TrainingOptions(; eval_data = CharSeqProvider("\n\nGREMIO:\nGood morrow, neighbour Baptista.\n\nBAPTISTA:\nGood morrow, neighbour Gremio.\nGod save you, gentlemen!\n\nPETRUCHIO .....

where the whole text dataset is included in the error, and at the end it says

MXNet.mx.TrainingOptions(!Matched::MXNet.mx.AbstractInitializer, !Matched::Int64, !Matched::Union{MXNet.mx.AbstractDataProvider,Void}, !Matched::MXNet.mx.AbstractEvalMetric, !Matched::Union{MXNet.mx.KVStore,Symbol}, !Matched::Bool, !Matched::Array{MXNet.mx.AbstractCallback,1}, !Matched::Int64) at C:\Users\xviol_000\.julia\v0.5\MXNet\src\base.jl:271
 got unsupported keyword arguments "eval_data", "n_epoch", "USE_CUDA", "initializer", "callbacks", "eval_metric"
MXNet.mx.TrainingOptions(!Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any, !Matched::Any) at C:\Users\xviol_000\.julia\v0.5\MXNet\src\base.jl:271
 got unsupported keyword arguments "eval_data", "n_epoch", "USE_CUDA", "initializer", "callbacks", "eval_metric"

which I think means that I am not using USE_CUDA=1 correctly at all....

Sorry, I'm not the most code savvy :(

If it's any help, I do have an NVIDIA card, but if it's simpler I'd rather avoid CUDA entirely and just use the CPU to get the example to work.
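For what it's worth, USE_CUDA=1 is a compile-time build flag for the libmxnet C++ library, not a runtime keyword for mx.fit, which is why it was rejected. To stay on the CPU, the usual approach is to set the context when constructing the model rather than in the fit call (a sketch; variable names follow the char-lstm example's train.jl and are assumptions here):

```julia
# Construct the model with an explicit CPU context so no CUDA-enabled
# build is required; the fit call itself stays as in the example.
model = mx.FeedForward(lstm, context=mx.cpu())
mx.fit(model, optimizer, data_tr,
       eval_data=data_val, n_epoch=N_EPOCH,
       initializer=mx.UniformInitializer(0.1),
       callbacks=[mx.speedometer(), mx.do_checkpoint(CKPOINT_PREFIX)],
       eval_metric=NLL())
```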