dmlc / MXNet.jl

MXNet Julia Package - flexible and efficient deep learning in Julia

Support NNVM

vchuravy opened this issue · comments

We hope to release v0.9 RC1 next week. It would be great if you could make the Julia tests pass by then.

@vchuravy Are you working on this, or planning to? If not, I will probably try to give it a shot tonight or tomorrow. Otherwise, once libmxnet switches master to nnvm, MXNet.jl master will probably break.

I won't get to it till the weekend. So please go ahead.

Here is a big issue: all the operators are now defined for both NDArray and SymbolicNode, but most of them are not strongly typed enough to disambiguate between the two. Because keyword arguments are weakly typed, a call like FullyConnected(data=a_symbol, hidden_dim=100) cannot determine whether it is invoking the NDArray op or the SymbolicNode op.

One possible solution is to add a prefix to either the NDArray or the symbolic ops so that they have different names, which removes the ambiguity. Alternatively, we could put them under different namespaces, which is essentially the same thing. Neither option seems very satisfactory, though.
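To make the ambiguity concrete, here is a minimal sketch in modern Julia syntax, using hypothetical stand-in types (`NDArrayLike`, `SymbolicNodeLike`) rather than the real mx.NDArray/mx.SymbolicNode: keyword arguments never participate in method dispatch, so two keyword-only definitions collapse into a single method.

```julia
# Hypothetical stand-in types -- not the real MXNet.jl types.
struct NDArrayLike end
struct SymbolicNodeLike end

# Keyword arguments do not participate in multiple dispatch:
# both definitions below have the same positional signature f(),
# so the second one simply replaces the first.
f(; data::NDArrayLike = NDArrayLike()) = :ndarray
f(; data::SymbolicNodeLike = SymbolicNodeLike()) = :symbolic
```

After these two definitions, `methods(f)` reports a single method, which is exactly why a keyword-only `FullyConnected(data=..., num_hidden=...)` cannot pick between the two op families.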

@vchuravy any idea?

Hm, but are there two function pointers, one for SymbolicNode and one for NDArray? Or is it one function pointer that you can call with both?

One pointer for both of them, called through different API entry points. The problem is the ambiguity on the Julia side.

But mixing them is not allowed, right? All varargs and positional arguments should be strongly typed; I assume the problem is purely keyword-only calls?

If you can pull the argument list into the Julia side, you can likely define a function signature that restricts the symbolic input to a strong type while leaving the rest of the kwargs weakly typed:

FullyConnected(data::SymbolicNode; hidden_dim = nothing)

This will not solve the problem when there are no input symbols, though.

@vchuravy Yes, they are strongly typed only when there are positional arguments. They are currently all defined as OpName(args::SymbolicNode...; kwargs...). When there is no positional argument, for example in FullyConnected(data=data, hidden_dim=100), everything is a keyword argument and the ambiguity occurs.

I think the problem is that in Julia, positional arguments and keyword arguments are separate: one cannot pass a positional argument in the arg=val style as in Python. One possible solution is to only allow the user to write FullyConnected(data, hidden_dim=100), which would be slightly different from the Python way of writing it. But I think the bigger problem is: how can I tell which arguments are positional and which are keyword from MXSymbolGetAtomicSymbolInfo? @tqchen

For most of the NN layer ops, I could probably assume they all have a data argument and force it to be positional, but this does not seem to hold for all the other operators.

The safest option would be to namespace them for now, but that would break backwards compatibility.
Alternatively, we could check the keyword arguments for the provided types and dispatch on that.

So we provide multiple definitions.

FullyConnected(::Type{SymbolicNode}; kwargs...)
FullyConnected(::Type{NDArray}; kwargs...)
function FullyConnected(; kwargs...)
    T = check_types(kwargs)
    FullyConnected(T; kwargs...)
end
# + strongly typed positional methods
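A runnable sketch of that kwarg-inspection dispatch, in modern Julia syntax. `NDArrayLike`/`SymbolicNodeLike` are hypothetical stand-in types and `check_types` is a hypothetical helper; none of this is existing MXNet.jl API.

```julia
# Hypothetical stand-in types -- not the real MXNet.jl types.
struct NDArrayLike end
struct SymbolicNodeLike end

# Hypothetical helper: scan keyword values and pick the target type.
function check_types(kwargs)
    for (_, v) in kwargs
        isa(v, SymbolicNodeLike) && return SymbolicNodeLike
        isa(v, NDArrayLike)      && return NDArrayLike
    end
    error("cannot infer op target from keyword arguments")
end

# Typed entry points, unambiguous thanks to the explicit first argument.
FullyConnected(::Type{SymbolicNodeLike}; kwargs...) = :symbolic_op
FullyConnected(::Type{NDArrayLike}; kwargs...)      = :ndarray_op

# Untyped entry point: inspect the kwargs, then forward to a typed method.
function FullyConnected(; kwargs...)
    T = check_types(kwargs)
    FullyConnected(T; kwargs...)
end
```

Note that this still fails when no keyword argument carries a distinguishing type, e.g. FullyConnected(name=:fc3, num_hidden=10).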

@vchuravy I think forcing the user to avoid using data=xxx solves most of the problem. However, there are cases it cannot handle; for example, look at test_compose in the unit tests:

  data = mx.Variable(:data)
  net1 = mx.FullyConnected(data, name=:fc1, num_hidden=10)
  net1 = mx.FullyConnected(net1, name=:fc2, num_hidden=100)

  net2 = mx.FullyConnected(name=:fc3, num_hidden=10)
  net2 = mx.Activation(data=net2, act_type=:relu)
  net2 = mx.FullyConnected(data=net2, name=:fc4, num_hidden=20)

  composed  = net2(fc3_data=net1, name=:composed)

Here it is impossible to determine which op mx.FullyConnected(name=:fc3, num_hidden=10) refers to, and there seems to be no easy way in Julia to explicitly choose which method to call. I'm proposing something like this:

function FullyConnected{T}(::Type{T}, args::T...; kwargs...)
  # ...
end

function FullyConnected(args::NDArray...; kwargs...)
  FullyConnected(NDArray, args...; kwargs...)
end

function FullyConnected(args::SymbolicNode...; kwargs...)
  FullyConnected(SymbolicNode, args...; kwargs...)
end

Then, when ambiguity arises, the user can always force one of them explicitly:

mx.FullyConnected(SymbolicNode, name=:fc3, num_hidden=10)
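For illustration, a self-contained sketch of this scheme (in modern `where`-syntax, with hypothetical stand-in types instead of the real NDArray/SymbolicNode):

```julia
# Hypothetical stand-in types -- not the real MXNet.jl types.
struct NDArrayLike end
struct SymbolicNodeLike end

# Generic method: an explicit first argument selects the op family,
# and any positional inputs must match that type.
function FullyConnected(::Type{T}, args::T...; kwargs...) where {T}
    T === SymbolicNodeLike ? :symbolic_op : :ndarray_op
end

# Convenience methods: infer the family from positional arguments.
FullyConnected(args::NDArrayLike...; kwargs...) =
    FullyConnected(NDArrayLike, args...; kwargs...)
FullyConnected(args::SymbolicNodeLike...; kwargs...) =
    FullyConnected(SymbolicNodeLike, args...; kwargs...)
```

With this, a keyword-only call like `FullyConnected(SymbolicNodeLike, name=:fc3, num_hidden=10)` resolves unambiguously, while calls with positional inputs still infer the family automatically.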

That sounds like a decent solution. The only other viable option I see is for MXNet proper to provide an entry point that can disambiguate the two; since they are now more or less unified, that might not be too unreasonable.