
Julia-based definition of new layer #39

Open
nowozin opened this issue Nov 26, 2015 · 13 comments

@nowozin

nowozin commented Nov 26, 2015

Hi,

I would like to define a new layer similar to dropout, using random number generation.
According to https://mxnet.readthedocs.org/en/latest/tutorial/new_op_howto.html it is currently possible to define layers using Python.

Is it possible to define new layers using Julia, and if so, could you provide a minimal example of a Julia-defined layer? (For example, something as simple as scaling the input.)

Thanks,
Sebastian
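
For concreteness, the forward/backward math of a dropout-style scaling layer like the one described above can be sketched in plain NumPy. This is only an illustration of the math, not any MXNet.jl API (which, as the replies below note, did not yet support custom layers), and the function names are made up:

```python
import numpy as np

def dropout_forward(x, p=0.5, rng=None):
    """Inverted dropout: zero each element with probability p and rescale
    the survivors by 1/(1-p) so the expected output equals the input."""
    rng = rng or np.random.default_rng()
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask, mask

def dropout_backward(grad_out, mask):
    # Given the mask, the layer is elementwise linear in x,
    # so the incoming gradient is scaled by the same mask.
    return grad_out * mask
```

Because the scaling happens at training time (inverted dropout), no rescaling is needed at test time.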

@pluskid
Member

pluskid commented Nov 26, 2015

@nowozin Unfortunately it is not possible yet. The Julia GC is currently not thread-safe, which makes things a lot more complicated than in the Python case. We are still working on coming up with a solution.

@nowozin
Author

nowozin commented Nov 26, 2015

Thank you for the quick reply and clarification.

A related question regarding the mx.randn functionality: I would like to generate random noise within a neural network, ideally so that the output of a SymbolicNode is random each time it is evaluated. Would I have to write a new layer for this, or is there already functionality to have such random nodes? It seems that mx.randn is a low-level interface that cannot be used when setting up a symbolic computation graph.

(This would be useful for variational Bayes neural network training objectives, e.g. (Blundell et al., 2015) and also described here.)
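
The objective in (Blundell et al., 2015) relies on the reparameterization trick, which makes the "random node" requirement concrete: the noise enters as an ordinary input, so the rest of the graph stays deterministic. A NumPy sketch of the weight-sampling step (illustrative names, not any MXNet API):

```python
import numpy as np

def sample_weights(mu, rho, rng=None):
    """Reparameterization trick from Bayes by Backprop:
    w = mu + softplus(rho) * eps with eps ~ N(0, I).
    The randomness enters only through the input eps, so the graph
    from (mu, rho) to w stays deterministic and differentiable."""
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal(np.shape(mu))
    sigma = np.log1p(np.exp(rho))  # softplus keeps sigma > 0
    return mu + sigma * eps
```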

@vchuravy
Collaborator

@nowozin I am working on this; the (not yet functional) PR is at #16.

@pluskid
Member

pluskid commented Nov 27, 2015

There is currently no random-number-generator symbolic node, but a workaround is to treat those random numbers as input data and generate them from a data provider. For example, the LSTM char-rnn example uses a customized data provider to generate all-zero matrices.
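
That workaround can be sketched framework-agnostically: the data provider emits a fresh noise array alongside each batch, and the network treats it as just another input. A hypothetical generator in NumPy terms (not the actual MXNet.jl data-provider API):

```python
import numpy as np

def noise_batches(data, batch_size, rng=None):
    """Yield (batch, noise) pairs. The noise array plays the role of an
    extra input variable, so the symbolic network itself can stay
    deterministic while still seeing fresh noise on every evaluation."""
    rng = rng or np.random.default_rng(0)
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        yield batch, rng.standard_normal(batch.shape)
```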

@nowozin
Author

nowozin commented Nov 27, 2015

Thank you so much for the quick and informative replies, much appreciated.

@Andy-P
Contributor

Andy-P commented Dec 7, 2015

@pluskid I would like to implement a custom layer: an MDN (mixture density network) loss layer.
There are two examples of defining layers from Python: one in pure Python, the other using NDArray (MXRtc).
Is it possible to do the second of those (i.e. using NDArray) from Julia?
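
For reference, the MDN loss itself (the negative log-likelihood of a Gaussian mixture, following Bishop, 1994) can be written in a few lines of NumPy. This is only the math of the loss, not a custom-layer implementation:

```python
import numpy as np

def mdn_nll(pi, mu, sigma, y):
    """Negative log-likelihood of scalar targets y under a mixture of
    1-D Gaussians. Shapes: pi, mu, sigma are (N, K); y is (N,).
    Each row of pi must sum to 1 and sigma must be positive."""
    y = y[:, None]                                # (N, 1) for broadcasting
    log_comp = (np.log(pi)
                - 0.5 * np.log(2.0 * np.pi)
                - np.log(sigma)
                - 0.5 * ((y - mu) / sigma) ** 2)  # (N, K) per-component log-density
    # log-sum-exp over mixture components for numerical stability
    m = log_comp.max(axis=1, keepdims=True)
    log_lik = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return -log_lik.mean()
```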

@vchuravy
Collaborator

Yes, with NDArray it should be possible. I have just been quite busy the last two weeks and haven't gotten any coding done. Getting this to work is at the top of my priority list for the weekend.

@Andy-P
Contributor

Andy-P commented Jan 6, 2016

Any progress on using NDArray + Julia to define layers?

@vchuravy
Collaborator

vchuravy commented Jan 7, 2016

I made progress today, but it still has some way to go; help is always welcome.

@Andy-P
Contributor

Andy-P commented Feb 5, 2016

@vchuravy Three questions:

  1. Is this the branch you're using to work on this?
    https://github.com/vchuravy/MXNet.jl/tree/vc/nativeops/src
  2. How can I help push this along?
  3. Is there anything external holding it up? I.e., does the MXNet team need to make some change to the main library?

@vchuravy
Collaborator

vchuravy commented Feb 5, 2016

  1. Yes, that is the branch I have been working on.
  2. The branch has a small example in it; the best thing to do is to execute that, see it crash, and then look into where the issues are coming from.
  3. There is one immediate issue that comes to mind: the lifetime of objects on the MXNet side of things. Everything that is passed to Julia needs to be allocated on the heap and then freed later on; otherwise we will access data that is no longer valid. Python gets away with this because everything is synchronous.

I will see if I can devote some more time to this, but work gets in the way right now.

@vchuravy
Collaborator

I hope that I can make another push for this during the coming week. There are a few issues I have a handle on and one that I am not sure about. It requires another serious push, so for me it is a time thing. But I am in a similar position to you.

Keep in mind that this only gets us a CPU implementation. I haven't looked at how the Python interface handles GPU (I know that it does), so that might be something worthwhile to check out.

On Mon, 15 Feb 2016, 00:01 Andre Pemmelaar [email protected] wrote:

@vchuravy @pluskid

I spent some time this past week trying to debug this. Unfortunately, my lack of experience with C++ has made progress very slow.

I have spent a fair amount of time testing MXNet.jl and it is easily the best solution for Julia + deep learning if you can get away with using the current native loss functions. Unfortunately for me, this is not the case, so this issue of a custom layer has become THE issue for me. It is basically the only thing standing in the way of my using MXNet.jl for serious work.

So I need to ask: is this something that will likely take a long time to solve, either because it is actually quite difficult or because no one who has the skills has the time? Or is it just one of those things that needs a little extra love? I really can't tell.

If it seems unlikely to be solved soon, that is fine. I will look for another solution and come back to MXNet later. On the other hand, if it is just around the corner, I will hold off investing time in a different framework and instead spend my time contributing where I can.



@CarloLucibello

+1 for this, I'm really looking forward to it.
