
Any plans to add custom loss functions? #77

Open
Simplex-AP opened this issue Apr 13, 2016 · 3 comments

Comments

@Simplex-AP

The Python version of MXNet can define custom loss functions, but this capability is missing from the Julia version. Are there any plans to add it?

@vchuravy
Collaborator

Depends on #39, and see my work in #16. The ability to define custom loss functions requires custom layers (which are written in Python on the Python side), and that is what inspired me to work on it. I have plans to work on this, but it will need to wait until Julia v0.5.

@pluskid
Member

pluskid commented Apr 15, 2016

Another option, which might be easier to implement (it involves quite a lot of refactoring, but nothing technically difficult), is the new module system on the MXNet Python side; see apache/mxnet#1868 for an example. With the new module system, a hybrid symbolic-imperative module can be used: the computation graph is built from symbolic nodes, while the loss function is written directly in Python.

Unfortunately, I have no timeline estimate for when I will be able to find enough time to port that to the Julia side. Contributions are of course very welcome!

@Arkoniak
Contributor

Arkoniak commented Dec 29, 2016

There is a discussion in the R branch of MXNet where they propose using the MakeLoss operator. I've tried to implement their solution in Julia:
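A minimal sketch of what I mean (the layer sizes and `train_provider` are illustrative placeholders; `mx.MakeLoss` and `mx.square` are the libmxnet operators as exposed through MXNet.jl, and keyword names may differ between MXNet.jl versions):

```julia
using MXNet

# A small regression net whose output layer is a custom L2 loss built
# with MakeLoss instead of a built-in layer like LinearRegressionOutput.
data  = mx.Variable(:data)
label = mx.Variable(:label)
fc1   = mx.FullyConnected(data=data, name=:fc1, num_hidden=64)
act1  = mx.Activation(data=fc1, name=:relu1, act_type=:relu)
net   = mx.FullyConnected(data=act1, name=:fc2, num_hidden=1)

# MakeLoss marks this symbol as the objective: the network is trained
# to minimize its output.
loss = mx.MakeLoss(mx.square(net - label))

# train_provider is a placeholder, e.g.
#   mx.ArrayDataProvider(:data => X, :label => y, batch_size=32)
model = mx.FeedForward(loss, context=mx.cpu())
mx.fit(model, mx.SGD(lr=0.01), train_provider, n_epoch=20)
```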

Generally speaking it works, but it looks ugly. It would be nice if anyone with deep knowledge of MXNet could help.

There are a few questions:

  • Because the network plus the custom loss layer outputs the loss value itself, it is not clear how to calculate other eval metrics during training.
  • For the same reason, I had to keep two different neural networks: one for fit and another for predict (see the sketch after this list). Maybe there is a better solution? Default output layers like SoftmaxOutput do not have this problem.
  • Even if two networks do have to be kept, it would be nice to abstract the loss layer for future reuse. I guess it should somehow be bound to the base net via bind/simple_bind, but I wasn't able to figure out that syntax.
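The two-network workaround from the second point looks roughly like this (a sketch continuing the snippet above; `test_provider` is a placeholder, and copying the parameter dictionaries by hand assumes both symbols use the same layer names):

```julia
# Prediction network: the same symbols up to :fc2, but without the
# MakeLoss layer, so predict() returns the model output, not the loss.
pred_model = mx.FeedForward(net, context=mx.cpu())

# Reuse the weights learned by the training network; the layer names
# match, so the parameter dictionaries line up.
pred_model.arg_params = model.arg_params
pred_model.aux_params = model.aux_params

# test_provider is a placeholder, e.g.
#   mx.ArrayDataProvider(:data => X_test, batch_size=32)
preds = mx.predict(pred_model, test_provider)
```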
