-
Notifications
You must be signed in to change notification settings - Fork 703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LSTM builder is not the standard LSTM #154
Comments
I've heard that the motivation is the results in this paper: https://arxiv.org/abs/1503.04069 But I agree that this should be documented. |
I see. I started documenting the lstm. I also started implementing a vanilla LSTM. Would it be interesting to have it in the form of two optional parameters |
the LSTMBuilder is quite central. I agree that it would be good to add
On Tue, Nov 8, 2016 at 9:31 PM, Paul Michel [email protected]
|
After staring at the code (admittedly without pen and paper at hand, I'm commuting), I am pretty sure the code implements something else than what's written above -- for example @redpony ? |
@yoavg yes you are right this is my mistake, I'll update the original post. I've been working on implementing the aforementioned options to the LSTMBuilder, but I'll put it on hold until the dropout PR is merged. |
this may affect the dropout PR, as I may be introducing dropout in an un-intended place (or effectively introducing it twice) -- and may not be. |
An alternative VanillaLSTMBuilder is now available, documentation in the "Unorthodox Things" section as of this commit: 7935e79 |
After looking at the code in lstm.h, it seems that the LSTMBuilder cell has the following update rule :
![imgtemp-rtc2r2-1](https://cloud.githubusercontent.com/assets/10391785/20528799/9a25d48c-b09b-11e6-8df6-7f97cd859a49.png)
Which is not the common update rule :
(i.e. in the dynet implementation the forget and input gate are coupled and there are peaky connections with the cell state c)
What are the reasons behind this choice?
Anyway I think this should be documented.
The text was updated successfully, but these errors were encountered: