This repo provides a Keras implementation of MT-LSTM from the paper Learned in Translation: Contextualized Word Vectors (McCann et al., 2017). For a high-level overview of why CoVe vectors are useful, check out the post.
The weights are ported from the PyTorch implementation of MT-LSTM by the paper's authors: https://github.com/salesforce/cove
Ported & tested on:
- keras==2.1.3
- tensorflow-gpu==1.4.1
Re-running PortFromPytorchToKeras.ipynb requires the PyTorch MT-LSTM implementation from https://github.com/salesforce/cove
import numpy as np  # needed for the random test input used in the predict call
from keras.models import load_model
cove_model = load_model('Keras_CoVe.h5')
- input - GloVe vectors of dimension - (<batch_size>, <sentence_len>, 300)
- output - CoVe vectors of dimension - (<batch_size>, <sentence_len>, 600)
cove_model.predict(np.random.rand(1, 10, 300))
At the time of porting, Keras had an issue with using Masking together with a Bidirectional layer (keras-team/keras#3086). As a shortcut fix, the output of the final Bi-LSTM is removed from the prediction at padded positions; refer to PortFromPytorchToKeras.ipynb for the details of the fix.
For unknown words, we recommend using a value different from the one used for padding; a small non-zero value such as 1e-10 works.
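The effect of the shortcut fix can be pictured with a small NumPy sketch. This is only an illustration, not the code from the notebook: the helper name `zero_padded_outputs` is hypothetical, and it assumes padded timesteps are all-zero GloVe vectors and that "removed" means zeroed out.

```python
import numpy as np

def zero_padded_outputs(glove_inputs, cove_outputs):
    """Zero the output vectors at timesteps whose input is all zeros.

    Hypothetical helper: assumes padding is represented by all-zero
    GloVe vectors, shapes (batch, len, 300) and (batch, len, 600).
    """
    # True where at least one input feature is non-zero, i.e. a real token.
    is_real = np.any(glove_inputs != 0.0, axis=-1, keepdims=True)
    # Broadcasting the (batch, len, 1) mask zeroes whole output vectors.
    return cove_outputs * is_real

rng = np.random.RandomState(0)
inputs = rng.rand(1, 4, 300)
inputs[0, 2:] = 0.0              # last two timesteps are padding
outputs = rng.rand(1, 4, 600)    # stand-in for model predictions
masked = zero_padded_outputs(inputs, outputs)
```

In this sketch a token counts as padding only when every one of its 300 features is exactly zero, so real tokens should never be encoded as all-zero vectors.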
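A minimal sketch of building an input batch that keeps padding and unknown words distinct. The function `embed_batch`, the `toy_glove` lookup, and the choice of zeros for padding are assumptions made for illustration; in practice the lookup would be replaced with real 300-dimensional GloVe vectors.

```python
import numpy as np

EMBED_DIM = 300
PAD_VALUE = 0.0     # assumed padding value
UNK_VALUE = 1e-10   # small non-zero value for unknown words

def embed_batch(sentences, glove, max_len):
    """Turn tokenized sentences into a (batch, max_len, 300) array.

    `glove` is a hypothetical dict mapping token -> 300-d vector.
    """
    batch = np.full((len(sentences), max_len, EMBED_DIM), PAD_VALUE,
                    dtype=np.float32)
    for i, tokens in enumerate(sentences):
        for j, token in enumerate(tokens[:max_len]):
            vec = glove.get(token)
            # Unknown words get UNK_VALUE so they are not mistaken for padding.
            batch[i, j, :] = UNK_VALUE if vec is None else vec
    return batch

toy_glove = {"hello": np.ones(EMBED_DIM, dtype=np.float32)}
batch = embed_batch([["hello", "qwerty"], ["hello"]], toy_glove, max_len=3)
```

The resulting `batch` has the shape expected by `cove_model.predict`, with all-zero rows only at padded positions.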
- Refer to PortFromPytorchToKeras.ipynb
- The ported Keras model is tested against the PyTorch model's CoVe predictions on the SNLI corpus. For testing details, refer to PortingTest.ipynb