neubig added the minor bug label ("Bugs that aren't too bad, only concern documentation, or have easy work-arounds") on Jan 13, 2017
neubig changed the title from "Attention Example is Not Efficient, Need Greedy Decoding" to "Attention Example is Not Efficient, Needs Greedy Decoding" on Jan 13, 2017
I've been looking into this example and I've also done some refactoring (see PR #243) that does not affect the issue you mention. I would have time to do what you propose, though, and perhaps add it to the standing PR, although I'd probably need some help with the DyNet matrix operations.
For a), I imagine this implies a matrix-matrix multiply of w1 with all n input vectors concatenated into a single matrix (where n is the length of the sequence), and similarly for the other terms of v*dy.tanh(w1*input_vector + w2dt).
For b), I imagine you mean sampling according to the argmax.
For a), this basically means we calculate w1dt = w1 * dy.concatenate_cols(input_vectors) once at the beginning of the sentence, then replace the loop over attention_weight = v*dy.tanh(w1*input_vector + w2dt) with a single attention_weights = v*dy.tanh(dy.colwise_add(w1dt, w2dt)).
For b), yes that's right (although I wouldn't call argmax sampling).
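For concreteness, here's a minimal sketch of what (a) could look like with the DyNet Python API; the parameter names w1, w2, v, the list input_vectors, and the per-step decoder term w2dt are assumed to match the example:

```python
import dynet as dy

# Once per sentence: stack the encoder outputs into a (hidden_dim x n) matrix
# and pre-multiply by w1 with a single matrix-matrix product.
input_mat = dy.concatenate_cols(input_vectors)
w1dt = w1 * input_mat

def attend(input_mat, w1dt, w2dt):
    # colwise_add broadcasts the per-step vector w2dt over every column of
    # w1dt, so the attention scores are computed in one expression instead
    # of a Python loop over input vectors.
    unnormalized = dy.transpose(v * dy.tanh(dy.colwise_add(w1dt, w2dt)))
    att_weights = dy.softmax(unnormalized)   # (n x 1) distribution over source positions
    context = input_mat * att_weights        # weighted sum of encoder states
    return context
```

The decoder loop would then call attend(input_mat, w1dt, w2dt) once per output step, recomputing only w2dt from the current decoder state as the example already does.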
Currently the attention example is not very efficient, particularly on GPUs. For example, this for loop could be changed so that it does only a single matrix multiplication, part of which can be computed just once per sentence:
https://github.com/clab/dynet/blob/master/examples/python/attention.py#L75
Also, here the attention model is randomly sampling its output instead of greedily selecting the best symbol, which is more in line with what we would expect:
https://github.com/clab/dynet/blob/master/examples/python/attention.py#L105
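As a sketch of the greedy-decoding change, assuming probs is the softmax expression over the output vocabulary at the current step and int2char is the example's index-to-character mapping:

```python
import numpy as np

probs_np = probs.npvalue()                 # output probabilities for this step
next_char = int2char[np.argmax(probs_np)]  # greedy: pick the single most probable symbol
# rather than drawing a random sample, e.g. np.random.choice(len(probs_np), p=probs_np)
```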