I read your paper, "OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages" (https://arxiv.org/abs/2110.05877).
On page 4, you state:
"For the RNN model, we use a 4-layered bidirectional LSTM with a hidden layer dimension of 128, which takes as input the frame-wise pose representation of 27 keypoints with 2 coordinates each, resulting in a vector of 54 points per frame. We also use a temporal attention layer to weight the most effective frames for classification."
However, I couldn't find a definition of "temporal attention" as used in your method. Could you please explain it?
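For reference, here is my best guess at what the layer might look like: a minimal PyTorch sketch assuming a standard additive attention pooling over the BiLSTM's frame-wise outputs, with a learned per-frame score softmaxed over time. The class name, the attention formulation, and the default hyperparameters other than those quoted above (54 input features, 4 layers, hidden size 128) are my own placeholders, not taken from the paper or the OpenHands codebase. Is this roughly what you do?

```python
import torch
import torch.nn as nn

class BiLSTMWithTemporalAttention(nn.Module):
    """BiLSTM encoder followed by soft temporal-attention pooling.

    Input sizes mirror the quoted description: 27 keypoints x 2
    coordinates = 54 features per frame, 4 BiLSTM layers, hidden
    size 128. The attention formulation itself is an assumption
    (additive attention pooling), not confirmed by the paper.
    """

    def __init__(self, input_dim=54, hidden_dim=128, num_layers=4, num_classes=100):
        super().__init__()
        self.lstm = nn.LSTM(
            input_dim, hidden_dim, num_layers=num_layers,
            batch_first=True, bidirectional=True,
        )
        # One scalar relevance score per frame, computed from that
        # frame's bidirectional hidden state (2 * hidden_dim).
        self.attn_score = nn.Linear(2 * hidden_dim, 1)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, x):
        # x: (batch, time, 54) frame-wise pose features
        h, _ = self.lstm(x)                     # (batch, time, 256)
        scores = self.attn_score(h)             # (batch, time, 1)
        weights = torch.softmax(scores, dim=1)  # attention over frames
        context = (weights * h).sum(dim=1)      # weighted sum of frames
        return self.classifier(context)         # (batch, num_classes)


# Example: a batch of 8 clips, 64 frames each
model = BiLSTMWithTemporalAttention(num_classes=50)
logits = model(torch.randn(8, 64, 54))
print(logits.shape)  # torch.Size([8, 50])
```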