Hello.
In the training phase, there is a reset_mems flag for resetting the memory.
When a new music sequence begins, setting reset_mems=True blocks attention to the memory of the previous piece.
https://github.com/amazon-research/transformer-gan/blob/1ccc9f251c1b1d054c1acc8be36c1da7bf8cf11c/model/mem_transformer.py#L529
However, since the memory length (1024) is longer than the batch length (128), the new music sequence can attend to the previous piece again after one block.
So it seems that the new sequence can attend to 'previous-sequence memory (896) + current-sequence memory (128)'.
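To illustrate what I mean, here is a minimal sketch (not the repository's code; MEM_LEN, TGT_LEN, and update_mems are just illustrative names) of the memory bookkeeping, assuming the reset only masks attention for that one step while the memory buffer itself keeps being appended to and truncated to the memory length:

```python
# Sketch of the memory update across the boundary between two pieces.
# Assumption: reset_mems=True masks attention to `mems` only for the reset
# step; the buffer is still extended and truncated to MEM_LEN afterwards.

MEM_LEN = 1024   # memory length
TGT_LEN = 128    # batch (segment) length

def update_mems(mems, new_segment):
    """Append the current segment's positions and keep the last MEM_LEN."""
    return (mems + new_segment)[-MEM_LEN:]

# Memory is full of positions from the previous piece "A" when piece "B" starts.
mems = ["A"] * MEM_LEN

# Boundary step: reset_mems=True, so attention to `mems` is blocked here.
mems = update_mems(mems, ["B"] * TGT_LEN)

# Next step: reset_mems=False, so the new segment attends to all of `mems`.
print(mems.count("A"), mems.count("B"))   # 896 128
```

If my assumption about the update is right, the second block of the new piece can still see 896 positions from the previous piece.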
I am wondering if this is intentional.
Thank you for sharing the code!