fix bug #2162

Merged
merged 4 commits into from
Nov 24, 2023

Conversation

manbaaaa
Contributor

  1. To accommodate 'whisper' while staying compatible, both RelPositionMultiHeadedAttention and MultiHeadedAttention need the 'key_bias' argument.
  2. When gradient accumulation is used, the train loss and cv loss recorded in TensorBoard may not be in the same order of magnitude; multiplying the recorded train loss by accum_grad makes the two easier to compare.
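
To illustrate point 2, here is a minimal sketch of why the logged train loss can look smaller than the cv loss. It assumes the training loop divides each batch loss by accum_grad before the backward pass, which is a common pattern but is not shown in this thread. The function name and the sample values are hypothetical and are not taken from the wenet code.

```python
def scale_for_logging(batch_losses, accum_grad):
    """Return (scaled, rescaled) loss values for a list of batch losses.

    scaled:   loss / accum_grad, the value assumed to be used for backward()
    rescaled: scaled * accum_grad, comparable to an unscaled cv loss
    """
    scaled = [loss / accum_grad for loss in batch_losses]
    rescaled = [s * accum_grad for s in scaled]
    return scaled, rescaled


# Hypothetical per-batch losses and accumulation factor.
batch_losses = [8.0, 6.0, 4.0, 2.0]
accum_grad = 4

scaled, rescaled = scale_for_logging(batch_losses, accum_grad)
print(scaled)    # [2.0, 1.5, 1.0, 0.5]
print(rescaled)  # [8.0, 6.0, 4.0, 2.0]
```

Logging the rescaled values puts the train curve on the same scale as a cv loss that was never divided by accum_grad. Curves recorded before such a change would still be on the smaller scale.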

@xingchensong
Member

plz fix lint

@manbaaaa
Contributor Author

plz fix lint

done

@xingchensong xingchensong merged commit 8b2bc85 into wenet-e2e:main Nov 24, 2023
6 checks passed
@xingchensong
Member

xingchensong commented Nov 24, 2023

I noticed that the old code wrote the loss to TensorBoard without multiplying by accum_grad (v2.2.1, https://github.com/wenet-e2e/wenet/blob/v2.2.1/wenet/utils/executor.py#L89 ), so keep that in mind when comparing, @manbaaaa
