In gradient clipping, if DTensors are used, need to first convert them to local tensors #3428
| Job | Run time |
|---|---|
| | 7s |
| | 6s |
| | 2m 11s |
| | 6m 14s |
| | 6m 20s |
| | 5m 29s |
| | 42s |
| | 32s |
| | 37s |
| | 30s |
| Total | 22m 48s |