In gradient clipping, if DTensors are used, need to first convert them to local tensors #3413
Job | Run time |
---|---|
 | 4s |
 | 5s |
 | 2m 16s |
 | 5m 30s |
 | 5m 54s |
 | 5m 31s |
 | 53s |
 | 50s |
 | 30s |
 | 28s |
Total | 22m 1s |