-
Hi, they don't provide a lot of insight into their benchmarking (the part claiming up to a 2x performance increase), so it's wise to be a bit sceptical.

Of course, on NVIDIA cards using ORT is a rather niche thing, as it lacks the flexibility you have with torch+CUDA, so it's not going to make much difference there; at best it might give those on ORT DirectML the same speed as on ORT GPU. Only those with an AMD 7000-series card have a good reason to expect a speed benefit (on Windows, that is; on Linux, ROCm 5.5 and the 5.6 prereleases already give you great speed with torch). What to expect: you also get a ~10% speed boost just by upgrading to ORT 1.15.
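For reference, the provider split above is mostly a matter of which ORT build you install (onnxruntime-directml vs onnxruntime-gpu) and which provider you request when creating the session. A minimal sketch, with the model path as a placeholder:

```python
# Minimal sketch: selecting an execution provider in onnxruntime.
# "model.onnx" is a hypothetical path; requires onnxruntime-directml
# (for DmlExecutionProvider) or onnxruntime-gpu (for CUDAExecutionProvider).
import onnxruntime as ort

# Check which providers your installed build actually exposes.
print(ort.get_available_providers())

session = ort.InferenceSession(
    "model.onnx",
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],  # CPU fallback
)
```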
-
Thanks for the quick reply. I've attempted to modify and use their example scripts on my already-converted ONNX model (which their documentation says you can do to optimize it), and it promptly deleted my entire model folder lol. They also use Torch 1.13, so I think any performance benefits to be had are lost because of that. The documentation is also possibly one of the worst I've seen from Microsoft. I'm updating to the nightlies and sticking with this until AMD brings ROCm to Windows; I'm pretty sure people would port it to older GPUs anyway.
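If anyone else hits this: instead of the Olive scripts, you can let ORT itself write an optimized copy of an existing model, pointing the output at a separate directory so nothing in the source folder gets overwritten. A rough sketch (paths are placeholders, and the output directory must already exist):

```python
# Sketch: have onnxruntime serialize an optimized copy of an existing model
# to a *separate* path, leaving the original model folder untouched.
import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
# Write the optimized graph to disk instead of only optimizing in memory.
sess_options.optimized_model_filepath = "optimized/model.onnx"

# Creating the session runs the optimizer and writes the file above.
ort.InferenceSession("model.onnx", sess_options)
```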
-
https://github.com/microsoft/OLive
This is getting high-profile attention because an NVIDIA driver release announcement mentioned Microsoft's project.
Would there be any benefit in using Olive for conversion, and if so, can it be implemented here? The README mentions hardware-specific optimizations, but I'm unable to test and compare. Maybe we can get higher it/s with it.
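If someone with the hardware wants to compare, here's a rough way to measure it/s for a model under onnxruntime before and after running it through Olive. The model path, input name, and shape are placeholders; substitute whatever your converted model actually expects:

```python
# Rough throughput (it/s) check for an ONNX model under onnxruntime.
# Paths, input name, and shape are hypothetical placeholders.
import time
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx", providers=["DmlExecutionProvider", "CPUExecutionProvider"]
)
feed = {"input": np.random.rand(1, 4, 64, 64).astype(np.float32)}

for _ in range(3):  # warm-up runs so one-time setup cost isn't measured
    session.run(None, feed)

n = 20
start = time.perf_counter()
for _ in range(n):
    session.run(None, feed)
elapsed = time.perf_counter() - start
print(f"{n / elapsed:.2f} it/s")
```

Run it once on the plain converted model and once on the Olive output, keeping the provider and input identical, and the two numbers should be directly comparable.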