This topic is related to supporting concurrent evaluation of the same model. In an offline discussion, we agreed that we should have one copy of the backendIDs while creating as many backends as we want, each acting as a view into the backendID. But this creates several problems, because we only have one ExecutionEngine in the backend:

- Thread safety when accessing anything in the ExecutionEngine.
- Checking compatibility and running graphs is going to be slow, as these operations will be serialized (see the sketch after this list).
- A minor problem: we currently compile every ONNXIFI module into a function with the same name.
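To make the contention concrete, here is a minimal sketch of the status quo described above, with a stubbed-out ExecutionEngine; `SharedBackend` and its methods are hypothetical names invented for illustration, not the real Glow API. With a single engine behind one lock, every request queues up, and every module compiles into the same function name:

```cpp
#include <mutex>
#include <string>

// Stand-in for the real ExecutionEngine; the methods are placeholders.
struct ExecutionEngine {
  void compile(const std::string &fnName) { /* compile the module */ }
  void run(const std::string &fnName) { /* execute the function */ }
};

class SharedBackend {
public:
  void compileAndRun() {
    // One mutex guards the one engine, so concurrent requests queue here:
    // compatibility checks, compiles, and runs are all serialized.
    std::lock_guard<std::mutex> guard(mu_);
    engine_.compile("inference"); // every module gets the same function name
    engine_.run("inference");
  }

private:
  std::mutex mu_;
  ExecutionEngine engine_; // the single engine shared by all backends
};
```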
An idea:
Can we have a pool of ExecutionEngines in the BackendID, and every time we create a backend from that backendID, give it an ArrayRef of these EEs? When we compile a graph, we hash the model ID to pick one of the EEs to use. Access to the engines should be protected by locks, and we compile functions with the name `inference_graphID`.
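A minimal sketch of this proposal, assuming a simplified ExecutionEngine; `EnginePool`, `pickEngine`, and `withEngine` are hypothetical names, not the real Glow/ONNXIFI API:

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

struct ExecutionEngine {}; // stand-in for the real ExecutionEngine

class EnginePool {
public:
  explicit EnginePool(size_t numEngines) {
    for (size_t i = 0; i < numEngines; ++i)
      slots_.push_back(std::make_unique<Slot>());
  }

  // Hash the model ID to deterministically map a graph onto one engine.
  size_t pickEngine(const std::string &modelID) const {
    return std::hash<std::string>{}(modelID) % slots_.size();
  }

  // Run `fn` while holding the chosen engine's lock: graphs mapped to the
  // same engine serialize, but graphs on different engines run in parallel.
  void withEngine(const std::string &modelID,
                  const std::function<void(ExecutionEngine &)> &fn) {
    Slot &slot = *slots_[pickEngine(modelID)];
    std::lock_guard<std::mutex> guard(slot.mu);
    fn(slot.engine);
  }

private:
  struct Slot {
    std::mutex mu;
    ExecutionEngine engine;
  };
  std::vector<std::unique_ptr<Slot>> slots_;
};

// Each graph compiles to a uniquely named function, instead of every
// ONNXIFI module reusing the same name.
std::string inferenceFunctionName(const std::string &graphID) {
  return "inference_" + graphID;
}
```

One lock per engine (rather than one global lock) is what lets graphs hashed to different engines compile and run concurrently.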
Hmm, round robin might not be good for model sharding. We could let the backend have a view into all the ExecutionEngines and let it pick which one to use depending on the model string when doing onnxGraphInitIO.
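A hedged sketch of this alternative, assuming the model string can carry a shard hint; the `shard=` parsing convention, `parseShardIndex`, and `GraphBackend` are all invented for illustration (the real hook would be onnxGraphInitIO):

```cpp
#include <cstdlib>
#include <string>
#include <vector>

struct ExecutionEngine {}; // stand-in for the real ExecutionEngine

// Assume the model string can carry a shard hint, e.g. "resnet50:shard=2".
// This convention is an assumption for the sketch, not part of ONNXIFI.
size_t parseShardIndex(const std::string &modelStr, size_t numEngines) {
  auto pos = modelStr.find("shard=");
  if (pos == std::string::npos)
    return 0; // default engine when no shard hint is present
  return std::strtoul(modelStr.c_str() + pos + 6, nullptr, 10) % numEngines;
}

class GraphBackend {
public:
  explicit GraphBackend(std::vector<ExecutionEngine *> allEngines)
      : engines_(std::move(allEngines)) {}

  // Called at graph init time: bind the graph to one engine for its whole
  // lifetime, so shards of one model land on fixed, distinct engines.
  ExecutionEngine &engineFor(const std::string &modelStr) {
    return *engines_[parseShardIndex(modelStr, engines_.size())];
  }

private:
  std::vector<ExecutionEngine *> engines_; // view into all engines
};
```

Binding at init time, rather than hashing per compile, keeps shard placement stable across the graph's lifetime.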