[ML] Move PyTorch request ID and cache hit indicator to top level #88901
Conversation
This change will facilitate a performance improvement on the C++ side. The request ID and cache hit indicator are the parts that need to be changed when the C++ process responds to an inference request. Having them at the top level means we do not need to parse and manipulate the original response - we can simply cache the inner object of the response and add the outer fields around it when serializing it.
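To illustrate the intent of the description above, here is a hedged Java sketch (not the actual Elasticsearch or ml-cpp code; field and method names are hypothetical) of why top-level placement helps: with the request ID and cache-hit flag outside the cached payload, the inner result can be reused verbatim and only a thin outer wrapper is built per request.

```java
public final class ResponseWrapper {

    /**
     * Wraps a cached inner result with the per-request top-level fields.
     * "request_id" and "cache_hit" follow the PR title; the inner payload
     * is treated as an opaque, pre-serialized JSON string, so no parsing
     * or manipulation of the cached response is needed.
     */
    static String wrap(String requestId, boolean cacheHit, String cachedInnerJson) {
        return "{\"request_id\":\"" + requestId + "\","
             + "\"cache_hit\":" + cacheHit + ","
             + "\"result\":" + cachedInnerJson + "}";
    }

    public static void main(String[] args) {
        String inner = "{\"inference\":[0.1,0.9]}";  // cached once
        // Per request: plain concatenation, no re-serialization of the payload.
        System.out.println(wrap("req-42", true, inner));
    }
}
```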
Pinging @elastic/ml-core (Team:ML)
Ah, the time taken should move up too, as that is calculated differently for cache hits.
@@ -174,10 +192,10 @@ void processErrorResult(PyTorchResult result) {
        errorCount++;
        logger.trace(() -> format("[%s] Parsed error with id [%s]", deploymentId, errorResult.requestId()));
Seeing how processResult is synchronized and processErrorResult isn't, errorCount can be woefully incorrect due to race conditions.
That can be fixed in a different commit. This particular change looks good.
Since it's quite a simple change, I addressed it in 1bdf3a0.
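For readers following along, a hedged sketch of the concurrency issue raised above and one common way to fix it (illustrative only; the class name is hypothetical and the actual fix landed in commit 1bdf3a0): if only one of the two methods holds the monitor, a plain counter can lose increments, whereas an AtomicLong makes the update safe without a lock.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ResultCounters {
    // AtomicLong guarantees lock-free, lost-update-free increments even
    // when callers do not share a monitor (unlike a plain long field
    // incremented from an unsynchronized method).
    private final AtomicLong errorCount = new AtomicLong();

    void processErrorResult() {
        errorCount.incrementAndGet();  // atomic read-modify-write
    }

    long errors() {
        return errorCount.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ResultCounters c = new ResultCounters();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) c.processErrorResult();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(c.errors());  // prints 4000: no increments are lost
    }
}
```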
These tests will fail without elastic/ml-cpp#2376 if they are unmuted. #88901 will follow up with the Java side changes.
…2376) Previously the inference cache stored complete results, including a request ID and time taken. This was inefficient as it then meant the original response had to be parsed and modified before sending back to the Java side. This PR changes the cache to store just the inner portion of the inference result. Then the outer layer is added per request after retrieving from the cache. Additionally, the result writing functions are moved into a class of their own, which means they can be unit tested. Companion to elastic/elasticsearch#88901
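The caching flow described in the commit message above can be sketched as follows. This is a hedged Java rendering of the C++ scheme (the real implementation lives in ml-cpp; class and method names here are hypothetical): the cache stores only the inner portion of the result, so a hit needs no parsing, just a fresh outer wrapper carrying the caller's request ID and the cache-hit flag.

```java
import java.util.HashMap;
import java.util.Map;

final class InferenceCache {
    // Keyed on the request payload; stores only the inner result JSON.
    private final Map<String, String> innerResults = new HashMap<>();

    String infer(String requestId, String payload) {
        String inner = innerResults.get(payload);
        boolean hit = inner != null;
        if (!hit) {
            inner = runModel(payload);         // expensive path, miss only
            innerResults.put(payload, inner);  // cache the inner portion only
        }
        // Outer fields are added per request; the cached inner JSON is reused as-is.
        return "{\"request_id\":\"" + requestId + "\","
             + "\"cache_hit\":" + hit + ","
             + "\"result\":" + inner + "}";
    }

    // Stand-in for the PyTorch inference call (hypothetical).
    private String runModel(String payload) {
        return "{\"inference\":[" + payload.length() + "]}";
    }

    public static void main(String[] args) {
        InferenceCache cache = new InferenceCache();
        System.out.println(cache.infer("req-1", "doc"));  // cache_hit false
        System.out.println(cache.infer("req-2", "doc"));  // cache_hit true, same inner result
    }
}
```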
@elasticmachine update branch
Replaces #88485
Companion to elastic/ml-cpp#2376