-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: executor to sagemaker custom container #6046
Conversation
@@ -43,7 +43,7 @@ repos: | |||
args: | |||
- -S | |||
- repo: https://github.com/pycqa/isort | |||
rev: 5.10.1 | |||
rev: 5.12.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this update, I keep getting an error during commit. SO link
RuntimeError: The Poetry configuration is invalid:
- [extras.pipfile_deprecated_finder.2] 'pip-shims<=0.3.4' does not match '^[a-zA-Z-_.0-9]+$'
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #6046 +/- ##
==========================================
- Coverage 77.14% 73.70% -3.44%
==========================================
Files 144 143 -1
Lines 13777 13855 +78
==========================================
- Hits 10628 10212 -416
- Misses 3149 3643 +494
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is port 8080 the only way this can happen?
ed03b5c
to
ef3937b
Compare
ef3937b
to
791b3ee
Compare
da09260
to
812667b
Compare
Goals
The PR enables using an Executor as a sagemaker custom container for inference by implementing 2 endpoints-
POST /invocations
&GET /ping
and accepting aserve
command on port 8080.To separate sagemaker inference, I've added a
--provider [NONE|SAGEMAKER]
argument to the CLI which can be expanded later toazureml
,vertexai
etc. When provider is sagemaker, we start a custom fastapi app that implements theGET /ping
endpoint./invocations
route is already added to the Executor, we add it to the fastapi app./invocations
is not added and only one route is added to the Executor, we add an/invocations
route that points to the route the Executor implements.Sagemaker needs the image to be pushed to ECR. To use an already pushed Executor with sagemaker, one can do the following and push the image.
During run, sagemaker adds
serve
as a CMD to the container entrypoint. I haven't added any arguments to the jina cli for this asserve
is ignored (if we follow the above syntax in the entrypoint). While implementing other providers, I'll evaluate the need of theserve
command.End-to-end test
I've manually tested the Executor after pushing a model to S3, the Executor image to ECR & running real-time & serverless inference via sagemaker endpoints.
Pending
Support for batch-transform jobs is not added in this PR, will be followed in another PR.