-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ingestion/SageMaker): Remove deprecated apis and add stateful ingestion capability #10573
feat(ingestion/SageMaker): Remove deprecated apis and add stateful ingestion capability #10573
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one small thing, otherwise looks good
class SagemakerSourceConfig(AwsSourceConfig): | ||
class SagemakerSourceConfig( | ||
AwsSourceConfig, | ||
PlatformInstanceConfigMixin, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't look like this source supports platform instances, so having it in the config might be misleading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! It's interesting that if I give platform_instance
in the ingestion recipe, then I can see the platform instance available when browsing ML models, it's in the response of graphQL query getBrowseResultsV2
even though each ML models entity does not have platform_instance
.
Do you think I can make some more improvements to support platform instances? Most of the make_ml_model..*_urn
code is in mce_builder.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah adding support for it isn't too bad. it involves
- adding platform instance to the config (you already did this)
- adding the platform instance as a prefix to all urns. modifying the helpers in mce_builder.py to take an optional platform_instance is the right approach there
- emit dataPlatformInstance aspects for every entity
the browse path v2 aspects will get auto-generated based on everything else
This PR has the following improvements:
Checklist