-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Python]Remove get_artifacts in MLTranform since artifacts are stored in artifact location #29016
Conversation
Codecov Report
@@ Coverage Diff @@
## master #29016 +/- ##
==========================================
- Coverage 38.41% 38.39% -0.02%
==========================================
Files 686 686
Lines 101614 101656 +42
==========================================
- Hits 39033 39032 -1
- Misses 60999 61042 +43
Partials 1582 1582
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 12 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
602752d
to
6d39769
Compare
6d39769
to
0fab242
Compare
Run Python PreCommit 3.8 |
Run Python_Integration PreCommit 3.8 |
Run Python_Integration PreCommit |
Run Python_Runners PreCommit |
R: @damccorm |
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control |
@@ -130,7 +125,6 @@ def test_ScaleTo01_list(self): | |||
| "MLTransform" >> base.MLTransform( | |||
write_artifact_location=self.artifact_location).with_transform( | |||
tft.ScaleTo01(columns=['x']))) | |||
_ = (list_result | beam.Map(assert_ScaleTo01_artifacts)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than removing all of these checks, can we check that the written artifact is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added #29017 to do this but for now, we will have to use some TF tools to do it. It is not clear how to do it but it is not straight forward.
To read the artifacts, we use read_artifact_location
and there are tests covering this path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM outside of one minor suggestion (and getting checks green)
MLTransform outputs artifacts in the output dictionary. Initially, it was designed to provide users artifacts such as
min
andmax
values from the transformScaleTo01
. Now since the concept of artifact location is used, we store the artifacts directly in the artifact location and provide the user the option ofread_artifact_location
in MLTransform.In the efforts of read and write artifact location, let us remove outputting artifacts in the output dict since they might create noise in the user data. This is a breaking change but since MLTransform is not yet launched and possibly, there is no usage yet, I would like to mention this in CHANGES.md and merge the breaking changes.
Note: The artifacts are embedded in a TF graph and it is easier to read and use them with MLTransform.
Filed #29017 to export MLTransform artifacts in human readable format along with TF graph.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123
), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>
instead.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.