[SPARK-24547][K8S] Allow for building spark on k8s docker images without cache and don't forget to push spark-py container. #21555
Conversation
cache, and include spark-py when pushing to registry.
can you edit the title to add
ok to test
Test build #91966 has finished for PR 21555 at commit
LGTM. Thank you for this. @mccheah @foxish @erikerlandson for review before merge
retest this please
Kubernetes integration test starting
Kubernetes integration test status failure
retest this please
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #92054 has finished for PR 21555 at commit
Test build #92063 has finished for PR 21555 at commit
retest this please
Kubernetes integration test starting
Kubernetes integration test status success
Test build #92103 has finished for PR 21555 at commit
LGTM. Will merge. Thanks.
This change makes the tool always build and push both the Python and non-Python images, but what if the user wants to build only one to save time? I can imagine that being the case in something like a dev-CI workflow. Ideally the tool would let the user manage either image selectively, or both.
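The kind of selectivity described above could be sketched roughly as follows. This is only an illustration of the follow-up idea, not part of this PR; the `-j`/`-p` flag names and the `select_targets` helper are invented for the example:

```shell
#!/bin/sh
# Hypothetical sketch of selective image building -- a follow-up idea,
# not part of this PR. The -j / -p flags are invented for illustration.
select_targets() {
  targets=""
  OPTIND=1
  while getopts "jp" opt "$@"; do
    case "$opt" in
      j) targets="$targets spark" ;;     # JVM-only image
      p) targets="$targets spark-py" ;;  # Python-enabled image
    esac
  done
  # Default: build everything, matching the behaviour after this PR.
  [ -z "$targets" ] && targets="spark spark-py"
  # Unquoted expansion normalises the leading space.
  echo $targets
}
```

With no flags this preserves today's build-everything behaviour, so adding the flags later would be backwards compatible.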
@mccheah Good note. As this is a blocker for other PRs, it is probably best to handle that refactor in a subsequent PR.
@mccheah, that is a good point, but I agree with @ifilonenko that we can do it in a subsequent PR. I'm thinking we merge this as-is, and I can try to get a follow-up PR here for the Dockerfile refactor.
LGTM, I am OK to merge. Most of the automated image-build tooling I've seen is custom, but I agree with Matt that being able to build selectively is worth supporting via a follow-up PR. It might make it easier to write custom logic using this as an abstraction.
Yup, feel free to merge and follow up separately.
Will follow up. Thanks all for the comments! Merging to master.
What changes were proposed in this pull request?
https://issues.apache.org/jira/browse/SPARK-24547
TL;DR from the JIRA issue: the first problem is that building from cached Docker layers can bake stale Spark classes into the image, producing errors like:
java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = 6155820641931972169, local class serialVersionUID = -3720498261147521051
The second problem was that the spark container was pushed, but the spark-py container was not; it was simply forgotten in the initial PR.
A third problem I also ran into, because I had an older Docker version, is covered by [K8S] Fix issue in 'docker-image-tool.sh' #21551, so I have not included a fix for it in this ticket.
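To make the two fixes concrete, here is a rough sketch of the build/push logic. The function names, variable names, and Dockerfile paths are illustrative assumptions, not the exact contents of `docker-image-tool.sh`:

```shell
#!/bin/sh
# Illustrative sketch only -- not the actual patch from this PR.
DOCKER="${DOCKER:-docker}"   # override with DOCKER=echo for a dry run
REPO="${REPO:-spark-local}"  # image repository prefix (assumed name)
TAG="${TAG:-latest}"         # image tag
NOCACHE="${NOCACHE:-}"       # set to --no-cache to force a clean build

build_images() {
  # Build both the JVM-only and Python images, forwarding the cache flag
  # so stale layers cannot leak old Spark classes into the image.
  "$DOCKER" build $NOCACHE -t "$REPO/spark:$TAG" \
    -f kubernetes/dockerfiles/spark/Dockerfile .
  "$DOCKER" build $NOCACHE -t "$REPO/spark-py:$TAG" \
    -f kubernetes/dockerfiles/spark/bindings/python/Dockerfile .
}

push_images() {
  # Push BOTH images so spark-py is no longer forgotten.
  "$DOCKER" push "$REPO/spark:$TAG"
  "$DOCKER" push "$REPO/spark-py:$TAG"
}
```

Running with `NOCACHE=--no-cache` forces a clean rebuild, which avoids the serialVersionUID mismatch above, and `push_images` always pushes both images.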
How was this patch tested?
I've tested it on my own Spark on k8s deployment.