Create base-image and minimize layer count #324
dockerfiles/driver/Dockerfile:
@@ -15,26 +15,13 @@
 # limitations under the License.
 #
 
-FROM openjdk:8-alpine
+FROM spark-base
 
 # If this docker file is being used in the context of building your images from a Spark distribution, the docker build
 # command should be invoked from the top level directory of the Spark distribution. E.g.:
 # docker build -t spark-driver:latest -f dockerfiles/driver/Dockerfile .
 
-RUN apk upgrade --update
-RUN apk add --update bash tini
-RUN mkdir -p /opt/spark
-RUN touch /opt/spark/RELEASE
-
-ADD jars /opt/spark/jars
-ADD examples /opt/spark/examples
-ADD bin /opt/spark/bin
-ADD sbin /opt/spark/sbin
-ADD conf /opt/spark/conf
-
-ENV SPARK_HOME /opt/spark
-
-WORKDIR /opt/spark
+COPY examples /opt/spark/examples
Review thread:

Comment: What's the difference between ADD and COPY?

Comment: Looks like ADD also supports unpacking tarballs.

Comment: Good summary of the differences.

Comment (erikerlandson): The support for unpacking tarballs is actually one of the side-effects of ADD that is considered harder to reason about. I think the idea is that, just looking at a Dockerfile, you may not be able to tell exactly what operations that ADD is doing. I don't have a real strong opinion on that; it's arguably more convenient, too. But that's the current "official" position.

Comment: Like @erikerlandson said, I would prefer to use COPY.
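To make the distinction above concrete, here is a minimal Dockerfile sketch (the archive name is hypothetical, purely for illustration): COPY places a file in the image verbatim, while ADD transparently extracts a local tar archive as a side-effect, which is exactly the behavior the thread calls harder to reason about.

    FROM alpine:3.6

    # COPY is literal: the archive lands in the image still packed.
    COPY spark-jars.tgz /opt/

    # ADD auto-extracts a local tar archive into the destination, so the
    # resulting image contents are not obvious from reading this line alone.
    ADD spark-jars.tgz /opt/jars/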
 
 CMD SPARK_CLASSPATH="${SPARK_HOME}/jars/*" && \
     if ! [ -z ${SPARK_MOUNTED_CLASSPATH+x} ]; then SPARK_CLASSPATH="$SPARK_MOUNTED_CLASSPATH:$SPARK_CLASSPATH"; fi && \
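One note on the CMD context lines above: ${SPARK_MOUNTED_CLASSPATH+x} is the POSIX "+" parameter expansion, which yields x when the variable is set (even to the empty string) and nothing when it is unset, so the guard prepends the mounted classpath only when one was provided. A standalone sketch of the idiom:

    # ${VAR+x} expands to "x" iff VAR is set, so `! [ -z ${VAR+x} ]` means "VAR is set".
    unset MAYBE
    if ! [ -z ${MAYBE+x} ]; then echo set; else echo unset; fi   # prints: unset
    MAYBE=""
    if ! [ -z ${MAYBE+x} ]; then echo set; else echo unset; fi   # prints: set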
dockerfiles/shuffle/Dockerfile:
@@ -15,25 +15,12 @@
 # limitations under the License.
 #
 
-FROM openjdk:8-alpine
+FROM spark-base
Review thread:

Comment: Should this have a tag, i.e. …?

Comment: imo there should be a tag per Spark release, something like … I'd propose renaming it …

Comment: How would we handle snapshots then? Or are we only going to push images on every release, at which point we update all of the Dockerfiles?

Comment: Ideally these would have version tags like … We'd need to have some way of templating Dockerfiles to do that, though, I think.

Comment: I'm not familiar with how people treat Hadoop versions, but if we need multiple Hadoop versions as part of the build cross-product, then that would also be a part of the tag.

Comment: I think what we need is some string replacement in the release process, e.g. when version …

Comment: As long as we build in the correct order (spark-base first) I think we're fine. @erikerlandson, do you feel strongly about renaming this to …?

Comment (erikerlandson): My instinct is that it would be preferable to name them all …

Comment: I think we want …
 
 # If this docker file is being used in the context of building your images from a Spark distribution, the docker build
 # command should be invoked from the top level directory of the Spark distribution. E.g.:
 # docker build -t spark-shuffle:latest -f dockerfiles/shuffle/Dockerfile .
 
-RUN apk upgrade --update
-RUN apk add --update bash tini
-RUN mkdir -p /opt/spark
-RUN touch /opt/spark/RELEASE
-
-ADD jars /opt/spark/jars
-ADD examples /opt/spark/examples
-ADD bin /opt/spark/bin
-ADD sbin /opt/spark/sbin
-ADD conf /opt/spark/conf
-
-ENV SPARK_HOME /opt/spark
-
-WORKDIR /opt/spark
+COPY examples /opt/spark/examples
 
 ENTRYPOINT [ "/sbin/tini", "--", "bin/spark-class", "org.apache.spark.deploy.ExternalShuffleService", "1" ]
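Picking up the versioning thread above: one possible shape for the "string replacement in the release process" is a placeholder token in the dependent Dockerfiles that a release script substitutes before building. Everything here is a hypothetical sketch, not something this PR adds; the placeholder token, tag scheme, and version number are illustrative only.

    # Hypothetical release script. Assumes the dependent Dockerfiles say
    #   FROM spark-base:SPARK_VERSION_PLACEHOLDER
    # instead of a bare "FROM spark-base".
    VERSION=2.2.0                                   # illustrative release version

    docker build -t "spark-base:${VERSION}" -f dockerfiles/spark-base/Dockerfile .

    for img in driver shuffle; do                   # plus any other dependent images
        sed "s/SPARK_VERSION_PLACEHOLDER/${VERSION}/" "dockerfiles/${img}/Dockerfile" \
            > "dockerfiles/${img}/Dockerfile.release"
        docker build -t "spark-${img}:${VERSION}" -f "dockerfiles/${img}/Dockerfile.release" .
    done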
dockerfiles/spark-base/Dockerfile (new file):
@@ -0,0 +1,35 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+FROM openjdk:8-alpine
+
+# If this docker file is being used in the context of building your images from a Spark distribution, the docker build
+# command should be invoked from the top level directory of the Spark distribution. E.g.:
+# docker build -t spark-base:latest -f dockerfiles/spark-base/Dockerfile .
+
+RUN apk upgrade --no-cache && \
+    apk add --no-cache bash tini && \
+    mkdir -p /opt/spark && \
+    touch /opt/spark/RELEASE
+
+COPY jars /opt/spark/jars
+COPY bin /opt/spark/bin
+COPY sbin /opt/spark/sbin
+COPY conf /opt/spark/conf
+
+ENV SPARK_HOME /opt/spark
+
+WORKDIR /opt/spark
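Regarding the "minimize layer count" half of the PR title: every Dockerfile instruction adds an image layer, so folding the old images' four RUN commands into a single chained RUN produces one layer instead of four, and apk's --no-cache flag keeps the package index out of that layer entirely (the old --update form fetched the index into the image's apk cache without removing it). One way to see the effect, a sketch assuming a Spark distribution as the build context:

    # Build the base image and list its layers; the chained RUN shows up
    # as a single row in the history output.
    docker build -t spark-base -f dockerfiles/spark-base/Dockerfile .
    docker history spark-base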
Review comment: I think we need a note after this that the build order matters: spark-base must be built first, then the others can be built afterwards in any order.
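Following up on that note, a sketch of the required build order, using the image names from the comments in the Dockerfiles above and run from the top level of a Spark distribution:

    # spark-base must exist locally before any image whose Dockerfile
    # begins with "FROM spark-base" can be built.
    docker build -t spark-base -f dockerfiles/spark-base/Dockerfile .

    # Once the base exists, the dependent images can be built in any order.
    docker build -t spark-driver:latest -f dockerfiles/driver/Dockerfile .
    docker build -t spark-shuffle:latest -f dockerfiles/shuffle/Dockerfile .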