This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Document how to publish apache-spark-on-k8s releases #438

Open
ash211 opened this issue Aug 16, 2017 · 2 comments


ash211 commented Aug 16, 2017

cc @erikerlandson

For reference, I use this script to build the Palantir Spark dist locally, which might be usable as a starting point:

#!/bin/bash

set -eu

spark_version=$(git describe --tags)
hadoop_version=hadoop-2.8.0-palantir6
docker_base=my.host.example.com/spark

echo cleaning...
./build/mvn clean

echo setting version number for publish...
./build/mvn versions:set -DnewVersion="$spark_version"

echo building jars and publishing locally...
./build/mvn -DskipTests -Phadoop-cloud -Phadoop-palantir -Pkinesis-asl -Pkubernetes -Phive -Pyarn install

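# NOTE: this exit stops the script after the local install; remove it to run the distribution and Docker steps below
exit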

echo making distribution...
./dev/make-distribution.sh --name "$hadoop_version" --tgz -Phadoop-cloud -Phadoop-palantir -Pkinesis-asl -Pkubernetes -Phive -Pyarn

echo publishing to local dir...
combined_version="${spark_version}-${hadoop_version}"
mkdir -p ~/.m2/repository/org/apache/spark/spark-dist/"$combined_version"/
cp "spark-dist-$combined_version.tgz" ~/.m2/repository/org/apache/spark/spark-dist/"$combined_version"/

echo building docker images...
tar -zxvf "spark-dist-$combined_version.tgz"
pushd "spark-dist-$combined_version/"

for type in driver executor init-container resource-staging-server shuffle-service; do
    docker build -t "$docker_base/$type:$spark_version" -f "dockerfiles/$type/Dockerfile" .
done

echo publishing docker images...
for type in driver executor init-container resource-staging-server shuffle-service; do
    docker push "$docker_base/$type:$spark_version"
done

popd
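
The two Docker loops above repeat the same image list and tag format, so one way to keep them in sync is to derive both from a single helper. The following is only a dry-run sketch: `image_ref` and `print_docker_commands` are hypothetical names that do not appear in the script, and the functions print the commands rather than running them.

```shell
# Image types copied from the loops above.
image_types="driver executor init-container resource-staging-server shuffle-service"

# Hypothetical helper: prints the image reference for one image type.
image_ref() (
    docker_base=$1; type=$2; version=$3
    printf '%s/%s:%s\n' "$docker_base" "$type" "$version"
)

# Hypothetical helper: prints (does not run) the docker commands,
# all builds first and then all pushes, as in the script above.
print_docker_commands() (
    docker_base=$1; version=$2
    for type in $image_types; do
        echo "docker build -t $(image_ref "$docker_base" "$type" "$version") -f dockerfiles/$type/Dockerfile ."
    done
    for type in $image_types; do
        echo "docker push $(image_ref "$docker_base" "$type" "$version")"
    done
)
```

Calling `print_docker_commands "$docker_base" "$spark_version"` lists the ten commands for review before anything is built or pushed.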
@erikerlandson
Member

Thanks @ash211 - For reference and comparison, I built the 2.2 release using the following:

$ ./dev/make-distribution.sh --pip --tgz -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver -Pkubernetes -Phadoop-2.7 -Dhadoop.version=2.7.3
$ ./dev/make-distribution.sh --pip --tgz -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver -Pkubernetes -Phadoop-2.7 -Dhadoop.version=2.7.3 -Phadoop-provided

I noticed that all but one unit test succeeded using only -Pkubernetes -Phadoop-2.7 -Dhadoop.version=2.7.3.

I was unable to run the integration tests out of the box because I use KVM instead of VirtualBox for minikube. However, I can probably spin minikube up separately and run them in shared-cluster mode; I still need to test that.

@erikerlandson
Member

To build without the Hadoop dependencies included, you have to use -Phadoop-provided, which I'm not sure everybody knows. Simply omitting -Phadoop-x.x does not build without Hadoop; it defaults to 2.2.
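
As a rough illustration of the difference, the two distribution commands quoted earlier in this thread differ only in the trailing -Phadoop-provided. The sketch below assembles either variant from one flag list; `dist_command` is a hypothetical helper, not part of the Spark build scripts, and it only prints the command line.

```shell
# Flags copied from the two commands quoted earlier in this thread.
base_flags="--pip --tgz -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver -Pkubernetes -Phadoop-2.7 -Dhadoop.version=2.7.3"

# Hypothetical helper: prints the make-distribution command line.
# Pass "provided" to leave the Hadoop dependencies out of the distribution.
dist_command() (
    mode=${1:-bundled}
    flags=$base_flags
    if [ "$mode" = "provided" ]; then
        flags="$flags -Phadoop-provided"
    fi
    echo "./dev/make-distribution.sh $flags"
)
```

`dist_command` prints the variant that bundles Hadoop 2.7.3, and `dist_command provided` prints the one without Hadoop dependencies.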
