[SPARK-24372][build] Add scripts to help with preparing releases. #21515

Closed
wants to merge 11 commits into from
1 change: 1 addition & 0 deletions dev/.rat-excludes
@@ -106,3 +106,4 @@ spark-warehouse
structured-streaming/*
kafka-source-initial-offset-version-2.1.0.bin
kafka-source-initial-offset-future-version.bin
vote.tmpl
Member

Even if RAT doesn't check it, isn't vote.tmpl still packaged into the source release this way?

Contributor Author

Are you saying this file should not be packaged in the source release? Not sure I see why that would be the case. There's a lot of stuff in .rat-excludes that is still packaged.

143 changes: 143 additions & 0 deletions dev/create-release/do-release-docker.sh
@@ -0,0 +1,143 @@
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#
# Creates a Spark release candidate. The script will update versions, tag the branch,
# build Spark binary packages and documentation, and upload maven artifacts to a staging
# repository. There is also a dry run mode where only local builds are performed, and
# nothing is uploaded to the ASF repos.
#
# Run with "-h" for options.
#

set -e
SELF=$(cd "$(dirname "$0")" && pwd)
. "$SELF/release-util.sh"

function usage {
  local NAME=$(basename "$0")
  cat <<EOF
Usage: $NAME [options]

This script runs the release scripts inside a docker image. The image is hardcoded to be called
"spark-rm" and will be re-generated (as needed) on every invocation of this script.

Options are:

  -d [path]   : required: working directory (output will be written to an "output" directory in
                the working directory).
  -n          : dry run mode. Performs checks and local builds, but does not upload anything.
  -t [tag]    : tag for the spark-rm docker image to use for building (default: "latest").
  -j [path]   : path to local JDK installation to use for building. By default the script will
                use openjdk8 installed in the docker image.
  -s [step]   : runs a single step of the process; valid steps are: tag, build, docs, publish
EOF
}

WORKDIR=
IMGTAG=latest
JAVA=
RELEASE_STEP=
while getopts "d:hj:ns:t:" opt; do
  case $opt in
    d) WORKDIR="$OPTARG" ;;
    n) DRY_RUN=1 ;;
    t) IMGTAG="$OPTARG" ;;
    j) JAVA="$OPTARG" ;;
    s) RELEASE_STEP="$OPTARG" ;;
    h) usage ;;
    ?) error "Invalid option. Run with -h for help." ;;
  esac
done

if [ -z "$WORKDIR" ] || [ ! -d "$WORKDIR" ]; then
  error "Work directory (-d) must be defined and exist. Run with -h for help."
fi

if [ -d "$WORKDIR/output" ]; then
  read -p "Output directory already exists. Overwrite and continue? [y/n] " ANSWER
  if [ "$ANSWER" != "y" ]; then
    error "Exiting."
  fi
fi

cd "$WORKDIR"
rm -rf "$WORKDIR/output"
mkdir "$WORKDIR/output"

get_release_info

# Place all RM scripts and necessary data in a local directory that must be defined in the command
# line. This directory is mounted into the image.
for f in "$SELF"/*; do
  if [ -f "$f" ]; then
    cp "$f" "$WORKDIR"
  fi
done

GPG_KEY_FILE="$WORKDIR/gpg.key"
fcreate_secure "$GPG_KEY_FILE"
$GPG --export-secret-key --armor "$GPG_KEY" > "$GPG_KEY_FILE"

run_silent "Building spark-rm image with tag $IMGTAG..." "docker-build.log" \
  docker build -t "spark-rm:$IMGTAG" --build-arg UID=$UID "$SELF/spark-rm"
Contributor

So we need to do export UID=xxx before running this script?

Contributor

Got it. This is a system variable, so we can't run this script as the root user...
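
The point in this thread can be illustrated with a small sketch. bash sets `UID` on its own at startup, so no `export UID=xxx` is needed before the `docker build` line. The helper `is_non_root_uid` below is hypothetical and not part of the PR; whether uid 0 actually breaks the image build depends on the `spark-rm` Dockerfile, which is not shown here.

```shell
#!/usr/bin/env bash
# Sketch only: shows where the value behind "--build-arg UID=$UID" comes from.

# bash assigns UID at startup (as a readonly variable); other shells may not,
# so fall back to `id -u` when it is unset.
current_uid="${UID:-$(id -u)}"

# Hypothetical helper, not part of the PR: succeeds only for a non-root uid.
is_non_root_uid() {
  [ "$1" -ne 0 ]
}

if is_non_root_uid "$current_uid"; then
  echo "would pass: --build-arg UID=$current_uid"
else
  echo "running as root (uid 0); a guard like this could stop before docker build"
fi
```

A check of this kind could run before the image is built, so that a root invocation fails early with a clear message.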


# Write the release information to a file with environment variables to be used when running the
# image.
ENVFILE="$WORKDIR/env.list"
fcreate_secure "$ENVFILE"

function cleanup {
  rm -f "$ENVFILE"
  rm -f "$GPG_KEY_FILE"
}

trap cleanup EXIT
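
The lines above create the secrets files with restricted permissions and register a cleanup on exit. A minimal sketch of that pattern follows. `fcreate_secure_sketch` is an assumed stand-in, since the real `fcreate_secure` lives in `release-util.sh` and is not shown in this diff, and the file contents are placeholder values.

```shell
#!/usr/bin/env bash
# Sketch of the "create with tight permissions, then remove on exit" pattern.

# Assumed stand-in for fcreate_secure: create an empty file that only its
# owner can read or write.
fcreate_secure_sketch() {
  rm -f "$1"
  (umask 077 && : > "$1")
}

workdir=$(mktemp -d)
envfile="$workdir/env.list"

# Run the sensitive part in a subshell so the EXIT trap fires when it ends.
(
  trap 'rm -f "$envfile"' EXIT
  fcreate_secure_sketch "$envfile"
  # Writing to an existing file keeps its permission bits.
  printf 'DRY_RUN=%s\nRELEASE_STEP=%s\n' "1" "docs" > "$envfile"
  ls -l "$envfile" | cut -c1-10   # permission bits only
)

# After the subshell exits, the trap has already deleted the file.
if [ -e "$envfile" ]; then echo "still present"; else echo "cleaned up"; fi
rmdir "$workdir"
```

The order matters: the file is created with restricted permissions before any secret is written to it, so there is no window in which the contents are readable by other users.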

cat > "$ENVFILE" <<EOF
DRY_RUN=$DRY_RUN
SKIP_TAG=$SKIP_TAG
RUNNING_IN_DOCKER=1
GIT_BRANCH=$GIT_BRANCH
NEXT_VERSION=$NEXT_VERSION
RELEASE_VERSION=$RELEASE_VERSION
RELEASE_TAG=$RELEASE_TAG
GIT_REF=$GIT_REF
SPARK_PACKAGE_VERSION=$SPARK_PACKAGE_VERSION
ASF_USERNAME=$ASF_USERNAME
GIT_NAME=$GIT_NAME
GIT_EMAIL=$GIT_EMAIL
GPG_KEY=$GPG_KEY
ASF_PASSWORD=$ASF_PASSWORD
GPG_PASSPHRASE=$GPG_PASSPHRASE
RELEASE_STEP=$RELEASE_STEP
EOF

JAVA_VOL=
if [ -n "$JAVA" ]; then
  echo "JAVA_HOME=/opt/spark-java" >> "$ENVFILE"
  JAVA_VOL="--volume $JAVA:/opt/spark-java"
fi

echo "Building $RELEASE_TAG; output will be at $WORKDIR/output"
docker run -ti \
  --env-file "$ENVFILE" \
  --volume "$WORKDIR:/opt/spark-rm" \
  $JAVA_VOL \
  "spark-rm:$IMGTAG"
81 changes: 81 additions & 0 deletions dev/create-release/do-release.sh
@@ -0,0 +1,81 @@
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

SELF=$(cd "$(dirname "$0")" && pwd)
. "$SELF/release-util.sh"

while getopts ":b:n" opt; do
Contributor

It would be nice to have a high-level description in the script, just saying that this does a release (tagging, building, etc.) and pushes things to the ASF Spark repo.
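
A separate note on the option loop this comment is attached to: the `b)` branch reads `$OPTARG`, which `getopts` only fills in when the option letter is followed by a colon in the option string. The sketch below compares the two spellings. `parse_branch` and the branch name `branch-2.3` are hypothetical, used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: how getopts treats "b" with and without a trailing colon.

# Hypothetical helper for illustration; prints what -b received.
parse_branch() {
  local optstring="$1"; shift
  local opt branch="" OPTIND=1 OPTARG
  while getopts "$optstring" opt; do
    case $opt in
      b) branch="$OPTARG" ;;
      n) : ;;   # dry-run flag, ignored in this sketch
    esac
  done
  echo "branch=[$branch]"
}

# No colon: -b is a plain flag, so OPTARG stays empty and parsing stops at
# the first non-option word.
parse_branch "bn" -b branch-2.3 -n

# Colon after b: the next word becomes OPTARG. The leading colon enables
# silent mode, in which OPTARG holds the offending letter for invalid options.
parse_branch ":b:n" -b branch-2.3 -n
```

Silent mode is also what makes an error message of the form "Invalid option: $OPTARG" print something useful.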

  case $opt in
    b) GIT_BRANCH=$OPTARG ;;
    n) DRY_RUN=1 ;;
    ?) error "Invalid option: $OPTARG" ;;
  esac
done

if [ "$RUNNING_IN_DOCKER" = "1" ]; then
  # Inside docker, need to import the GPG key stored in the current directory.
  echo "$GPG_PASSPHRASE" | $GPG --passphrase-fd 0 --import "$SELF/gpg.key"

  # We may need to adjust the path since JAVA_HOME may be overridden by the driver script.
  if [ -n "$JAVA_HOME" ]; then
    export PATH="$JAVA_HOME/bin:$PATH"
  else
    # JAVA_HOME for the openjdk package.
    export JAVA_HOME=/usr
  fi
else
  # Outside docker, need to ask for information about the release.
  get_release_info
fi

function should_build {
  local WHAT=$1
  [ -z "$RELEASE_STEP" ] || [ "$WHAT" = "$RELEASE_STEP" ]
}

if should_build "tag" && [ "$SKIP_TAG" = 0 ]; then
  run_silent "Creating release tag $RELEASE_TAG..." "tag.log" \
    "$SELF/release-tag.sh"
  echo "It may take some time for the tag to be synchronized to GitHub."
  echo "Press enter when you've verified that the new tag ($RELEASE_TAG) is available."
  read
else
  echo "Skipping tag creation for $RELEASE_TAG."
fi

if should_build "build"; then
  run_silent "Building Spark..." "build.log" \
    "$SELF/release-build.sh" package
else
  echo "Skipping build step."
fi

if should_build "docs"; then
  run_silent "Building documentation..." "docs.log" \
    "$SELF/release-build.sh" docs
else
  echo "Skipping docs step."
fi

if should_build "publish"; then
  run_silent "Publishing release" "publish.log" \
    "$SELF/release-build.sh" publish-release
else
  echo "Skipping publish step."
fi
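
The step selection in this script can be exercised in isolation. The sketch below copies the `should_build` test and adds a hypothetical helper, `selected_steps`, that lists which of the four steps would run for a given `RELEASE_STEP`. The helper is an illustration only and not part of the PR.

```shell
#!/usr/bin/env bash
# Sketch of the step filter: an empty RELEASE_STEP selects every step,
# otherwise only the matching one.

should_build() {
  [ -z "$RELEASE_STEP" ] || [ "$1" = "$RELEASE_STEP" ]
}

# Hypothetical helper for this sketch: list the steps that would run.
selected_steps() {
  local step out=""
  for step in tag build docs publish; do
    if should_build "$step"; then out="$out $step"; fi
  done
  echo "${out# }"
}

RELEASE_STEP=""
selected_steps    # every step is selected

RELEASE_STEP="docs"
selected_steps    # only the docs step is selected
```

One consequence of this design is that a step name outside the four known ones matches nothing, so every step is skipped without an error. Validating the `-s` value up front would be one way to catch a typo.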