Improve caching for multi-platform images. #23562

Merged 1 commit on May 9, 2022
3 changes: 1 addition & 2 deletions .github/workflows/build-images.yml
@@ -39,7 +39,6 @@ env:
secrets.CONSTRAINTS_GITHUB_REPOSITORY || 'apache/airflow' }}
# This token is WRITE one - pull_request_target type of events always have the WRITE token
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
LOGIN_TO_GITHUB_REGISTRY: "true"
Comment from the PR author (Member):

Not needed. We always log in during CI if GITHUB_TOKEN is available.

IMAGE_TAG_FOR_THE_BUILD: "${{ github.event.pull_request.head.sha || github.sha }}"

concurrency:
@@ -135,7 +134,7 @@ jobs:
if [[ "${{ github.event_name }}" == 'schedule' ]]; then
echo "::set-output name=cacheDirective::disabled"
else
echo "::set-output name=cacheDirective::pulled"
echo "::set-output name=cacheDirective::registry"
Comment from the PR author (Member):

"registry" is a better name for the default caching mode, since we no longer pull the image to use it as a cache.
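
As an illustration only, the selection logic in this workflow step can be sketched as a small shell function. The name `select_cache_directive` is hypothetical, and the sketch prints the value instead of emitting the `::set-output` workflow command.

```shell
# Hypothetical helper mirroring the workflow step: scheduled builds skip the
# cache, every other event type uses the cache kept in the registry.
select_cache_directive() {
    # $1: the GitHub event name, e.g. "schedule" or "pull_request"
    if [ "$1" = "schedule" ]; then
        echo "disabled"
    else
        echo "registry"
    fi
}

select_cache_directive "pull_request"
```

The value names are taken from the diff above; how the build tooling interprets each directive is not shown here.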

fi

if [[ "$SELECTIVE_CHECKS_IMAGE_BUILD" == "true" ]]; then
10 changes: 5 additions & 5 deletions .github/workflows/ci.yml
@@ -45,7 +45,6 @@ env:
secrets.CONSTRAINTS_GITHUB_REPOSITORY || 'apache/airflow' }}
# In builds from forks, this token is read-only. For scheduler/direct push it is WRITE one
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
LOGIN_TO_GITHUB_REGISTRY: "true"
ENABLE_TEST_COVERAGE: "${{ github.event_name == 'push' }}"
IMAGE_TAG_FOR_THE_BUILD: "${{ github.event.pull_request.head.sha || github.sha }}"

@@ -265,7 +264,7 @@ jobs:
if [[ "${{ github.event_name }}" == 'schedule' ]]; then
echo "::set-output name=cacheDirective::disabled"
else
echo "::set-output name=cacheDirective::pulled"
echo "::set-output name=cacheDirective::registry"
fi

if [[ "$SELECTIVE_CHECKS_IMAGE_BUILD" == "true" ]]; then
@@ -1608,9 +1607,10 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
IMAGE_TAG: ${{ env.IMAGE_TAG_FOR_THE_BUILD }}
- name: "Generate constraints"
run: |
breeze generate-constraints --run-in-parallel --generate-constraints-mode source-providers
Comment from the PR author (Member):

These are longer names for the constraints mode, but it makes sense to make them match the file prefixes of the constraints in https://github.com/apache/airflow/tree/constraints-main
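
A minimal sketch of why matching names help: the mode value can be dropped straight into the constraints file name, following the `<mode>-<python version>.txt` pattern used in the Dockerfile part of this change. The function name `constraints_url` and the default branch argument are assumptions made for the example.

```shell
# Hypothetical helper: builds the constraints file URL from the mode name.
# $1: constraints mode (same as the file prefix)
# $2: Python major.minor version
# $3: constraints branch or tag (assumed default: constraints-main)
constraints_url() {
    printf 'https://raw.githubusercontent.com/apache/airflow/%s/%s-%s.txt\n' \
        "${3:-constraints-main}" "$1" "$2"
}

# Print the URL for each of the three modes named in this change.
for mode in constraints constraints-source-providers constraints-no-providers; do
    constraints_url "${mode}" "3.7"
done
```

With the old short names (`pypi-providers`, `source-providers`, `no-providers`) a separate mapping from mode to file prefix would be needed.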

breeze generate-constraints --run-in-parallel --generate-constraints-mode pypi-providers
breeze generate-constraints --run-in-parallel --generate-constraints-mode no-providers
breeze generate-constraints --run-in-parallel \
--airflow-constraints-mode constraints-source-providers
breeze generate-constraints --run-in-parallel --airflow-constraints-mode constraints-no-providers
breeze generate-constraints --run-in-parallel --airflow-constraints-mode constraints
env:
PYTHON_VERSIONS: ${{ needs.build-info.outputs.pythonVersionsListAsString }}
- name: "Set constraints branch name"
9 changes: 4 additions & 5 deletions BREEZE.rst
@@ -1128,23 +1128,22 @@ all or selected python version and single constraint mode like this:

.. code-block:: bash

breeze generate-constraints --generate-constraints-mode pypi-providers
breeze generate-constraints --airflow-constraints-mode constraints

Constraints are generated separately for each python version and there are separate constraints modes:

* 'constraints' - those are constraints generated by matching the current airflow version from sources
and providers that are installed from PyPI. Those are constraints used by the users who want to
install airflow with pip. Use ``pypi-providers`` mode for that.
install airflow with pip.

* "constraints-source-providers" - those are constraints generated by using providers installed from
current sources. While adding new providers their dependencies might change, so this set of providers
is the current set of the constraints for airflow and providers from the current main sources.
Those providers are used by CI system to keep "stable" set of constraints. Use
``source-providers`` mode for that.
Those providers are used by CI system to keep "stable" set of constraints.

* "constraints-no-providers" - those are constraints generated from only Apache Airflow, without any
providers. If you want to manage airflow separately and then add providers individually, you can
use those. Use ``no-providers`` mode for that.
use those.

Those are all available flags of ``generate-constraints`` command:

11 changes: 6 additions & 5 deletions CONTRIBUTING.rst
@@ -882,18 +882,19 @@ There are several sets of constraints we keep:
providers. If you want to manage airflow separately and then add providers individually, you can
use those. Those constraints are named ``constraints-no-providers-<PYTHON_MAJOR_MINOR_VERSION>.txt``.

We also have constraints with "source-providers" but they are used i

The first two can be used as constraints file when installing Apache Airflow in a repeatable way.
It can be done from the sources:

from the PyPI package:

.. code-block:: bash

pip install apache-airflow==2.2.5 \
pip install apache-airflow[google,amazon,async]==2.2.5 \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.2.5/constraints-3.7.txt"

The last one can be used to install Airflow in "minimal" mode - i.e when bare Airflow is installed without
extras.

When you install airflow from sources (in editable mode) you should use "constraints-source-providers"
instead (this accounts for the case when some providers have not yet been released and have conflicting
requirements).
@@ -909,14 +910,14 @@ This works also with extras - for example:
.. code-block:: bash

pip install ".[ssh]" \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints--source-providers-3.7.txt"
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"


There are different set of fixed constraint files for different python major/minor versions and you should
use the right file for the right python version.

If you want to update just airflow dependencies, without paying attention to providers, you can do it using
-no-providers constraint files as well.
``constraints-no-providers`` constraint files as well.

.. code-block:: bash

62 changes: 25 additions & 37 deletions Dockerfile
@@ -315,12 +315,14 @@ function install_airflow_dependencies_from_branch_tip() {
fi
# Install latest set of dependencies using constraints. In case constraints were upgraded and there
# are conflicts, this might fail, but it should be fixed in the following installation steps
set -x
Comment from the PR author (Member):

In case of an error, we can see which command was used.
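
A small sketch of the pattern, assuming a stand-in function instead of a real `pip` call: wrapping the commands in `set -x` / `set +x` makes the shell print each command to stderr before running it, so a failing step is visible in the build log. The names `failing_install` and `run_traced` are hypothetical.

```shell
# Hypothetical stand-in for a pip invocation that fails.
failing_install() {
    return 1
}

run_traced() {
    set -x    # start tracing: each command is echoed to stderr with a "+" prefix
    failing_install "some-package==1.0" || true
    set +x    # stop tracing so the following echo banners stay readable
}

run_traced
```

Pairing every `set -x` with a `set +x` keeps the trace limited to the installation commands.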

pip install \
"https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}" || true
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
pip freeze | grep apache-airflow-providers | xargs pip uninstall --yes 2>/dev/null || true
set +x
echo
echo "${COLOR_BLUE}Uninstalling just airflow. Dependencies remain. Now target airflow can be reinstalled using mostly cached dependencies${COLOR_RESET}"
echo
@@ -384,7 +386,7 @@ function common::get_constraints_location() {
local constraints_base="https://raw.githubusercontent.com/${CONSTRAINTS_GITHUB_REPOSITORY}/${AIRFLOW_CONSTRAINTS_REFERENCE}"
local python_version
python_version="$(python --version 2>/dev/stdout | cut -d " " -f 2 | cut -d "." -f 1-2)"
AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS}-${python_version}.txt"
AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS_MODE}-${python_version}.txt"
fi
}

@@ -563,40 +565,19 @@ function install_airflow_and_providers_from_docker_context_files(){
return
fi

if [[ "${UPGRADE_TO_NEWER_DEPENDENCIES}" != "false" ]]; then
echo
echo "${COLOR_BLUE}Force re-installing airflow and providers from local files with eager upgrade${COLOR_RESET}"
echo
# force reinstall all airflow + provider package local files with eager upgrade
pip install "${pip_flags[@]}" --upgrade --upgrade-strategy eager \
${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages} \
${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS}
else
echo
echo "${COLOR_BLUE}Force re-installing airflow and providers from local files with constraints and upgrade if needed${COLOR_RESET}"
echo
if [[ ${AIRFLOW_CONSTRAINTS_LOCATION} == "/"* ]]; then
grep -ve '^apache-airflow' <"${AIRFLOW_CONSTRAINTS_LOCATION}" > /tmp/constraints.txt
else
# Remove provider packages from constraint files because they are locally prepared
curl -L "${AIRFLOW_CONSTRAINTS_LOCATION}" | grep -ve '^apache-airflow' > /tmp/constraints.txt
fi
# force reinstall airflow + provider package local files with constraints + upgrade if needed
pip install "${pip_flags[@]}" --force-reinstall \
${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages} \
--constraint /tmp/constraints.txt
rm /tmp/constraints.txt
# make sure correct PIP version is used \
pip install "pip==${AIRFLOW_PIP_VERSION}"
# then upgrade if needed without using constraints to account for new limits in setup.py
pip install --upgrade --upgrade-strategy only-if-needed \
${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages}
fi
echo
Comment from the PR author (Member):

There are cases where a newly generated provider package requires upgrading the dependencies. It does not matter whether the prod image has upgraded constraints, because we only run k8s tests on those prod images; even if some dependencies are incompatible, they will be detected by the tests that run in the CI image in "upgrade" builds.

echo "${COLOR_BLUE}Force re-installing airflow and providers from local files with eager upgrade${COLOR_RESET}"
echo
# force reinstall all airflow + provider package local files with eager upgrade
set -x
pip install "${pip_flags[@]}" --upgrade --upgrade-strategy eager \
${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages} \
${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS}
set +x

# make sure correct PIP version is left installed
pip install "pip==${AIRFLOW_PIP_VERSION}"
pip check

}

function install_all_other_packages_from_docker_context_files() {
@@ -608,10 +589,12 @@ function install_all_other_packages_from_docker_context_files() {
# shellcheck disable=SC2010
reinstalling_other_packages=$(ls /docker-context-files/*.{whl,tar.gz} 2>/dev/null | \
grep -v apache_airflow | grep -v apache-airflow || true)
if [[ -n "${reinstalling_other_packages}" ]]; then \
if [[ -n "${reinstalling_other_packages}" ]]; then
set -x
pip install --force-reinstall --no-deps --no-index ${reinstalling_other_packages}
# make sure correct PIP version is used
pip install "pip==${AIRFLOW_PIP_VERSION}"
set +x
fi
}

@@ -664,9 +647,11 @@ function install_airflow() {
if [[ -n "${AIRFLOW_INSTALL_EDITABLE_FLAG}" ]]; then
# Remove airflow and reinstall it using editable flag
# We can only do it when we install airflow from sources
set -x
pip uninstall apache-airflow --yes
pip install ${AIRFLOW_INSTALL_EDITABLE_FLAG} \
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
set +x
fi

# make sure correct PIP version is used
@@ -679,6 +664,7 @@ function install_airflow() {
echo
echo "${COLOR_BLUE}Installing all packages with constraints and upgrade if needed${COLOR_RESET}"
echo
set -x
pip install ${AIRFLOW_INSTALL_EDITABLE_FLAG} \
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}"
@@ -690,6 +676,7 @@ function install_airflow() {
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
@@ -718,18 +705,17 @@ set -euo pipefail

. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"


set -x

function install_additional_dependencies() {
if [[ "${UPGRADE_TO_NEWER_DEPENDENCIES}" != "false" ]]; then
echo
echo "${COLOR_BLUE}Installing additional dependencies while upgrading to newer dependencies${COLOR_RESET}"
echo
set -x
pip install --upgrade --upgrade-strategy eager \
${ADDITIONAL_PYTHON_DEPS} ${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS}
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
@@ -738,10 +724,12 @@ function install_additional_dependencies() {
echo
echo "${COLOR_BLUE}Installing additional dependencies upgrading only if needed${COLOR_RESET}"
echo
set -x
pip install --upgrade --upgrade-strategy only-if-needed \
${ADDITIONAL_PYTHON_DEPS}
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
@@ -1185,7 +1173,7 @@ ARG AIRFLOW_EXTRAS
ARG ADDITIONAL_AIRFLOW_EXTRAS=""
# Allows to override constraints source
ARG CONSTRAINTS_GITHUB_REPOSITORY="apache/airflow"
ARG AIRFLOW_CONSTRAINTS="constraints"
ARG AIRFLOW_CONSTRAINTS_MODE="constraints"
ARG AIRFLOW_CONSTRAINTS_REFERENCE=""
ARG AIRFLOW_CONSTRAINTS_LOCATION=""
ARG DEFAULT_CONSTRAINTS_BRANCH="constraints-main"
@@ -1275,7 +1263,7 @@ ENV AIRFLOW_PIP_VERSION=${AIRFLOW_PIP_VERSION} \
AIRFLOW_BRANCH=${AIRFLOW_BRANCH} \
AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}${ADDITIONAL_AIRFLOW_EXTRAS:+,}${ADDITIONAL_AIRFLOW_EXTRAS} \
CONSTRAINTS_GITHUB_REPOSITORY=${CONSTRAINTS_GITHUB_REPOSITORY} \
AIRFLOW_CONSTRAINTS=${AIRFLOW_CONSTRAINTS} \
AIRFLOW_CONSTRAINTS_MODE=${AIRFLOW_CONSTRAINTS_MODE} \
AIRFLOW_CONSTRAINTS_REFERENCE=${AIRFLOW_CONSTRAINTS_REFERENCE} \
AIRFLOW_CONSTRAINTS_LOCATION=${AIRFLOW_CONSTRAINTS_LOCATION} \
DEFAULT_CONSTRAINTS_BRANCH=${DEFAULT_CONSTRAINTS_BRANCH} \
19 changes: 13 additions & 6 deletions Dockerfile.ci
@@ -275,12 +275,14 @@ function install_airflow_dependencies_from_branch_tip() {
fi
# Install latest set of dependencies using constraints. In case constraints were upgraded and there
# are conflicts, this might fail, but it should be fixed in the following installation steps
set -x
pip install \
"https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}" || true
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
pip freeze | grep apache-airflow-providers | xargs pip uninstall --yes 2>/dev/null || true
set +x
echo
echo "${COLOR_BLUE}Uninstalling just airflow. Dependencies remain. Now target airflow can be reinstalled using mostly cached dependencies${COLOR_RESET}"
echo
@@ -344,7 +346,7 @@ function common::get_constraints_location() {
local constraints_base="https://raw.githubusercontent.com/${CONSTRAINTS_GITHUB_REPOSITORY}/${AIRFLOW_CONSTRAINTS_REFERENCE}"
local python_version
python_version="$(python --version 2>/dev/stdout | cut -d " " -f 2 | cut -d "." -f 1-2)"
AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS}-${python_version}.txt"
AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS_MODE}-${python_version}.txt"
fi
}

@@ -514,9 +516,11 @@ function install_airflow() {
if [[ -n "${AIRFLOW_INSTALL_EDITABLE_FLAG}" ]]; then
# Remove airflow and reinstall it using editable flag
# We can only do it when we install airflow from sources
set -x
pip uninstall apache-airflow --yes
pip install ${AIRFLOW_INSTALL_EDITABLE_FLAG} \
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
set +x
fi

# make sure correct PIP version is used
@@ -529,6 +533,7 @@ function install_airflow() {
echo
echo "${COLOR_BLUE}Installing all packages with constraints and upgrade if needed${COLOR_RESET}"
echo
set -x
pip install ${AIRFLOW_INSTALL_EDITABLE_FLAG} \
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}"
@@ -540,6 +545,7 @@ function install_airflow() {
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
@@ -568,18 +574,17 @@ set -euo pipefail

. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"


set -x

function install_additional_dependencies() {
if [[ "${UPGRADE_TO_NEWER_DEPENDENCIES}" != "false" ]]; then
echo
echo "${COLOR_BLUE}Installing additional dependencies while upgrading to newer dependencies${COLOR_RESET}"
echo
set -x
pip install --upgrade --upgrade-strategy eager \
${ADDITIONAL_PYTHON_DEPS} ${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS}
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
@@ -588,10 +593,12 @@ function install_additional_dependencies() {
echo
echo "${COLOR_BLUE}Installing additional dependencies upgrading only if needed${COLOR_RESET}"
echo
set -x
pip install --upgrade --upgrade-strategy only-if-needed \
${ADDITIONAL_PYTHON_DEPS}
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
@@ -1178,7 +1185,7 @@ ARG AIRFLOW_EXTRAS="all"
ARG ADDITIONAL_AIRFLOW_EXTRAS=""
# Allows to override constraints source
ARG CONSTRAINTS_GITHUB_REPOSITORY="apache/airflow"
ARG AIRFLOW_CONSTRAINTS="constraints"
ARG AIRFLOW_CONSTRAINTS_MODE="constraints-source-providers"
ARG AIRFLOW_CONSTRAINTS_REFERENCE=""
ARG AIRFLOW_CONSTRAINTS_LOCATION=""
ARG DEFAULT_CONSTRAINTS_BRANCH="constraints-main"
@@ -1207,7 +1214,7 @@ ENV AIRFLOW_REPO=${AIRFLOW_REPO}\
AIRFLOW_BRANCH=${AIRFLOW_BRANCH} \
AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}${ADDITIONAL_AIRFLOW_EXTRAS:+,}${ADDITIONAL_AIRFLOW_EXTRAS} \
CONSTRAINTS_GITHUB_REPOSITORY=${CONSTRAINTS_GITHUB_REPOSITORY} \
AIRFLOW_CONSTRAINTS=${AIRFLOW_CONSTRAINTS} \
AIRFLOW_CONSTRAINTS_MODE=${AIRFLOW_CONSTRAINTS_MODE} \
AIRFLOW_CONSTRAINTS_REFERENCE=${AIRFLOW_CONSTRAINTS_REFERENCE} \
AIRFLOW_CONSTRAINTS_LOCATION=${AIRFLOW_CONSTRAINTS_LOCATION} \
DEFAULT_CONSTRAINTS_BRANCH=${DEFAULT_CONSTRAINTS_BRANCH} \