Skip to content

Commit

Permalink
Removed the unused provider's distribution (apache#46608)
Browse files Browse the repository at this point in the history
This is a set of cleanup steps (first stage) that allow us to remove
the "intermediate" provider's distribution from Airlfow code and replace
it fully with individual provider's distributions - already with own
`pyproject.toml` files and basically being (when we complete) a
completely separate distributions from Airflow and without implicit
dependencies between unrelated distributions.

There are a number of other changes needed but that one is only focusing
on removing all references to the "umbrella" `providers` distribution
and consequences of removing it.

Those are the changes implemented in this PR:

* There are no separate "providers" system tests - each provider has
  own system tests and there are no common "generic" providers empty
  system test

* Integration tests are moved to respective providers under the
  `integration` package inside `tests` directory

* (nearly) empty __init__.py files are added in `tests` directories
  of providers - this way "tests" becomes just a directory and root
  for all tests per provider, rather than a Python package on its own.
  That allows to use "from integration.PROVIDER import" and
  "from system.PROVIDER" rather than importing them from the root of
  the whole airflow project. The (nearly) is because we need to
  handle multiple "system", "system.apache" and other import locations.

* Removed references to "providers/" generic package which were
  scheduled for removal after all providers are moved to the new
  structure

* Few remaining references / links referring to old "providers/src" and
  "providers/test" have been fixed.

* The "conftest.py" files in all providers are trimmed down - the code
  to store ignored deprecation warnings have been moved to the
  test_common pytest_plugin. That allows to remove 90+ duplicated
  snippets of deprecation_warnings retrieval while keeping the warnings
  per-provider in the provider's distribution.

* The "moving_providers" scripts are removed. They've done their job and
  are not needed any more - we keep them in history

* The __init__.py files are automatically checked and properly updated
  in provider folders - in order to properly handle path extension
  mechanisms

* The www tests that were using FAB permisssion model are moved to the
  FAB provider tests.
  • Loading branch information
potiuk authored Feb 15, 2025
1 parent 7513269 commit e027457
Show file tree
Hide file tree
Showing 604 changed files with 3,500 additions and 4,491 deletions.
3 changes: 1 addition & 2 deletions .github/workflows/integration-system-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -197,8 +197,7 @@ jobs:
use-uv: ${{ inputs.use-uv }}
- name: "System Tests"
run: >
./scripts/ci/testing/run_system_tests.sh
tests/system/example_empty.py providers/tests/system/example_empty.py
./scripts/ci/testing/run_system_tests.sh tests/system/example_empty.py
- name: "Post Tests success"
uses: ./.github/actions/post_tests_success
with:
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/special-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -218,7 +218,7 @@ jobs:
runs-on-as-json-default: ${{ inputs.runs-on-as-json-default }}
test-name: "SystemTest"
test-scope: "System"
test-groups: ${{ inputs.test-groups }}
test-groups: "['core']"
backend: "postgres"
python-versions: "['${{ inputs.default-python-version }}']"
backend-versions: "['${{ inputs.default-postgres-version }}']"
Expand Down
8 changes: 2 additions & 6 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -556,9 +556,6 @@ repos:
language: pygrep
entry: >
(?i)
.*https://github.*[0-9]/providers/.*/tests/system/|
.*https://github.*/main/providers/.*/tests/system/|
.*https://github.*/master/providers/.*/tests/system/|
.*https://github.*/main/providers/.*/src/airflow/providers/.*/example_dags/|
.*https://github.*/master/providers/.*/src/airflow/providers/.*/example_dags/
pass_filenames: true
Expand Down Expand Up @@ -674,7 +671,6 @@ repos:
^scripts/ci/docker-compose/integration-keycloak.yml$|
^scripts/ci/docker-compose/keycloak/keycloak-entrypoint.sh$|
^tests/|
^providers/tests/|
^providers/.*/tests/|
^.pre-commit-config\.yaml$|
^.*CHANGELOG\.(rst|txt)$|
Expand Down Expand Up @@ -920,6 +916,7 @@ repos:
entry: ./scripts/ci/pre_commit/check_providers_subpackages_all_have_init.py
language: python
require_serial: true
additional_dependencies: ['rich>=12.4.4']
- id: check-pre-commit-information-consistent
name: Validate hook IDs & names and sync with docs
entry: ./scripts/ci/pre_commit/check_pre_commit_hooks.py
Expand Down Expand Up @@ -983,7 +980,6 @@ repos:
entry: ./scripts/ci/pre_commit/check_system_tests.py
language: python
files: ^(providers/)?tests/system/.*/example_[^/]*\.py$
exclude: ^providers/google/tests/system/google/cloud/bigquery/example_bigquery_queries\.py$
pass_filenames: true
additional_dependencies: ['rich>=12.4.4']
- id: generate-pypi-readme
Expand Down Expand Up @@ -1386,7 +1382,7 @@ repos:
stages: ['manual']
name: Run mypy for providers (manual)
language: python
entry: ./scripts/ci/pre_commit/mypy_folder.py providers/src/airflow/providers all_new_providers
entry: ./scripts/ci/pre_commit/mypy_folder.py all_new_providers
pass_filenames: false
files: ^.*\.py$
require_serial: true
Expand Down
2 changes: 1 addition & 1 deletion PROVIDERS.rst
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ Airflow main branch to being decommissioned and removed from the main branch in

Technical details on how to manage lifecycle of providers are described in the document:

`Managing provider's lifecycle <https://github.com/apache/airflow/blob/main/providers/src/airflow/providers/MANAGING_PROVIDERS_LIFECYCLE.rst>`_
`Managing provider's lifecycle <https://github.com/apache/airflow/blob/main/providers/MANAGING_PROVIDERS_LIFECYCLE.rst>`_


Accepting new community providers
Expand Down
26 changes: 3 additions & 23 deletions contributing-docs/07_local_virtualenv.rst
Original file line number Diff line number Diff line change
Expand Up @@ -135,14 +135,6 @@ In a project like airflow it's important to have a consistent set of dependencie
You can use ``uv sync`` to install dependencies from ``pyproject.toml`` file. This will install all dependencies
from the ``pyproject.toml`` file in the current directory.

.. note::

We are currently in the process of moving providers from old structure (where all providers were under
``providers/src`` directory in a package structure shared between Providers) to a new structure
where each provider is a separate python package in ``providers`` directory. The "old" providers support
will be removed once we move all the providers to the new structure.


.. code:: bash
uv sync
Expand Down Expand Up @@ -188,32 +180,20 @@ run tests is to use ``pip`` to install airflow dependencies:

.. code:: bash
pip install -e "./providers"
pip install -e ".[devel,devel-tests,<OTHER EXTRAS>]" # for example: pip install -e ".[devel,devel-tests,google,postgres]"
This will install:

* old structure provider sources in ``editabl`e` mode - where sources are read from ``providers`` folder.
* airflow in ``editable`` mode - where sources of Airflow are taken directly from ``airflow`` source code.

You need to run this command in the virtualenv you want to install Airflow in -
and you need to have the virtualenv activated.

.. note::
This will install airflow in ``editable`` mode - where sources of
Airflow are taken directly from ``airflow`` source code.

For the providers that are already moved (i.e. have separate folder in ``providers`` directory), instead
of adding extra in airflow command you need to separately install the provider in the same venv. For example
to install ``airbyte`` provider you can run:
You need to run this command in the virtualenv you want to install Airflow in and you need to have the virtualenv activated.

.. code:: bash
pip install -e "./providers"
pip install -e ".[devel,devel-tests,<OTHER EXTRAS>]" # for example: pip install -e ".[devel,devel-tests,google,postgres]"
pip install -e "./providers/airbyte[devel]"
This will install:

* old structure provider sources in ``editable`` mode - where sources are read from ``providers/src`` folder
* airflow in ``editable`` mode - where sources of Airflow are taken directly from ``airflow`` source code.
* airbyte provider in ``editable`` mode - where sources are read from ``providers/airbyte`` folder

Expand Down
8 changes: 1 addition & 7 deletions contributing-docs/testing/system_tests.rst
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ There are multiple ways of running system tests. Each system test is a self-cont
other DAG. Some tests may require access to external services, enabled APIs or specific permissions. Make sure to
prepare your environment correctly, depending on the system tests you want to run - some may require additional
configuration which should be documented by the relevant providers in their subdirectory
``providers/tests/system/<provider_name>/README.md``.
``tests/system/<provider_name>/README.md``.

Running as Airflow DAGs
.......................
Expand Down Expand Up @@ -105,12 +105,6 @@ For core:
breeze testing system-tests tests/system/example_empty.py
For providers:

.. code-block:: bash
breeze testing system-tests providers/tests/system/example_empty.py
If you need to add some initialization of environment variables when entering Breeze, you can add a
``variables.env`` file in the ``files/airflow-breeze-config/variables.env`` file.

Expand Down
6 changes: 3 additions & 3 deletions contributing-docs/testing/unit_tests.rst
Original file line number Diff line number Diff line change
Expand Up @@ -871,7 +871,7 @@ will ask you to rebuild the image if it is needed and some new dependencies shou

.. code-block:: bash
breeze testing providers-tests providers/tests/http/hooks/test_http.py tests/core/test_core.py --db-reset --log-cli-level=DEBUG
breeze testing providers-tests providers/http/tests/http/hooks/test_http.py tests/core/test_core.py --db-reset --log-cli-level=DEBUG
You can run the whole core test suite without adding the test target:

Expand Down Expand Up @@ -1070,7 +1070,7 @@ directly to the container.

.. code-block:: bash
pytest providers/tests/<provider>/test.py
pytest providers/<provider>/tests/.../test.py
4. Iterate with the tests and providers. Both providers and tests are mounted from local sources so
changes you do locally in both - tests and provider sources are immediately reflected inside the
Expand Down Expand Up @@ -1207,7 +1207,7 @@ In case you want to reproduce canary run, you need to add ``--clean-airflow-inst

.. code-block:: bash
pytest providers/tests/<provider>/test.py
pytest providers/<provider>/tests/.../test.py
7. Iterate with the tests

Expand Down
2 changes: 1 addition & 1 deletion dev/README_AIRFLOW3_DEV.md
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ Make sure your code is only about Providers or Helm chart.
Avoid mixing core changes into the same PR

> [!NOTE]
> Please note that providers have been relocated from `airflow/providers` to `providers/src/airflow/providers`.
> Please note that providers have been relocated from `airflow/providers` to `providers/<provider_id>/src/airflow/providers`.
## Developing for Airflow 3 and 2.10.x / 2.11.x

Expand Down
2 changes: 1 addition & 1 deletion dev/breeze/src/airflow_breeze/commands/testing_commands.py
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,7 @@ def docker_compose_tests(
sys.exit(return_code)


TEST_PROGRESS_REGEXP = r"tests/.*|providers/.*/tests/.*|providers/tests/.*|task_sdk/tests/.*|.*=====.*"
TEST_PROGRESS_REGEXP = r"tests/.*|providers/.*/tests/.*|task_sdk/tests/.*|.*=====.*"
PERCENT_TEST_PROGRESS_REGEXP = r"^tests/.*\[[ \d%]*\].*|^\..*\[[ \d%]*\].*"


Expand Down
75 changes: 25 additions & 50 deletions dev/breeze/src/airflow_breeze/utils/run_tests.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,6 @@
from airflow_breeze.utils.path_utils import (
AIRFLOW_PROVIDERS_DIR,
AIRFLOW_SOURCES_ROOT,
OLD_TESTS_PROVIDERS_ROOT,
)
from airflow_breeze.utils.run_utils import run_command
from airflow_breeze.utils.virtualenv_utils import create_temp_venv
Expand Down Expand Up @@ -138,18 +137,7 @@ def test_paths(test_type: str, backend: str) -> tuple[str, str, str]:
def get_ignore_switches_for_provider(provider_folders: list[str]) -> list[str]:
args = []
for providers in provider_folders:
args.extend(
[
# TODO(potiuk): remove the old ways once we migrate all providers to the new structure
f"--ignore=providers/tests/{providers}",
f"--ignore=providers/tests/system/{providers}",
f"--ignore=providers/tests/integration/{providers}",
# New structure
f"--ignore=providers/{providers}/tests/",
f"--ignore=providers/{providers}/tests/system",
f"--ignore=providers/{providers}/tests/integration",
]
)
args.append(f"--ignore=providers/{providers}/tests/")
return args


Expand Down Expand Up @@ -198,15 +186,25 @@ def get_excluded_provider_args(python_version: str) -> list[str]:
for path in AIRFLOW_SOURCES_ROOT.glob("providers/*/*/tests/")
]
)
ALL_PROVIDER_INTEGRATION_TEST_FOLDERS: list[str] = sorted(
[
path.relative_to(AIRFLOW_SOURCES_ROOT).as_posix()
for path in AIRFLOW_SOURCES_ROOT.glob("providers/*/tests/integration/")
]
+ [
path.relative_to(AIRFLOW_SOURCES_ROOT).as_posix()
for path in AIRFLOW_SOURCES_ROOT.glob("providers/*/*/tests/integration/")
]
)


TEST_GROUP_TO_TEST_FOLDERS: dict[GroupOfTests, list[str]] = {
GroupOfTests.CORE: ["tests"],
# TODO(potiuk): remove me when we migrate all providers to new structure
GroupOfTests.PROVIDERS: [*ALL_NEW_PROVIDER_TEST_FOLDERS, "providers/tests"],
GroupOfTests.PROVIDERS: ALL_NEW_PROVIDER_TEST_FOLDERS,
GroupOfTests.TASK_SDK: ["task_sdk/tests"],
GroupOfTests.HELM: ["helm_tests"],
GroupOfTests.INTEGRATION_CORE: ["tests/integration"],
GroupOfTests.INTEGRATION_PROVIDERS: ["providers/tests/integration"],
GroupOfTests.INTEGRATION_PROVIDERS: ALL_PROVIDER_INTEGRATION_TEST_FOLDERS,
GroupOfTests.PYTHON_API_CLIENT: ["clients/python"],
}

Expand Down Expand Up @@ -293,41 +291,26 @@ def convert_test_type_to_pytest_args(
f"[info]Removing {provider_test_to_exclude} from {providers_with_exclusions}[/]"
)
providers_with_exclusions.remove(provider_test_to_exclude)
else:
# TODO(potiuk): remove me when all providers are migrated
get_console().print(f"[info]Adding {provider_test_to_exclude} to pytest ignores[/]")
providers_with_exclusions.append(
"--ignore=providers/tests/" + excluded_provider.replace(".", "/")
)
return providers_with_exclusions
if test_type.startswith(PROVIDERS_LIST_PREFIX):
provider_list = test_type[len(PROVIDERS_LIST_PREFIX) : -1].split(",")
providers_to_test = []
for provider in provider_list:
# TODO(potiuk) - remove when all providers are new-style
provider_path = OLD_TESTS_PROVIDERS_ROOT.joinpath(provider.replace(".", "/")).relative_to(
AIRFLOW_SOURCES_ROOT
provider_path = (
AIRFLOW_PROVIDERS_DIR.joinpath(provider.replace(".", "/")).relative_to(
AIRFLOW_SOURCES_ROOT
)
/ "tests"
)
if provider_path.is_dir():
providers_to_test.append(provider_path.as_posix())
else:
# TODO(potiuk) - remove when all providers are new-style
old_provider_path = provider_path
provider_path = (
AIRFLOW_PROVIDERS_DIR.joinpath(provider.replace(".", "/")).relative_to(
AIRFLOW_SOURCES_ROOT
)
/ "tests"
get_console().print(
f"[error] {provider_path} does not exist for {provider} "
"- which means that this provider has no tests. This is bad idea. "
"Please add it (all providers should have at least a package in tests)."
)
if provider_path.is_dir():
providers_to_test.append(provider_path.as_posix())
else:
get_console().print(
f"[error]Neither {old_provider_path} nor {provider_path} exist for {provider} "
"- which means that provider has no tests. This is bad idea. "
"Please add it (all providers should have a package in tests)"
)
sys.exit(1)
sys.exit(1)
return providers_to_test
if not test_type.startswith(PROVIDERS_PREFIX):
get_console().print(f"[error]Unknown test type for {GroupOfTests.PROVIDERS}: {test_type}[/]")
Expand Down Expand Up @@ -468,12 +451,4 @@ def convert_parallel_types_to_folders(test_group: GroupOfTests, parallel_test_ty
for group_folders in TEST_GROUP_TO_TEST_FOLDERS.values():
for group_folder in group_folders:
all_test_prefixes.append(group_folder)
folders = [arg for arg in args if any(arg.startswith(prefix) for prefix in all_test_prefixes)]
# remove specific provider sub-folders if "providers/tests" is already in the list
# This workarounds pytest issues where it will only run tests from specific subfolders
# if both parent and child folders are in the list
# The issue in Pytest (changed behaviour in Pytest 8.2 is tracked here
# https://github.com/pytest-dev/pytest/issues/12605
if "providers/tests" in folders:
folders = [folder for folder in folders if not folder.startswith("providers/tests/")]
return folders
return [arg for arg in args if any(arg.startswith(prefix) for prefix in all_test_prefixes)]
9 changes: 0 additions & 9 deletions dev/breeze/src/airflow_breeze/utils/selective_checks.py
Original file line number Diff line number Diff line change
Expand Up @@ -189,7 +189,6 @@ def __hash__(self):
r"^\.github/SECURITY\.rst$",
r"^airflow/.*\.py$",
r"^chart",
r"^providers/src/",
r"^providers/.*/src/",
r"^task_sdk/src/",
r"^tests/system",
Expand All @@ -209,9 +208,6 @@ def __hash__(self):
FileGroupForCi.KUBERNETES_FILES: [
r"^chart",
r"^kubernetes_tests",
r"^providers/src/airflow/providers/cncf/kubernetes/",
r"^providers/tests/cncf/kubernetes/",
r"^providers/tests/system/cncf/kubernetes/",
r"^providers/cncf/kubernetes/",
],
FileGroupForCi.ALL_PYTHON_FILES: [
Expand Down Expand Up @@ -273,11 +269,9 @@ def __hash__(self):
FileGroupForCi.ALL_AIRFLOW_PYTHON_FILES: [
r"^.*/.*_vendor/.*",
r"^airflow/migrations/.*",
r"^providers/src/airflow/providers/.*",
r"^providers/.*/src/airflow/providers/.*",
r"^dev/.*",
r"^docs/.*",
r"^providers/tests/.*",
r"^providers/.*/tests/.*",
r"^tests/dags/test_imports.py",
r"^task_sdk/src/airflow/sdk/.*\.py$",
Expand All @@ -287,7 +281,6 @@ def __hash__(self):
)

PYTHON_OPERATOR_FILES = [
r"^providers/src/providers/standard/operators/python.py",
r"^providers/tests/standard/operators/test_python.py",
]

Expand All @@ -310,8 +303,6 @@ def __hash__(self):
r"^tests/operators/",
],
SelectiveProvidersTestType.PROVIDERS: [
r"^providers/src/airflow/providers/",
r"^providers/tests/",
r"^providers/.*/src/airflow/providers/",
r"^providers/.*/tests/",
],
Expand Down
Loading

0 comments on commit e027457

Please sign in to comment.