Restructure Docs #27235

Merged · 10 commits · Dec 16, 2022
20 changes: 10 additions & 10 deletions .github/boring-cyborg.yml
@@ -67,8 +67,8 @@ labelPRBasedOnFilePath:
- airflow/kubernetes_executor_templates/**/*
- airflow/executors/kubernetes_executor.py
- airflow/executors/celery_kubernetes_executor.py
- docs/apache-airflow/executor/kubernetes.rst
- docs/apache-airflow/executor/celery_kubernetes.rst
- docs/apache-airflow/core-concepts/executor/kubernetes.rst
- docs/apache-airflow/core-concepts/executor/celery_kubernetes.rst
- docs/apache-airflow-providers-cncf-kubernetes/**/*
- kubernetes_tests/**/*

@@ -136,17 +136,17 @@ labelPRBasedOnFilePath:
- airflow/cli/**/*.py
- tests/cli/**/*.py
- docs/apache-airflow/cli-and-env-variables-ref.rst
- docs/apache-airflow/usage-cli.rst
- docs/apache-airflow/howto/usage-cli.rst

area:Lineage:
- airflow/lineage/**/*
- tests/lineage/**/*
- docs/apache-airflow/lineage.rst
- docs/apache-airflow/administration-and-deployment/lineage.rst

area:Logging:
- airflow/providers/**/log/*
- airflow/utils/log/**/*
- docs/apache-airflow/logging-monitoring/logging-*.rst
- docs/apache-airflow/administration-and-deployment/logging-monitoring/logging-*.rst
- tests/providers/**/log/*
- tests/utils/log/**/*

@@ -155,15 +155,15 @@ labelPRBasedOnFilePath:
- airflow/plugins_manager.py
- tests/cli/commands/test_plugins_command.py
- tests/plugins/**/*
- docs/apache-airflow/plugins.rst
- docs/apache-airflow/authoring-and-scheduling/plugins.rst

area:Scheduler/Executor:
- airflow/executors/**/*
- airflow/jobs/**/*
- airflow/task/task_runner/**/*
- airflow/dag_processing/**/*
- docs/apache-airflow/executor/**/*
- docs/apache-airflow/concepts/scheduler.rst
- docs/apache-airflow/core-concepts/executor/**/*
- docs/apache-airflow/administration-and-deployment/scheduler.rst
- tests/executors/**/*
- tests/jobs/**/*

@@ -172,14 +172,14 @@ labelPRBasedOnFilePath:
- airflow/providers/**/secrets/*
- tests/secrets/**/*
- tests/providers/**/secrets/*
- docs/apache-airflow/security/secrets/**/*
- docs/apache-airflow/administration-and-deployment/security/secrets/**/*

area:Serialization:
- airflow/serialization/**/*
- airflow/models/serialized_dag.py
- tests/serialization/**/*
- tests/models/test_serialized_dag.py
- docs/apache-airflow/dag-serialization.rst
- docs/apache-airflow/administration-and-deployment/dag-serialization.rst

area:core-operators:
- airflow/operators/**/*
4 changes: 2 additions & 2 deletions RELEASE_NOTES.rst
@@ -429,14 +429,14 @@ If you have the producer and consumer in different files you do not need to use
Datasets represent the abstract concept of a dataset, and (for now) do not have any direct read or write
capability - in this release we are adding the foundational feature that we will build upon.

For more info on Datasets please see :doc:`/concepts/datasets`.
For more info on Datasets please see :doc:`/authoring-and-scheduling/datasets`.
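To ground the release note, here is a minimal, hedged sketch of dataset-driven scheduling (Airflow 2.4+); the URI, DAG ids, and callables are illustrative only:

```python
from datetime import datetime

from airflow import DAG
from airflow.datasets import Dataset
from airflow.operators.python import PythonOperator

example_dataset = Dataset("s3://example-bucket/orders.parquet")  # illustrative URI

with DAG(dag_id="orders_producer", start_date=datetime(2022, 12, 1), schedule="@daily"):
    PythonOperator(
        task_id="update_orders",
        python_callable=lambda: None,   # write the data here
        outlets=[example_dataset],      # marks the dataset as updated when the task succeeds
    )

with DAG(dag_id="orders_consumer", start_date=datetime(2022, 12, 1), schedule=[example_dataset]):
    PythonOperator(task_id="process_orders", python_callable=lambda: None)
```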

Expanded dynamic task mapping support
"""""""""""""""""""""""""""""""""""""

Dynamic task mapping now includes support for ``expand_kwargs``, ``zip`` and ``map``.

For more info on dynamic task mapping please see :doc:`/concepts/dynamic-task-mapping`.
For more info on dynamic task mapping please see :doc:`/authoring-and-scheduling/dynamic-task-mapping`.
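A hedged sketch of what the expanded mapping support can look like in DAG code; the task names and values are made up, and only ``expand_kwargs`` is shown:

```python
from datetime import datetime

from airflow import DAG
from airflow.decorators import task

with DAG(dag_id="mapping_example", start_date=datetime(2022, 12, 1), schedule=None):

    @task
    def make_kwargs():
        # each dict expands into one mapped task instance
        return [{"x": 1, "y": 10}, {"x": 2, "y": 20}]

    @task
    def add(x, y):
        return x + y

    # expand_kwargs maps over keyword-argument sets; zip/map compose upstream results similarly
    add.expand_kwargs(make_kwargs())
```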

DAGS used in a context manager no longer need to be assigned to a module variable (#23592)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
2 changes: 1 addition & 1 deletion airflow/hooks/subprocess.py
@@ -53,7 +53,7 @@ def run_command(
environment in which ``command`` will be executed. If omitted, ``os.environ`` will be used.
Note, that in case you have Sentry configured, original variables from the environment
will also be passed to the subprocess with ``SUBPROCESS_`` prefix. See
:doc:`/logging-monitoring/errors` for details.
:doc:`/administration-and-deployment/logging-monitoring/errors` for details.
:param output_encoding: encoding to use for decoding stdout
:param cwd: Working directory to run the command in.
If None (default), the command is run in a temporary directory.
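As a rough usage illustration of this hook (the command and environment values below are placeholders):

```python
from airflow.hooks.subprocess import SubprocessHook

hook = SubprocessHook()
result = hook.run_command(
    command=["bash", "-c", "echo $GREETING"],
    env={"GREETING": "hello"},  # omitted -> os.environ is used, as described above
    cwd=None,                   # None -> the command runs in a temporary directory
)
print(result.exit_code, result.output)
```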
2 changes: 1 addition & 1 deletion airflow/providers/cncf/kubernetes/CHANGELOG.rst
@@ -374,7 +374,7 @@ Notes on changes KubernetesPodOperator and PodLauncher
Overview
''''''''

Generally speaking if you did not subclass ``KubernetesPodOperator`` and you didn't use the ``PodLauncher`` class directly,
Generally speaking if you did not subclass ``KubernetesPodOperator`` and you did not use the ``PodLauncher`` class directly,
then you don't need to worry about this change. If however you have subclassed ``KubernetesPodOperator``, what
follows are some notes on the changes in this release.

4 changes: 2 additions & 2 deletions docs/apache-airflow-providers-cncf-kubernetes/operators.rst
@@ -17,7 +17,7 @@



.. _howto/operator:KubernetesPodOperator:
.. _howto/operator:kubernetespodoperator:

KubernetesPodOperator
=====================
@@ -32,7 +32,7 @@ you to create and run Pods on a Kubernetes cluster.
simplifies the Kubernetes authorization process.
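A hedged example of the operator in a DAG file; the image, namespace, and command are placeholders, and the import path reflects the cncf-kubernetes provider as of this release:

```python
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

# attach to a DAG as usual
run_in_pod = KubernetesPodOperator(
    task_id="run_in_pod",
    name="example-pod",
    namespace="default",
    image="python:3.10-slim",
    cmds=["python", "-c"],
    arguments=["print('hello from a pod')"],
    get_logs=True,                # stream pod logs into the task log
    is_delete_operator_pod=True,  # remove the pod once it finishes
)
```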

.. note::
The :doc:`Kubernetes executor <apache-airflow:executor/kubernetes>` is **not** required to use this operator.
The :doc:`Kubernetes executor <apache-airflow:core-concepts/executor/kubernetes>` is **not** required to use this operator.

How does this operator work?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -81,7 +81,7 @@ mechanism for records that arrive on these streams using a job status
polling mechanism. The success stream (i.e., stdout or shell output)
is handled differently, as explained in the following:

When :doc:`XComs <apache-airflow:concepts/xcoms>` are enabled and when
When :doc:`XComs <apache-airflow:core-concepts/xcoms>` are enabled and when
the operator is used with a native PowerShell cmdlet or script, the
shell output is converted to JSON using the ``ConvertTo-Json`` cmdlet
and then decoded on the client-side by the operator such that the
@@ -26,7 +26,7 @@ capabilities. You can read more about those in
`FAB security docs <https://flask-appbuilder.readthedocs.io/en/latest/security.html>`_.

You can also
take a look at Auth backends available in the core Airflow in :doc:`apache-airflow:security/webserver`
take a look at Auth backends available in the core Airflow in :doc:`apache-airflow:administration-and-deployment/security/webserver`
or see those provided by the community-managed providers:

.. airflow-auth-backends::
@@ -27,7 +27,7 @@ connection, when the connection is managed via Airflow UI. Those connections als
that can be used to automatically create Airflow Hooks for specific connection types.

The connection management is explained in
:doc:`apache-airflow:concepts/connections` and you can also see those
:doc:`apache-airflow:authoring-and-scheduling/connections` and you can also see those
provided by the community-managed providers:

.. airflow-connections::
2 changes: 1 addition & 1 deletion docs/apache-airflow-providers/core-extensions/logging.rst
@@ -20,7 +20,7 @@ Writing logs

This is a summary of all Apache Airflow Community provided implementations of writing task logs
exposed via community-managed providers. You can also see logging options available in the core Airflow in
:doc:`apache-airflow:logging-monitoring/logging-tasks` and here you can see those
:doc:`apache-airflow:administration-and-deployment/logging-monitoring/logging-tasks` and here you can see those
provided by the community-managed providers:

.. airflow-logging::
@@ -28,7 +28,7 @@ via providers that implement secrets backends for services Airflow integrates wi

You can also take a
look at Secret backends available in the core Airflow in
:doc:`apache-airflow:security/secrets/secrets-backend/index` and here you can see the ones
:doc:`apache-airflow:administration-and-deployment/security/secrets/secrets-backend/index` and here you can see the ones
provided by the community-managed providers:

.. airflow-secrets-backends::
@@ -25,7 +25,7 @@ In order to make Airflow Webserver stateless, Airflow >=1.10.7 supports
DAG Serialization and DB Persistence. From Airflow 2.0.0, the Scheduler
also uses Serialized DAGs for consistency and makes scheduling decisions.

.. image:: img/dag_serialization.png
.. image:: ../img/dag_serialization.png

Without DAG Serialization & persistence in DB, the Webserver and the Scheduler both
need access to the DAG files. Both the Scheduler and Webserver parse the DAG files.
37 changes: 37 additions & 0 deletions docs/apache-airflow/administration-and-deployment/index.rst
@@ -0,0 +1,37 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

.. http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

Administration and Deployment
=====================================

This section contains information about deploying DAGs into production and the administration of Airflow deployments.

.. toctree::
:maxdepth: 2

production-deployment
security/index
logging-monitoring/index
kubernetes
lineage
listeners
dag-serialization
modules_management
scheduler
pools
cluster-policies
priority-weight
@@ -32,13 +32,13 @@ We maintain :doc:`official Helm chart <helm-chart:index>` for Airflow that helps
Kubernetes Executor
^^^^^^^^^^^^^^^^^^^

The :doc:`Kubernetes Executor <executor/kubernetes>` allows you to run all the Airflow tasks on
The :doc:`Kubernetes Executor </core-concepts/executor/kubernetes>` allows you to run all the Airflow tasks on
Kubernetes as separate Pods.

KubernetesPodOperator
^^^^^^^^^^^^^^^^^^^^^

The :ref:`KubernetesPodOperator <howto/operator:KubernetesPodOperator>` allows you to create
The :ref:`KubernetesPodOperator <howto/operator:kubernetespodoperator>` allows you to create
Pods on Kubernetes.

Pod Mutation Hook
@@ -49,7 +49,7 @@ Their specification is defined as ``hookspec`` in ``airflow/listeners/spec`` dir
Your implementation needs to accept the same named parameters as defined in hookspec, or Pluggy will complain about your plugin.
On the other hand, you don't need to implement every method - it's perfectly fine to have a listener that implements just one method, or any subset of methods.

To include listener in your Airflow installation, include it as a part of an :doc:`Airflow Plugin </plugins>`
To include listener in your Airflow installation, include it as a part of an :doc:`Airflow Plugin </authoring-and-scheduling/plugins>`
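A hedged, single-method listener registered through a plugin; the class and plugin names are illustrative, and the hook signature is assumed to follow the ``airflow/listeners/spec`` definitions mentioned above:

```python
from airflow.listeners import hookimpl
from airflow.plugins_manager import AirflowPlugin


class EchoListener:
    @hookimpl
    def on_task_instance_success(self, previous_state, task_instance, session):
        # implementing only this one hookspec method is perfectly fine
        print(f"{task_instance.task_id} succeeded")


class EchoListenerPlugin(AirflowPlugin):
    name = "echo_listener_plugin"
    listeners = [EchoListener()]
```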

Listener API is meant to be called across all dags, and all operators - in contrast to methods like
``on_success_callback``, ``pre_execute`` and related family which are meant to provide callbacks
@@ -26,7 +26,7 @@ For example, you may wish to alert when certain tasks have failed, or have the l
.. note::

Callback functions are only invoked when the task state changes due to execution by a worker.
As such, task changes set by the command line interface (:doc:`CLI <../usage-cli>`) or user interface (:doc:`UI <../ui>`) do not
As such, task changes set by the command line interface (:doc:`CLI <../../howto/usage-cli>`) or user interface (:doc:`UI <../../ui>`) do not
execute callback functions.
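For illustration, a hedged task-level callback; the notification function is a stand-in for whatever alerting you actually use:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator


def notify_failure(context):
    ti = context["task_instance"]
    print(f"Task {ti.task_id} failed in DAG {ti.dag_id}")


with DAG(dag_id="callback_example", start_date=datetime(2022, 12, 1), schedule=None):
    BashOperator(
        task_id="may_fail",
        bash_command="exit 1",
        on_failure_callback=notify_failure,  # invoked by the worker, not by CLI/UI state changes
    )
```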

Callback Types
@@ -22,7 +22,7 @@ Logging and Monitoring architecture

Airflow supports a variety of logging and monitoring mechanisms as shown below.

.. image:: ../img/arch-diag-logging.png
.. image:: ../../img/arch-diag-logging.png

By default, Airflow supports logging into the local file system. These include logs from the Web server, the Scheduler, and the Workers running tasks. This is suitable for development environments and for quick debugging.

@@ -33,9 +33,9 @@ The logging settings and options can be specified in the Airflow Configuration f
For production deployments, we recommend using FluentD to capture logs and send it to destinations such as ElasticSearch or Splunk.

.. note::
For more information on configuring logging, see :doc:`/logging-monitoring/logging-tasks`
For more information on configuring logging, see :doc:`/administration-and-deployment/logging-monitoring/logging-tasks`

Similarly, we recommend using StatsD for gathering metrics from Airflow and send them to destinations such as Prometheus.

.. note::
For more information on configuring metrics, see :doc:`/logging-monitoring/metrics`
For more information on configuring metrics, see :doc:`/administration-and-deployment/logging-monitoring/metrics`
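A hedged ``airflow.cfg`` sketch for the StatsD setup recommended above; host, port, and prefix are placeholders:

```ini
[metrics]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
```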
@@ -36,4 +36,4 @@ Edit ``airflow.cfg`` and set the ``webserver`` block to have an ``analytics_tool
variables are set in ``airflow/www/templates/app.py``.

.. note::
For more information on setting the configuration, see :doc:`../howto/set-config`
For more information on setting the configuration, see :doc:`../../howto/set-config`
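A hedged sketch of the ``webserver`` block described above; the tool name and id are placeholders:

```ini
[webserver]
analytics_tool = google_analytics
analytics_id = UA-XXXXXXXXX-1
```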
@@ -326,7 +326,7 @@ you might organize your versioning approach, control which versions of the share
and deploy the code to all your instances and containers in controlled way - all by system admins/DevOps
rather than by the DAG writers. It is usually suitable when you have a separate team that manages this
shared code, but if you know your python ways you can also distribute your code this way in smaller
deployments. You can also install your :doc:`/plugins` and :doc:`apache-airflow-providers:index` as python
deployments. You can also install your :doc:`../authoring-and-scheduling/plugins` and :doc:`apache-airflow-providers:index` as python
packages, so learning how to build your package is handy.

Here is how to create your package:
@@ -27,7 +27,7 @@ Database backend
Airflow comes with an ``SQLite`` backend by default. This allows the user to run Airflow without any external
database. However, such a setup is meant to be used for testing purposes only; running the default setup
in production can lead to data loss in multiple scenarios. If you want to run production-grade Airflow,
make sure you :doc:`configure the backend <howto/set-up-database>` to be an external database
make sure you :doc:`configure the backend <../howto/set-up-database>` to be an external database
such as PostgreSQL or MySQL.

You can change the backend using the following config
@@ -60,8 +60,8 @@ Airflow uses :class:`~airflow.executors.sequential_executor.SequentialExecutor`
nature, the user is limited to executing at most one task at a time. ``Sequential Executor`` also pauses
the scheduler when it runs a task, hence it is not recommended in a production setup. You should use the
:class:`~airflow.executors.local_executor.LocalExecutor` for a single machine.
For a multi-node setup, you should use the :doc:`Kubernetes executor <../executor/kubernetes>` or
the :doc:`Celery executor <../executor/celery>`.
For a multi-node setup, you should use the :doc:`Kubernetes executor <../core-concepts/executor/kubernetes>` or
the :doc:`Celery executor <../core-concepts/executor/celery>`.
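A hedged ``airflow.cfg`` sketch of the executor choice discussed above; pick the one value that matches your deployment:

```ini
[core]
executor = LocalExecutor
# executor = CeleryExecutor      # multi-node with Celery workers
# executor = KubernetesExecutor  # multi-node on Kubernetes
```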


Once you have configured the executor, it is necessary to make sure that every node in the cluster contains
@@ -111,7 +111,7 @@ Airflow users occasionally report instances of the scheduler hanging without a t
* `Scheduler gets stuck without a trace <https://github.com/apache/airflow/issues/7935>`_
* `Scheduler stopping frequently <https://github.com/apache/airflow/issues/13243>`_

To mitigate these issues, make sure you have a :doc:`health check </logging-monitoring/check-health>` set up that will detect when your scheduler has not heartbeat in a while.
To mitigate these issues, make sure you have a :doc:`health check <logging-monitoring/check-health>` set up that will detect when your scheduler has not heartbeat in a while.
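One hedged way to wire such a health check, assuming the ``airflow jobs check`` CLI is available in your deployment:

```bash
# exits non-zero when no recent SchedulerJob heartbeat is found for this host
airflow jobs check --job-type SchedulerJob --hostname "$(hostname)"
```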

.. _docker_image:

@@ -154,7 +154,7 @@ the side-car container and read by the worker container.
This concept is implemented in :doc:`the Helm Chart for Apache Airflow <helm-chart:index>`.


.. spelling::
.. spelling:word-list::

pypirc
dockerignore
@@ -222,7 +222,7 @@ you can exchange the Google Cloud Platform identity to the Amazon Web Service id
which effectively means access to Amazon Web Service platform.
For more information, see: :ref:`howto/connection:aws:gcp-federation`

.. spelling::
.. spelling:word-list::

nsswitch
cryptographic
@@ -31,7 +31,7 @@ Airflow production environment. To kick it off, all you need to do is
execute the ``airflow scheduler`` command. It uses the configuration specified in
``airflow.cfg``.

The scheduler uses the configured :doc:`Executor </executor/index>` to run tasks that are ready.
The scheduler uses the configured :doc:`Executor <../core-concepts/executor/index>` to run tasks that are ready.

To start a scheduler, simply run the command:

Expand All @@ -44,7 +44,7 @@ Your DAGs will start executing once the scheduler is running successfully.
.. note::

The first DAG Run is created based on the minimum ``start_date`` for the tasks in your DAG.
Subsequent DAG Runs are created according to your DAG's :doc:`timetable </concepts/timetable>`.
Subsequent DAG Runs are created according to your DAG's :doc:`timetable <../authoring-and-scheduling/timetable>`.


For dags with a cron or timedelta schedule, scheduler won't trigger your tasks until the period it covers has ended e.g., A job with ``schedule`` set as ``@daily`` runs after the day
Expand All @@ -57,15 +57,15 @@ In the UI, it appears as if Airflow is running your tasks a day **late**

**Let's Repeat That**, the scheduler runs your job one ``schedule`` AFTER the start date, at the END of the interval.

You should refer to :doc:`/dag-run` for details on scheduling a DAG.
You should refer to :doc:`../core-concepts/dag-run` for details on scheduling a DAG.
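A hedged illustration of that interval behaviour; with the DAG below, the run covering 2022-12-01 only starts shortly after 2022-12-02 00:00 UTC:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG(
    dag_id="daily_example",
    start_date=datetime(2022, 12, 1),
    schedule="@daily",
    catchup=False,
):
    EmptyOperator(task_id="noop")
```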

DAG File Processing
-------------------

You can have the Airflow Scheduler be responsible for starting the process that turns the Python files contained in the DAGs folder into DAG objects
that contain tasks to be scheduled.

Refer to :doc:`dagfile-processing` for details on how this can be achieved
Refer to :doc:`../authoring-and-scheduling/dagfile-processing` for details on how this can be achieved


Triggering DAG with Future Date
@@ -22,7 +22,7 @@ Access Control of Airflow Webserver UI is handled by Flask AppBuilder (FAB).
Please read its related `security document <http://flask-appbuilder.readthedocs.io/en/latest/security.html>`_
regarding its security model.

.. spelling::
.. spelling:word-list::
clearTaskInstances
dagRuns
dagSources
@@ -23,8 +23,8 @@ This guide provides ways to protect this data.

The following are particularly protected:

* Variables. See the :doc:`Variables Concepts </concepts/variables>` documentation for more information.
* Connections. See the :doc:`Connections Concepts </concepts/connections>` documentation for more information.
* Variables. See the :doc:`Variables Concepts </core-concepts/variables>` documentation for more information.
* Connections. See the :doc:`Connections Concepts </authoring-and-scheduling/connections>` documentation for more information.


.. toctree::
@@ -31,7 +31,7 @@ can also enable alternative secrets backend to retrieve Airflow connections or A
If you use an alternative secrets backend, check inside your backend to view the values of your variables and connections.

You can also get Airflow configurations with sensitive data from the Secrets Backend.
See :doc:`../../../howto/set-config` for more details.
See :doc:`/howto/set-config` for more details.
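A hedged ``airflow.cfg`` sketch of enabling an alternative secrets backend; the Vault backend and its kwargs are just one possible choice and the URL is a placeholder:

```ini
[secrets]
backend = airflow.providers.hashicorp.secrets.vault.VaultBackend
backend_kwargs = {"connections_path": "connections", "url": "http://127.0.0.1:8200"}
```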

Search path
^^^^^^^^^^^