Solves Airflow DAG management in bare Git repos, loading correct DAGs upon branch switch.

lu0/git-worktree-airflow

git-worktree-airflow

This repository contains a post-checkout hook script for managing an Airflow dags_folder that points to a bare repository.

The hook runs after a successful git checkout. It creates an .airflowignore file in the root directory of the bare repository, listing every file and directory except the worktree directory of the last checked-out tree-ish (branch, tag or commit). As a result, the Airflow UI shows only the DAGs contained in that directory.
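The behaviour described above can be sketched in a few lines of shell. This is a minimal illustration, not the hook's actual implementation: the function name write_airflowignore and its two arguments are hypothetical, and it assumes the worktree directory is given relative to the bare repository root.

```shell
# Hypothetical sketch; NOT the script shipped in this repository.
# Usage: write_airflowignore <bare_repo_root> <worktree_path_relative_to_root>
write_airflowignore() {
    local root="$1" worktree="$2"
    # Keep only the first path component, e.g. "feature" for "feature/dag_4".
    local keep="${worktree%%/*}"
    # List every entry in the root except the kept directory and the
    # ignore file itself, then write the sorted result to .airflowignore.
    ( cd "$root" && ls -A | grep -vxF -e "$keep" -e ".airflowignore" | sort > .airflowignore )
}
```

For example, `write_airflowignore ~/dags feature/dag_4` would leave the whole top-level feature directory out of the list. Excluding .airflowignore from its own contents is a choice made for this sketch only.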

Installation

Activate this post-checkout hook by adding it to your .pre-commit-config.yaml:

repos:
  - repo: https://github.com/lu0/git-worktree-airflow
    rev: v1.1.3
    hooks:
      - id: airflow-worktree
        name: Update .airflowignore to load DAGs from worktree
        stages: [post-checkout]
        always_run: true
        verbose: true

Then install it:

pre-commit install --hook-type post-checkout

Manual

  • Copy or link the script select-airflow-worktree.sh into the hooks directory of your dags_folder (which should be a bare repository), and rename it to post-checkout.

  • Example:

ln -srf select-airflow-worktree.sh /path/to/your/dags_folder/hooks/post-checkout

Note: If you copied the script instead of linking it, make it executable with chmod +x post-checkout.
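To confirm that a manual installation worked, a small check like the following may help. It is only a sketch: the function name hook_installed is hypothetical, and it assumes the hook lives at hooks/post-checkout inside the dags_folder, as in the example above.

```shell
# Hypothetical check; the function name is illustrative only.
# Usage: hook_installed <dags_folder>
hook_installed() {
    local hook="$1/hooks/post-checkout"
    # -x follows symlinks, so a linked script must itself be executable.
    if [ -x "$hook" ] && [ ! -d "$hook" ]; then
        echo "post-checkout hook is installed and executable"
    else
        echo "post-checkout hook is missing or not executable" >&2
        return 1
    fi
}
```

Call it as `hook_installed /path/to/your/dags_folder`; a non-zero exit status means the hook will not run.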

Usage

Recommended option: Using git-worktree-wrapper

  • Install git-worktree-wrapper.

  • Check out a tree-ish object. The hook is triggered automatically.

    $ git checkout <tree-ish>
    
        .airflowignore updated to load DAGs from <tree-ish>

Alternative option: Using vanilla git

  1. First, cd into the worktree directory of the tree-ish

    $ cd /path/to/the/root/directory/of/the/bare/repo
    $ cd <tree-ish>
    
  2. Then trigger the hook

    $ git checkout <tree-ish>
    
        .airflowignore updated to load DAGs from <tree-ish>
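The two vanilla git steps can be combined into one helper. This is a sketch under the assumption that each worktree directory is named after its tree-ish, relative to the bare repository root, as in the examples of this README; the function name checkout_worktree is hypothetical.

```shell
# Hypothetical convenience wrapper; not part of this repository.
# Usage: checkout_worktree <bare_repo_root> <tree-ish>
checkout_worktree() {
    local root="$1" treeish="$2"
    # Run in a subshell so the caller's working directory stays unchanged.
    # git checkout is what triggers the post-checkout hook.
    ( cd "$root/$treeish" && git checkout "$treeish" )
}
```

For example, `checkout_worktree ~/dags feature/dag_4` would enter the worktree, run the checkout, and return to the original directory.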

Examples

Say your dags_folder points to a bare repository in ~/dags with the following structure:

Note: Directories and files common to bare repositories are hidden.

. dags
├── development
│   ├── dag_1.py
│   └── dag_2.py
├── feature
│   └── dag_4
│       ├── dag_1.py
│       ├── dag_2.py
│       ├── dag_3.py
│       ├── dag_4.py
│       └── examples
│           ├── example_dag_1.py
│           └── example_dag_2.py
└── master
    ├── dag_1.py
    ├── dag_2.py
    ├── dag_3.py
    └── examples
        ├── example_dag_1.py
        └── example_dag_2.py

You want the Airflow UI to show the DAGs contained in the feature/dag_4 worktree.

  • If you are using git-worktree-wrapper and pre-commit, just check out the tree-ish. You can do this from any nested worktree.

    $ git checkout feature/dag_4

  • If you are using vanilla git:

    $ cd ~/dags
    $ cd feature/dag_4
    $ git checkout feature/dag_4
    
        .airflowignore updated to load DAGs from feature/dag_4
    

Either way, the post-checkout hook will create a file named .airflowignore with the following contents:

Note: Directories and files common to bare repositories are hidden.

development
master

If you trigger the hook from the development worktree instead, .airflowignore will look like this:

Note: Directories and files common to bare repositories are hidden.

feature
master
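As a quick sanity check, you can list which top-level entries are not covered by .airflowignore. The helper below is a hypothetical sketch: it treats each line of .airflowignore as a literal top-level name, which matches the examples above but is simpler than the pattern matching Airflow itself applies.

```shell
# Hypothetical helper; compares names literally, unlike Airflow's pattern matching.
# Usage: visible_entries <bare_repo_root>
visible_entries() {
    local root="$1"
    # Print every top-level entry that is neither listed in .airflowignore
    # nor the .airflowignore file itself.
    ( cd "$root" && ls -A | grep -vxF -e ".airflowignore" -f .airflowignore )
}
```

With the second .airflowignore shown above, `visible_entries ~/dags` would be expected to print only development.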