-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Production Helm chart support #8777
Conversation
I am also working on something similar, which is initially a rewrite of the stable/airflow chart. |
@schnie I have released v7.0.0 of the There definitely are a few changes from your chart which would improve things in Regardless, I think we should decouple the release of airflow from its Helm chart (and potentially keep it in a seperate repo). Also note, the official |
chart/values.yaml
Outdated
tag: ~ | ||
pullPolicy: IfNotPresent | ||
flower: | ||
repository: astronomerinc/ap-airflow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to use the official airflow image, is it just changing this image?
Or changes required across the chart?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point. @dimberman has some experience with this. I believe the intention is to follow up this PR with a better integration into the official image as well as Airflow 2.0. Any thoughts on this Daniel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@schnie I'd say let's merge what we have and I will then
- Create a PR making it compatible with the airflow 2.0 official image
- Create a backport making it compatible with the 1.10 image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't understand how the 'offical' helm chart for airflow could use anything but the offical Dockerfile.
For reference, we already use the offical Dockerfile at stable/airflow
@thesuperzapper There is an issue in the Helm chart. Furthermore, I think it should be easier to set it for the KubernetesExecutor as well. It depends too much of the CeleryExecutor whereas both are quite used now. Hope it helps :) |
@marclamberti, thanks for the heads up, you'll be glad to know that fixes for both issues you raised have made their way into v7.1.0 of the chart. |
@schnie, I can't see a section of this chart which will allow users to sync their DAGs from an external git repo. I think this is probably the single most common deployment config for airflow, you need to support it. |
@schnie, not all K8S clusters support Ingress that are not on the public internet, therefore not all companies be able to use this chart. You should add support for exposing the webserver/flower as LoadBalancer type Services, (and therefore using Airflow's inbuilt SSL support for HTTPS. |
Hi @schnie, can we help you somehow with moving this PR forward? 😉 |
Wouldn't be easier to start with the chart of @thesuperzapper ? In the sense that it already include a lot of features ? I'm pretty sure it is widely used too as it is available on the stable Helm charts repo |
I would definitely agree in terms of basing it from the chart that is in the stable Helm charts repo. There are some interesting components in the Astronomer and Bitnami charts that could make this chart really good. I think another good thing would be increasing the ease of getting going with the Kubernetes executor too. |
@marclamberti we get a lot of questions from users of the community chart who can't seem to get it to work/have all sorts of weird issues. Has this not been your experience? @streetmapp completely agreed. I'm working on backporting a fix that will allow users to apply arbitrary pod specs via yaml files on the KubernetesExecutor (hopefully in time for 1.10.11), and will then write much more documentation around that. |
@dimberman Well indeed, I got some issues too, one at this moment actually. I agreed with @streetmapp my point was, there are three charts right now, having some very interesting features that could be nice to merge in some way. Btw, I would be glad to help :) |
Great point, I meant to fix this before pushing. For now, I've removed the |
@schnie can you make the yamllint tests pass please? I think we can merge an initial PR just to get the code in and then start fixing it on master. |
The ingress components were already capable of being on/off were they not? Seems odd to remove them if that was the case. |
@dimberman, I honestly think we should consider putting any official helm charts in a separate repository, as there is little reason to pin their releases to airflow itself. |
@thesuperzapper The main reason is that we're going to make both helm charts and production image used in automated tests of airflow - this way they will both be automatically tested during the tests of airflow. We are basically going to "eat of our dog food" as the term goes - so we are going to use airlfow tests in master to test our Dockerfile and Helm chart. At the same time we use our Helm and Dockerfile to test airflow - it already happens for Dockerfile - we are running Kubernetes Executor tests using production image and we are going to switch to helm charts for that soon. Even now both Production image and Helm chart from master can be used to install older versions of Airflow - see https://github.com/apache/airflow/blob/master/IMAGES.rst - then you will how to use the current image from master to build any airflow 1.10* version of the image. You cam treat those Dockerfile and Charts as development versions - same as all other airflow code. Once this is more stable and proven, both production docker file and help chart will land in their dedicated official repositories (additionally to continuing development in master of Airflow). This is all described in https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-26+Production-ready+Airflow+Docker+Image+and+helm+chart and we are approximately half way from what we considered as Done Helm chart will be here: Doickefile will become an official docker image |
@potiuk I maintain the I think we (Apache Airflow) should have a helm repository, similar to how ArgoProj has their own helm repo, initially this can contain the existing stable/airflow chart. Please see here: https://github.com/helm/charts#deprecation-timeline |
Great. Then we will have to aadapt our plans and make whatever everyone will be doing if "official" helm chart will no longer be official. Regardless of that the Dockerfile an Helm Chart will be here to stay and once we stabilise it and make both image and helm robust we will find a good place for both I am sure. |
Looking at the yamllint errors, I think we have to make it not check anything under |
Thank you all for the comments and discussions. PRs are now welcome to make any changes to existing code or add new features. |
Big thanks to @schnie 🚀 |
Awesome! Time to use the official chart 🚀🚀 |
Fantastic! 🚀 🚀 . I am going to switch to the helm chart for our Kubernetes tests soon and iterate on the prod image @dimberman @schnie @kaxil @ashb -> I guess I am free to do so ? You are not working on this? |
Thanks for this. Was waiting for this eagerly. I tested it out quickly and I am getting below error using helm 2. Am I doing something wrong? I just want to get the yaml files and not install via helm.
|
You will need to run |
What's the repo URL to be used for this chart? The one listed in the Readme points to the stable Helm repo so if you are following that, you end up pulling down the Airflow chart from there and not this one. Also of note, later in the Readme the Astronomer repo is listed in the Walkthrough using kind section. |
We haven't published this chart yet, PRs are welcome to improve the chart including documentation and we would let everyone know once we publish it. |
Is the I am worried that we will end up clogging master's commit history, and confusing users about the maturity/availability of the chart feature. |
Yes. it is good the @thesuperzapper . IMHO Monorepo with all stuff in is one of the best ideas for development for such project size. We are going to use it for our kubernetes tests and we already have unit tests in progress #9371 and we are going to develop it at the same time as developing production image and airflow code itself. We are definitely going to release - both images and helm charts in an "official" way - the same way as all the code is released for Apache Airflow. Note that there is a very formal process to follow to release any code for Apache Airflow once the code becomes "community managed" http://www.apache.org/legal/release-policy.html - this includes voting by PMCs and at least 72 hours of voting period. So we will likely be releasing Helm chart together with official releases of Apache Airlfow (same with Docker Image). Everything in master is in "development" for Apache Airflow. |
I have some questions:
|
@Siddharthk , for
so that it runs as the correct airflow user, instead of root Have added this to the config map in a PR |
@schnie see a few questions from @Siddharthk (was chatting w/ him in Airflow Slack) |
@Siddharthk ,
2/ I don't think there is a recommended way to store logs as it depends on your environment and requirements. For instance, I store my logs in S3. S3 is cheap and reliable. I can process the logs later with other tools to make analysis if needed and I can archive them at a defined time. 3 3/ About your error, I got it once with the "unofficial" chart. Sadly I don't remember how I fixed it. Check with kubectl describe your pod if you don't get additional information and keep your workers after completion so that you can debug them. Hope it helps :) |
Thanks @aneesh-joseph @marclamberti I have made webserver as HA for now and configured S3 for logs. For python module error, I have created below env for now which worked:
Another issue I am facing is mounting secrets to task pods. Basically I am trying to use ssh_key file for ssh operator. I added additional volumes and volume mounts for the scheduler. Will this reflect on the task pods as well?
|
Well I did that too actually but it was with the unofficial helm chart. Are you using the official one here? |
Yes, using the official one. |
No, I don't think it will mount them automatically to the worker pods created by the KubernetesExecutor. If this is for a specific task, you could probably specify the |
Co-authored-by: Ash Berlin-Taylor <[email protected]>
Co-authored-by: Ash Berlin-Taylor <[email protected]> (cherry picked from commit 66e7382)
Co-authored-by: Ash Berlin-Taylor <[email protected]> (cherry picked from commit 66e7382)
Co-authored-by: Ash Berlin-Taylor <[email protected]> (cherry picked from commit 66e7382)
Co-authored-by: Ash Berlin-Taylor <[email protected]> (cherry picked from commit 66e7382)
This PR adds a default helm chart for Airflow. This chart is based on the one we use at Astronomer to manage hundreds of production deployments.
The chart has been cleaned up to remove any references to the Astronomer platform or anything specific to the surrounding infrastructure.
The
bin
directory contains some scripts to help run this chart locally or in CI using kind, as well as packaging and releasing it to a helm repository. The scripts reference the Astronomer helm repo right now, but I figured we could use it as a starting point and tweak these scripts, or remove them completely to fit into how we want to manage this going forward. Opening this as a draft PR for now to start some discussion.Make sure to mark the boxes below before creating PR: [x]
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.