Flowetl refactoring & bug fixes #1380
…ause they are fixed anyway.
… with schedule_interval
…ally pass the concurrency setting on to the DAG.
Codecov Report
Coverage Diff

| | master | #1380 | +/- |
|---|---|---|---|
| Coverage | 95.01% | 95.02% | +<.01% |
| Files | 157 | 157 | |
| Lines | 7548 | 7551 | +3 |
| Branches | 704 | 704 | |
| Hits | 7172 | 7175 | +3 |
| Misses | 268 | 268 | |
| Partials | 108 | 108 | |
👍
Closes #1375, closes #1376, closes #1377, closes #1378, closes #1379.
Description
- The `default_args` section has been removed from the FlowETL `config.yml`. Similarly, we don't pass in a `default_args` dict to `construct_etl_dag()` any more - instead, the values for `owner` and `start_date` are now hard-coded inside that function because they don't need to change.
- We now create DAGs only for those CDR types that are present in `config.yml`.
- The config is now always validated and default values filled in when reading the `config.yml` file.
- The `concurrency` setting from `config.yml` is now passed on to the DAG during creation, using the `max_active_runs_per_dag` parameter. Note that Airflow has a plethora of concurrency/parallelism settings, which can be a bit confusing (see discussions here, here, here and here, among other places), but `max_active_runs_per_dag` seems to be the right setting for this purpose.
- The FlowETL deployment example has been updated so that it works when simply following the instructions.
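
For reviewers who want a picture of how these changes fit together, here is a minimal sketch. It is not the code in this PR. The helper names (`validate_config`, `dag_kwargs_for`), the `etl` top-level key, the list of CDR types, the default concurrency, and the hard-coded `owner` and `start_date` values are all assumptions made for illustration. The sketch also assumes the concurrency value ends up in the DAG's `max_active_runs` argument, which is the per-DAG counterpart of Airflow's `max_active_runs_per_dag` config option. It avoids importing Airflow so it can be run on its own.

```python
from datetime import datetime

# All names and values below are hypothetical; the real FlowETL schema may differ.
KNOWN_CDR_TYPES = ("calls", "sms", "mds", "topups")  # assumed set of CDR types
DEFAULT_CONCURRENCY = 4  # assumed default, not taken from FlowETL


def validate_config(raw_config):
    """Return a validated copy of the config with default values filled in."""
    etl = raw_config.get("etl")
    if not isinstance(etl, dict) or not etl:
        raise ValueError("config must contain a non-empty 'etl' mapping")
    validated = {}
    for cdr_type, settings in etl.items():
        if cdr_type not in KNOWN_CDR_TYPES:
            raise ValueError(f"unknown CDR type: {cdr_type!r}")
        settings = dict(settings or {})  # an empty YAML entry loads as None
        settings.setdefault("concurrency", DEFAULT_CONCURRENCY)
        concurrency = settings["concurrency"]
        if not isinstance(concurrency, int) or concurrency < 1:
            raise ValueError(f"concurrency for {cdr_type!r} must be a positive integer")
        validated[cdr_type] = settings
    return {"etl": validated}


def dag_kwargs_for(cdr_type, settings):
    """Build the keyword arguments for one per-CDR-type DAG."""
    return {
        "dag_id": f"etl_{cdr_type}",
        # Fixed inside the function instead of being read from the config file.
        "default_args": {"owner": "flowminder", "start_date": datetime(1900, 1, 1)},
        # The per-type concurrency setting limits concurrent runs of this DAG.
        "max_active_runs": settings["concurrency"],
    }


# Only the CDR types present in the config produce a DAG definition.
config = validate_config({"etl": {"calls": {"concurrency": 2}, "sms": None}})
all_kwargs = [dag_kwargs_for(t, s) for t, s in config["etl"].items()]
```

In a real DAG definition file, each dict in `all_kwargs` would be unpacked into the Airflow `DAG` constructor. Keeping validation separate from DAG construction means a malformed config fails early, before any DAG object is created.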