Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-542] [Bug] hook execution ordering not consistent with documentation #5133

Closed
1 task done
JohnMav opened this issue Apr 21, 2022 · 7 comments
Closed
1 task done
Labels
bug Something isn't working

Comments

@JohnMav
Copy link

JohnMav commented Apr 21, 2022

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

post-hooks defined in the dbt_project.yml are executing prior to post-hooks defined within a model. This is inconsistent with documentation on pre-hook/post-hook execution ordering which states:

If multiple instances of any hooks are defined, dbt will run each hook using the following ordering:
1. Hooks from dependent packages will be run before hooks in the active package.
2. Hooks defined within the model itself will be run before hooks defined in dbt_project.yml.
3. Hooks within a given context will be run in the order in which they are defined.

Expected Behavior

post-hooks defined in dbt_project.yml run after hooks defined in the model itself as described in the pre-hook/post-hook documentation

Steps To Reproduce

  1. Define a post-hook on a set of models from within the dbt_project.yml e.g.
models:
  jaffle_shop:    
    +post-hook: '{{ log("[Dry-Run] - RUNNING POST HOOK FROM DBT_PROJECT.YML") }}'
  1. Define post-hook within a model e.g. models/jaffle_shop/shop_orders.sql
{{ config(
    post_hook='{{ log("[Dry-Run] - RUNNING POST HOOK FROM MODEL CONFIG") }}'
) }}

select *
from {{ ref('raw_orders') }} 
  1. run the model e.g. dbt run -s shop_orders
  2. Inspect the dbt run logs i.e. logs/dbt.log
21:00:28.818291 [debug] [Thread-1  ]: Opening a new connection, currently in state closed
21:00:29.403886 [debug] [Thread-1  ]: SQL status: SUCCESS 1 in 0.59 seconds
21:00:29.406615 [debug] [Thread-1  ]: [Dry-Run] - RUNNING POST HOOK FROM DBT_PROJECT.YML
21:00:29.408061 [debug] [Thread-1  ]: [Dry-Run] - RUNNING POST HOOK FROM MODEL CONFIG
21:00:29.415241 [debug] [Thread-1  ]: finished collecting timing info
21:00:29.415492 [debug] [Thread-1  ]: On model.jaffle_shop.shop_orders: Close
21:00:29.529517 [info ] [Thread-1  ]: 1 of 1 OK created view model test_db.shop_orders............................. [�[32mSUCCESS 1�[0m in 0.75s]
21:00:29.530103 [debug] [Thread-1  ]: Finished running node model.jaffle_shop.shop_orders
21:00:29.531379 [debug] [MainThread]: Acquiring new snowflake connection "master"
21:00:29.531864 [info ] [MainThread]: 
21:00:29.532288 [info ] [MainThread]: Finished running 1 view model in 2.10s.
21:00:29.532711 [debug] [MainThread]: Connection 'master' was properly closed.
21:00:29.532908 [debug] [MainThread]: Connection 'model.jaffle_shop.shop_orders' was properly closed.
21:00:29.538959 [info ] [MainThread]: 
21:00:29.539393 [info ] [MainThread]: �[32mCompleted successfully�[0m
21:00:29.539798 [info ] [MainThread]: 
21:00:29.540223 [info ] [MainThread]: Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

We are noticing this behavior occurs in our snowflake warehouse as well. We have post-hook macros that perform operations in snowflake and the execution order remains consistent with the above example where post-hooks defined in the dbt_project.yml are executing prior to the post-hooks defined directly in the model. This is causing problems since we rely on execution ordering for specific role grant/revoke operations.

Relevant log output

20:42:14.715413 [debug] [Thread-1  ]: [Dry-Run] - RUNNING POST HOOK FROM DBT_PROJECT.YML
20:42:14.716872 [debug] [Thread-1  ]: [Dry-Run] - RUNNING POST HOOK FROM MODEL CONFIG

Environment

- OS: macOS Monterrey v12.3.1
- Python: 3.9.12
- dbt: 1.0.3

What database are you using dbt with?

snowflake

Additional Context

There is a similar issue regarding on-run-start / on-run-end execution ordering in #4785 , this is possibly related to that issue as well?

@JohnMav JohnMav added bug Something isn't working triage labels Apr 21, 2022
@github-actions github-actions bot changed the title [Bug] hook execution ordering not consistent with documentation [CT-542] [Bug] hook execution ordering not consistent with documentation Apr 21, 2022
@jtcohen6
Copy link
Contributor

jtcohen6 commented Apr 22, 2022

Thanks @JohnMav, someone from our side will take a look.

This is causing problems since we rely on execution ordering for specific role grant/revoke operations.

I'd also be curious to hear more about how you're using grants for more complex grant management, given that we're thinking about adding out-of-the-box support for grants in materializations soon: #5090

@ChenyuLInx
Copy link
Contributor

@JohnMav Thanks for opening it and providing detailed reproduction steps!!! I was able to reproduce the issue and looking into it now.
Is this a behavior that changed from matching the document to reversed order? Or it has always been like this?

@ChenyuLInx
Copy link
Contributor

Just checked that this behavior has been there since 0.21, @JohnMav sorry about the confusion with the document! We are going to update the docs!

@JohnMav
Copy link
Author

JohnMav commented Apr 26, 2022

Thank you @ChenyuLInx and thank you for determining when that issue came up!

@JohnMav
Copy link
Author

JohnMav commented Apr 27, 2022

@ChenyuLInx I do have one additional concern and happy to open this as a different issue, but this hook execution ordering isn't very consistent with the documentation about configurations in general. https://docs.getdbt.com/reference/configs-and-properties#config-inheritance

Most notably

Configurations are prioritized in order of specificity, which is generally the order above: an in-file config() block takes precedence over properties defied in a .yml file, which takes precedence over a config defined in the project file.

Since hook execution is reversed from this general pattern it may be worthwhile to call out in the Config Inheritance and/or combining configs sections on the above page.

Either that or making hook execution ordering consistent with other configurations throughout the project.

@jtcohen6
Copy link
Contributor

@JohnMav Heard! From my perspective, that is consistent insofar as the most specific configs are applied last.

For configs that are "clobbered" when merged (materialized, schema), that means the in-model config "wins," since it's the last one applied.

For configs that are "appended" when merged (tags, post_hook), the project-level configs are applied first, then the in-model configs after. That determines the order in which the hooks run as well.

@jtcohen6 jtcohen6 removed the triage label Apr 27, 2022
@JohnMav
Copy link
Author

JohnMav commented Apr 27, 2022

Thanks @jtcohen6! That does make sense, would you be able to add that clarification in the docs as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants