on-run-start and on-run-end should be fully ignored by tasks that don't run hooks #3530

jtcohen6 · 2021-07-01T16:58:11Z

Describe the bug

on-run-start and on-run-end project hooks that include Jinja operations (calling macros, logging, etc.) will still perform those operations, as part of execute-time compilation, when invoked by tasks like compile and docs generate. Even though those tasks don't go so far as to run the compiled SQL, this leads to confusion and unintended consequences for some dbt users.

Steps To Reproduce

-- macros/placeholder.sql
{% macro select_1_as_id() %}
  select 1 as id
{% endmacro %}

# dbt_project.yml
on-run-start:
  - "{{ log('Help! I am being execute-time compiled!', info = true) if execute }}"
  - "{{ select_1_as_id() }}"

on-run-end:
  - "{{ log('Help! I am being execute-time compiled _again_!', info = true) if execute }}"
  - "{{ select_1_as_id() }}"

dbt run compiles and executes these project hooks, exactly as it should:

12:52:46 | Running 2 on-run-start hooks
Help! I am being execute-time compiled!
12:52:46 | 1 of 2 START hook: my_project.on-run-start.0......................... [RUN]
12:52:46 | 1 of 2 OK hook: my_project.on-run-start.0............................ [OK in 0.00s]
12:52:46 | 2 of 2 START hook: my_project.on-run-start.1......................... [RUN]
12:52:46 | 2 of 2 OK hook: my_project.on-run-start.1............................ [SELECT 1 in 0.00s]
12:52:46 |
12:52:46 | Concurrency: 1 threads (target='dev')
12:52:46 |
12:52:46 | 1 of 1 START view model dbt_jcohen.my_model.......................... [RUN]
12:52:46 | 1 of 1 OK created view model dbt_jcohen.my_model..................... [CREATE VIEW in 0.10s]
12:52:46 |
12:52:46 | Running 2 on-run-end hooks
Help! I am being execute-time compiled _again_!
12:52:46 | 1 of 2 START hook: my_project.on-run-end.0........................... [RUN]
12:52:46 | 1 of 2 OK hook: my_project.on-run-end.0.............................. [OK in 0.00s]
12:52:46 | 2 of 2 START hook: my_project.on-run-end.1........................... [RUN]
12:52:46 | 2 of 2 OK hook: my_project.on-run-end.1.............................. [SELECT 1 in 0.00s]
12:52:46 |
12:52:46 |
12:52:46 | Finished running 1 view model, 4 hooks in 6.03s.

dbt test ignores them entirely, as it's currently designed to do:

12:52:57 | Concurrency: 1 threads (target='dev')
12:52:57 |
12:52:57 | 1 of 1 START test not_null_my_model_id............................... [RUN]
12:52:57 | 1 of 1 PASS not_null_my_model_id..................................... [PASS in 0.05s]
12:52:57 |
12:52:57 | Finished running 1 test in 5.73s.

But dbt compile does something in between. It compiles the hooks to SQL (with execute: True), including any incidental effects, and not necessarily in order! It just doesn't run any of that compiled SQL against the database:

12:53:05 | Concurrency: 1 threads (target='dev')
12:53:05 |
Help! I am being execute-time compiled _again_!
Help! I am being execute-time compiled!
12:53:05 | Done.

It does, however, store that compiled SQL in manifest.json:

"operation.my_project.my_project-on-run-start-1": {
    "raw_sql": "{{ select_1_as_id() }}",
    "compiled": true,
    "resource_type": "operation",
    "compiled_sql": "\n  select 1 as id\n",
},

Well, technically, that's what dbt compile is meant to do—compile every compilable resource in the project.
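To check which project hooks ended up compiled in the manifest, something like the following might help. It is a minimal sketch, not part of dbt: the helper name `compiled_operations` is hypothetical, and it assumes the manifest keeps its nodes under a top-level `nodes` key with the `resource_type` and `compiled_sql` fields shown in the snippet above (field names may differ in other dbt versions).

```python
import json
from pathlib import Path


def compiled_operations(manifest_path):
    """Return {unique_id: compiled_sql} for every operation node in a manifest.

    Assumes the layout shown in this issue: a top-level "nodes" mapping whose
    entries carry "resource_type" and "compiled_sql".
    """
    manifest = json.loads(Path(manifest_path).read_text())
    return {
        unique_id: node.get("compiled_sql")
        for unique_id, node in manifest.get("nodes", {}).items()
        # Project hooks are stored with resource_type "operation".
        if node.get("resource_type") == "operation"
    }
```

After a dbt compile, calling `compiled_operations("target/manifest.json")` (assuming the default target directory) should list the on-run-start and on-run-end hooks alongside their rendered SQL.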

Expected behavior

Based on what's documented here, I'd expect on-run-start and on-run-end hooks to be fully ignored by any task that isn't run, seed, or snapshot. I think that would look like dbt compile and dbt docs generate ignoring operations (project hooks) by default, with an optional way (via node selection) to flip them back on.

Now, is it a good thing that there's some inconsistency across tasks around observing vs. ignoring project hooks? That's a good question, and we can think more about it over in #3463.

Alternative solutions

Documentation, documentation, documentation. We should make clear that all hooks (on-run-start, on-run-end, pre-hook, post-hook) are primarily intended to run SQL, rather than perform Jinja operations (a la run-operation).

So run/snapshot/seed will run any hook's SQL, without needing to be told. Other commands won't, but may still do incidental things related to Jinja-compiling that hook's SQL. The upshot of this: users should not use statements or run_query inside of hooks, since dbt already knows to run any compiled hook SQL against the database.
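One rough way to spot hooks that go against this advice is to scan the hook strings for those constructs. The sketch below is illustrative only: `hooks_that_run_their_own_sql` is a hypothetical helper, it does a plain text match on the hook definition, and it will not see run_query or statement calls buried inside macros that the hook invokes.

```python
import re

# Jinja constructs that execute SQL on their own: run_query(...) calls and
# {% call statement(...) %} blocks. This is a text heuristic, not a parser.
_SELF_EXECUTING = re.compile(r"\brun_query\s*\(|\{%-?\s*call\s+statement\b")


def hooks_that_run_their_own_sql(hooks):
    """Return the hook strings that call run_query or open a statement block."""
    return [hook for hook in hooks if _SELF_EXECUTING.search(hook)]
```

Passing in the on-run-start and on-run-end entries from dbt_project.yml as a list of strings returns the ones worth rewriting so that they simply render to SQL.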


jtcohen6 · 2021-07-01T17:04:00Z

Now that I've written all of that up, I do think the alternative solution may be the right solution here: hooks are compilable resources, just like models, and dbt compile and dbt docs generate should seek to render their compiled SQL.

We should make clear in the documentation that:

- on-run-* hooks should ultimately render to strings, which are then executed as SQL
- "pure Jinja" operations that don't want to run any SQL at all should be executed via run-operation instead

And we should work toward the proposal in #3463, giving users the option to filter based on the current invocation command, something like (say):

on-run-end:
  - "{{ do_my_fancy_jinja_thing() if flags.args.which in ('run', 'seed', 'snapshot') }}"

For the time being, I'm going to close this issue in favor of a docs.getdbt.com change. I found it a useful exercise. Maybe someone else will stumble across it and find it useful again in the future.

jtcohen6 added the bug Something isn't working label Jul 1, 2021

jtcohen6 closed this as completed Jul 1, 2021

jeremyyeo mentioned this issue Feb 25, 2022: [CT-285] [Bug] on-run-end hooks execute before on-run-start when the command is dbt docs generate #4785 (closed, 1 task)
