on-run-start and on-run-end should be fully ignored by tasks that don't run hooks #3530

Closed
jtcohen6 opened this issue Jul 1, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@jtcohen6
Contributor

jtcohen6 commented Jul 1, 2021

Describe the bug

on-run-start and on-run-end project hooks that include Jinja operations (calling macros, logging, etc.) still perform those operations, as part of execute-time compilation, when invoked by tasks like compile and docs generate. Even though those tasks don't go so far as to run the compiled SQL, this leads to confusion and unintended consequences for some dbt users.

Steps To Reproduce

-- macros/placeholder.sql
{% macro select_1_as_id() %}
  select 1 as id
{% endmacro %}

# dbt_project.yml
on-run-start:
  - "{{ log('Help! I am being execute-time compiled!', info = true) if execute }}"
  - "{{ select_1_as_id() }}"

on-run-end:
  - "{{ log('Help! I am being execute-time compiled _again_!', info = true) if execute }}"
  - "{{ select_1_as_id() }}"

dbt run compiles and executes these project hooks, exactly as it should:

12:52:46 | Running 2 on-run-start hooks
Help! I am being execute-time compiled!
12:52:46 | 1 of 2 START hook: my_project.on-run-start.0......................... [RUN]
12:52:46 | 1 of 2 OK hook: my_project.on-run-start.0............................ [OK in 0.00s]
12:52:46 | 2 of 2 START hook: my_project.on-run-start.1......................... [RUN]
12:52:46 | 2 of 2 OK hook: my_project.on-run-start.1............................ [SELECT 1 in 0.00s]
12:52:46 |
12:52:46 | Concurrency: 1 threads (target='dev')
12:52:46 |
12:52:46 | 1 of 1 START view model dbt_jcohen.my_model.......................... [RUN]
12:52:46 | 1 of 1 OK created view model dbt_jcohen.my_model..................... [CREATE VIEW in 0.10s]
12:52:46 |
12:52:46 | Running 2 on-run-end hooks
Help! I am being execute-time compiled _again_!
12:52:46 | 1 of 2 START hook: my_project.on-run-end.0........................... [RUN]
12:52:46 | 1 of 2 OK hook: my_project.on-run-end.0.............................. [OK in 0.00s]
12:52:46 | 2 of 2 START hook: my_project.on-run-end.1........................... [RUN]
12:52:46 | 2 of 2 OK hook: my_project.on-run-end.1.............................. [SELECT 1 in 0.00s]
12:52:46 |
12:52:46 |
12:52:46 | Finished running 1 view model, 4 hooks in 6.03s.

dbt test ignores them entirely, as it's currently designed to do:

12:52:57 | Concurrency: 1 threads (target='dev')
12:52:57 |
12:52:57 | 1 of 1 START test not_null_my_model_id............................... [RUN]
12:52:57 | 1 of 1 PASS not_null_my_model_id..................................... [PASS in 0.05s]
12:52:57 |
12:52:57 | Finished running 1 test in 5.73s.

But dbt compile does something in between. It compiles the hooks to SQL (with execute: True), including any incidental effects, and not necessarily in order! It just doesn't run any of that compiled SQL against the database:

12:53:05 | Concurrency: 1 threads (target='dev')
12:53:05 |
Help! I am being execute-time compiled _again_!
Help! I am being execute-time compiled!
12:53:05 | Done.

It does, however, store that compiled SQL in manifest.json:

"operation.my_project.my_project-on-run-start-1": {
    "raw_sql": "{{ select_1_as_id() }}",
    "compiled": true,
    "resource_type": "operation",
    "compiled_sql": "\n  select 1 as id\n",
},

Well, technically, that's what dbt compile is meant to do—compile every compilable resource in the project.
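
To check which project hooks ended up compiled into the manifest, something like the following rough Python sketch could help. It relies on the `resource_type` and `compiled_sql` fields shown in the excerpt above; the top-level `nodes` key and the `target/manifest.json` location are assumptions about the artifact layout, and `compiled_operations` is a hypothetical helper name, not part of dbt.

```python
import json
from pathlib import Path


def compiled_operations(manifest: dict) -> dict:
    """Map each operation (project hook) node's unique ID to its compiled SQL."""
    # Assumption: nodes live under a top-level "nodes" key, keyed by unique ID.
    nodes = manifest.get("nodes", {})
    return {
        unique_id: node.get("compiled_sql")
        for unique_id, node in nodes.items()
        if node.get("resource_type") == "operation"
    }


if __name__ == "__main__":
    # Assumption: dbt wrote its artifacts to a target/ directory in the project.
    path = Path("target/manifest.json")
    if path.exists():
        manifest = json.loads(path.read_text())
        for unique_id, sql in compiled_operations(manifest).items():
            print(unique_id, repr(sql))
```

Run after dbt compile, this would list each hook alongside whatever SQL it rendered to, which makes it easy to spot hooks that were compiled even though nothing was executed against the database.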

Expected behavior

Based on what's documented here, I'd expect on-run-start and on-run-end hooks to be fully ignored by any task that isn't run, seed, or snapshot. I think that would look like dbt compile and dbt docs generate ignoring operations (project hooks) by default, with an optional way (via node selection) to flip them back on.

Now, is it a good thing that there's some inconsistency across tasks around observing vs. ignoring project hooks? That's a good question, and we can think more about it over in #3463.

Alternative solutions

Documentation, documentation, documentation. We should make clear that all hooks (on-run-start, on-run-end, pre-hook, post-hook) are primarily intended to run SQL, rather than perform Jinja operations (a la run-operation).

So run/snapshot/seed will run any hook's SQL, without needing to be told. Other commands won't, but may still do incidental things related to Jinja-compiling that hook's SQL. The upshot of this: users should not use statements or run_query inside of hooks, since dbt already knows to run any compiled hook SQL against the database.

@jtcohen6 jtcohen6 added the bug Something isn't working label Jul 1, 2021
@jtcohen6
Contributor Author

jtcohen6 commented Jul 1, 2021

Now that I've written all of that up, I do think the alternative solution may be the right solution here: hooks are compilable resources, just like models, and dbt compile and dbt docs generate should seek to render their compiled SQL.

We should make clear in the documentation that:

  • on-run-* hooks should ultimately render to strings, which are then executed as SQL
  • "pure Jinja" operations that don't want to run any SQL at all should be executed via run-operation instead

And we should work toward the proposal in #3463, giving users the option to filter based on the current invocation command, something like (say):

on-run-end:
  - "{{ do_my_fancy_jinja_thing() if flags.args.which in ('run', 'seed', 'snapshot') }}"

For the time being, I'm going to close this issue in favor of a docs.getdbt.com change. I found it a useful exercise. Maybe someone else will stumble across it and find it useful again in the future.
