Respect value set for full_refresh in config #260

Closed
grindheim opened this issue Nov 25, 2021 · 1 comment · Fixed by #262
Labels
type:bug Something isn't working

Comments

@grindheim
Contributor

grindheim commented Nov 25, 2021

Describe the bug

Attempts to disable full-refresh for an incremental model, either by setting +full_refresh: false in dbt_project.yml or full_refresh = false in a model's config, are ignored when running dbt run --full-refresh.

Steps To Reproduce

Create a new, incremental model with the following definition:

{{ config(
    materialized = 'incremental',
    full_refresh = false
    ) 
}}
select CURRENT_TIMESTAMP() AS ChangeTimestamp

Then run dbt run --select <model_name> --full-refresh twice.

Finally, run a select query against the table that selects the min and max value for the change timestamp.

select min(ChangeTimestamp), max(ChangeTimestamp) from <database_name>.<model_name>

Both the min and the max return the same timestamp, which shows that the second run rebuilt the table instead of appending a row.
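To make the reasoning behind this check explicit, here is a minimal Python sketch of the two runs. It is not dbt code: `run_model`, the list-backed table, and the integer clock are hypothetical stand-ins for the model, the target table, and CURRENT_TIMESTAMP().

```python
from itertools import count


def run_model(table, rebuild, clock):
    """Simulate one run of the model against an in-memory table.

    `table` is a list of ChangeTimestamp values. `rebuild` says whether the
    run drops and recreates the table (full refresh) or appends to it.
    """
    row = next(clock)  # stands in for CURRENT_TIMESTAMP()
    if rebuild or not table:
        # A full refresh, or the very first run, leaves exactly one row.
        return [row]
    # An incremental run keeps the existing rows and adds the new one.
    return table + [row]


# Observed: both runs rebuild the table, so only the latest row survives.
clock = count(1)
observed = []
for _ in range(2):
    observed = run_model(observed, rebuild=True, clock=clock)

# Expected with full_refresh = false: the second run appends a row.
clock = count(1)
expected = []
for _ in range(2):
    expected = run_model(expected, rebuild=False, clock=clock)

print(min(observed), max(observed))  # 2 2 (min == max: table was rebuilt)
print(min(expected), max(expected))  # 1 2 (min < max: rows accumulated)
```

If the config were respected, the query would return two different timestamps after the second run.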

Expected behavior

The full_refresh config setting should take precedence over the --full-refresh flag (flags.FULL_REFRESH): an incremental model configured with full_refresh = false should always load incrementally, except on its first run, when the table is created.

The error is caused by line 13 in incremental.sql:
https://github.com/dbt-labs/dbt-spark/blob/main/dbt/include/spark/macros/materializations/incremental/incremental.sql
{%- set full_refresh_mode = (flags.FULL_REFRESH == True) -%}

This ignores the model's config and relies solely on the FULL_REFRESH flag.

The correct definition is the one used in dbt-core, dbt-snowflake, and other adapters:
https://github.com/dbt-labs/dbt-core/blob/v0.21.1rc1/core/dbt/include/global_project/macros/materializations/incremental/incremental.sql

{%- set full_refresh_mode = (should_full_refresh()) -%}

This has been tested locally and shown to work as expected.
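The difference between the two definitions can be sketched in Python. This is a model of the precedence described under "Expected behavior", not the macro source: the body of should_full_refresh() is not quoted in this issue, so `config_then_flag` is an assumption about how it resolves the value, and both function names are hypothetical.

```python
from typing import Optional


def flag_only(config_full_refresh: Optional[bool], flag_full_refresh: bool) -> bool:
    """Behavior reported here: the model config is never consulted."""
    # Mirrors `flags.FULL_REFRESH == True` from the dbt-spark macro.
    return flag_full_refresh == True  # noqa: E712


def config_then_flag(config_full_refresh: Optional[bool], flag_full_refresh: bool) -> bool:
    """Assumed precedence: an explicit config value wins over the CLI flag."""
    if config_full_refresh is None:
        # No full_refresh set in the model or project config: use the flag.
        return flag_full_refresh
    return config_full_refresh


# (config value, --full-refresh flag) combinations worth comparing.
cases = [(None, False), (None, True), (False, True), (True, False)]
for config_value, flag_value in cases:
    print(
        f"config={config_value!s:<5} flag={flag_value!s:<5} "
        f"flag_only={flag_only(config_value, flag_value)!s:<5} "
        f"config_then_flag={config_then_flag(config_value, flag_value)}"
    )
```

The case from this report is config=False with flag=True: the flag-only check requests a rebuild, while the config-aware check does not.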

Screenshots and log output

System information

The output of dbt --version:

installed version: 0.21.0
   latest version: 0.21.0

Up to date!

Plugins:
  - snowflake: 0.21.0
  - bigquery: 0.21.0
  - redshift: 0.21.0
  - spark: 0.21.0
  - postgres: 0.21.0

NB: the plugin in use is dbt-spark.

The operating system you're using:
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye

The output of python --version:
Python 3.8.12


@jtcohen6
Contributor

@grindheim Thanks for the issue and the PR!
