[CT-2057] [Regression] Flaky generic tests on multiple threads in dbt 1.4.1 #6885
Comments
@gastlich Thanks for opening! Are the two generic tests both defined on an ephemeral model? Or a model that depends on an ephemeral model?
The code you linked is related to how ephemeral models are compiled into downstream nodes (as CTEs). We made a few changes to that code in v1.4, which were behind-the-scenes refactoring and cleanup of our node types (#6384, #6427). While those changes shouldn't have had any functional effect, it's of course possible that this is an unintended consequence.
A clear & reliable reproduction case would help us get to the bottom of this. I've tried reproducing with:
-- models/my_model.sql
{{ config(materialized = 'ephemeral') }}
select 1 as id
# models/config.yml
version: 2
models:
  - name: my_model
    columns:
      - name: id
        tests:
          - unique
          - not_null
But I haven't been able to reproduce the error. I'm able to run both tests with multiple threads successfully.
Hello @jtcohen6. Thanks for coming back to me so quickly :) I've managed to recreate the issue on a clean project:
Setup
# schema.yml
version: 2
models:
  - name: int_eph_first
    columns:
      - name: first_column
        tests:
          - not_null
      - name: second_column
        tests:
          - not_null
  - name: fct_eph_first
    columns:
      - name: first_column
        tests:
          - not_null
      - name: second_column
        tests:
          - not_null
-- int_eph_first.sql
{{ config(materialized='ephemeral') }}
select
  1 as first_column,
  2 as second_column,
-- fct_eph_first.sql
{{ config(materialized='ephemeral') }}
with int_eph_first as (
  select * from {{ ref('int_eph_first') }}
)
select * from int_eph_first
Single-threaded outcome
Multi-threaded outcome
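One way to compare the single-threaded and multi-threaded outcomes for a flaky failure is to run the tests several times at each thread count and count the failing runs. The following is only a rough sketch: the helper names (`dbt_test_command`, `run_once`, `count_failures`) are made up for illustration, and it assumes `dbt` is on the PATH and is invoked from the project directory.

```python
import subprocess
from typing import Callable, List, Sequence


def dbt_test_command(threads: int) -> List[str]:
    """Build a `dbt test` invocation with an explicit thread count."""
    return ["dbt", "test", "--threads", str(threads)]


def run_once(cmd: Sequence[str]) -> int:
    """Run the command and return its exit code (0 means every test passed)."""
    return subprocess.run(cmd, check=False).returncode


def count_failures(
    threads: int,
    attempts: int,
    runner: Callable[[Sequence[str]], int] = run_once,
) -> int:
    """Run `dbt test` `attempts` times and count the runs that exit non-zero."""
    cmd = dbt_test_command(threads)
    return sum(1 for _ in range(attempts) if runner(cmd) != 0)
```

Calling `count_failures(1, 10)` and then `count_failures(4, 10)` gives a quick comparison of the two setups. The `runner` parameter exists only so the loop can be exercised without a real dbt installation.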
@gastlich Thanks for the full reproduction case! I was able to reliably produce the error while running locally with
It sounds like the needed ingredients are:
We'll need to dig in some more to understand what's actually going on behind the scenes. It sounds like a thread safety issue with
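To illustrate the general kind of thread safety problem being suspected here (this is not dbt's actual code, and the root cause had not been confirmed at this point), here is a minimal sketch. `SharedNode` is a hypothetical stand-in for a node shared between threads. One thread marks it as compiled before filling in its SQL, so a second thread that trusts the flag can observe a missing `sql` value. Events are used to force that interleaving so the effect is deterministic.

```python
import threading
from dataclasses import dataclass
from typing import Optional

SQL = "select 1 as first_column"


@dataclass
class SharedNode:
    # Hypothetical stand-in for a node shared between threads; not a dbt class.
    compiled: bool = False
    sql: Optional[str] = None


def read_sql(node: SharedNode) -> Optional[str]:
    # A reader that trusts the `compiled` flag alone.
    return node.sql if node.compiled else "<not compiled>"


def demo_unsafe() -> Optional[str]:
    node = SharedNode()
    flag_set, may_finish = threading.Event(), threading.Event()

    def writer() -> None:
        node.compiled = True   # flag is published first ...
        flag_set.set()
        may_finish.wait()      # ... simulated delay before sql is filled in
        node.sql = SQL

    t = threading.Thread(target=writer)
    t.start()
    flag_set.wait()            # force the unlucky interleaving
    seen = read_sql(node)      # compiled is True, sql is still None
    may_finish.set()
    t.join()
    return seen


def demo_locked() -> Optional[str]:
    node = SharedNode()
    lock = threading.Lock()
    writer_has_lock = threading.Event()

    def writer() -> None:
        with lock:             # flag and sql change together under the lock
            node.compiled = True
            writer_has_lock.set()
            node.sql = SQL

    t = threading.Thread(target=writer)
    t.start()
    writer_has_lock.wait()
    with lock:                 # reader waits until the writer is done
        seen = read_sql(node)
    t.join()
    return seen
```

In this sketch `demo_unsafe()` returns `None`, while `demo_locked()` returns the SQL string, because the reader cannot look at the node until both fields have been written.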
@gshank is this scheduled to be in a 1.4 release?
Is this a regression in a recent version of dbt-core?
Current Behavior
Hello,
Our project faced failures in CI test jobs after upgrading to the latest dbt version (1.4.1). The failures were due to flaky generic tests. After investigating, I narrowed the problem down to two generic tests running on a single model, which is enough to cause the failure.
We are still running the older version 1.3.2 on main and everything is working fine and reliably.
The issue appears to be related to a concurrency/race condition problem. This was confirmed when the following was discovered:
The issue is related to:
https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/compilation.py#L243
Some of the CTEs don't have their sql attribute set:
I'm not familiar with how the InjectedCTE entities are created in concurrent mode, but it would be good to hear from you if you have any suggestions/workarounds for this error.
Expected/Previous Behavior
We expect all tests to succeed in a multi-threaded environment with dbt version 1.4.1.
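As a debugging aid for the missing sql attribute described above, a small check like the following could list which CTEs are affected. This is only a sketch: `CTE` is a hypothetical stand-in with an `id` and an optional `sql` field, not dbt's own InjectedCTE class, and `missing_sql` is a made-up helper name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CTE:
    # Hypothetical stand-in for an injected CTE; field names are assumptions.
    id: str
    sql: Optional[str] = None


def missing_sql(ctes: List[CTE]) -> List[str]:
    """Return the ids of CTEs whose sql is unset or empty."""
    return [cte.id for cte in ctes if not cte.sql]
```

Given one CTE with SQL and one without, `missing_sql` returns only the id of the one without, which makes it easier to see which upstream model was not fully compiled when the test ran.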
Steps To Reproduce
I haven't yet tried setting up a new project and running the tests on a clean setup, but based on our findings, we should focus on:
profiles.yml
not_null
)
Relevant log output
Environment
Which database adapter are you using with dbt?
bigquery
Additional Context
No response