[CT-1535] Investigate performance of functional tests #6289

Closed
gshank opened this issue Nov 18, 2022 · 5 comments
Labels
repo ci/cd Testing and continuous integration for dbt-core + adapter plugins tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality

Comments

@gshank
Contributor

gshank commented Nov 18, 2022

It takes a long time to run our functional/integration tests now. Investigate to identify if there are particular long-running tests which are slowing things down, or if it's the nature of our testing framework. Right now there is a large delay in getting the results of our functional test runs which delays the processing of pull requests.

Identify steps that could be taken to improve the test speed. One option is to split up the tests into subsets so that they can be run in parallel. We could possibly limit the testing of multiple versions of Python to a once-daily build.
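As a rough illustration of the "split into subsets" option, here is a minimal sketch of deterministic sharding: each test file is assigned to one of N subsets by hashing its path, so every CI job can compute its own subset independently. The function names and the example paths are hypothetical and are not part of dbt-core or its CI setup.

```python
import hashlib


def shard_for(test_path: str, num_shards: int) -> int:
    """Map a test file path to a shard index in [0, num_shards)."""
    # A stable hash (unlike the built-in hash()) gives the same answer
    # on every machine and in every Python process.
    digest = hashlib.sha256(test_path.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def split_into_shards(test_paths, num_shards):
    """Group test paths into num_shards lists, one per parallel CI job."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    shards = [[] for _ in range(num_shards)]
    for path in sorted(test_paths):
        shards[shard_for(path, num_shards)].append(path)
    return shards


if __name__ == "__main__":
    # Hypothetical test file names, for illustration only.
    example_paths = [
        "tests/functional/test_a.py",
        "tests/functional/test_b.py",
        "tests/functional/test_c.py",
        "tests/functional/test_d.py",
    ]
    for index, shard in enumerate(split_into_shards(example_paths, 2)):
        print(index, shard)
```

Hash-based sharding needs no shared state between jobs, but it does not balance by runtime; if a few tests dominate, splitting by measured durations would likely work better.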

@github-actions github-actions bot changed the title Investigate performance of functional tests [CT-1535] Investigate performance of functional tests Nov 18, 2022
@gshank gshank added the dbt tests Issues related to built-in dbt testing functionality label Nov 18, 2022
@gshank
Contributor Author

gshank commented Nov 18, 2022

I am seeing a bunch of my functional test runs canceled when they reach 45 minutes.

@nitinbhojwani

Another area of performance improvement:

dbt invokes one query per test. For example, if a null check is configured for 3 fields in a model, it runs 3 separate queries against the target data store.

I think it's suboptimal and there should be a way to group them in a single query.
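To make the idea concrete, here is a minimal sketch of what a grouped null check could look like: one query that returns a null count per column instead of one query per column. This is only an illustration of the suggestion, not how dbt generates its test SQL; the function name, the `_null_count` alias suffix, and the table and column names are all hypothetical.

```python
def build_null_check_query(table: str, columns: list[str]) -> str:
    """Build one SQL query that counts nulls for several columns at once."""
    if not columns:
        raise ValueError("at least one column is required")
    # One aggregate per column, so the table is scanned a single time.
    # Note: identifiers are interpolated as-is; a real implementation
    # would need to quote or validate them.
    counts = ",\n".join(
        f"    sum(case when {col} is null then 1 else 0 end) as {col}_null_count"
        for col in columns
    )
    return f"select\n{counts}\nfrom {table}"


if __name__ == "__main__":
    print(build_null_check_query("orders", ["id", "customer_id", "status"]))
```

A non-zero value in any of the resulting columns would then correspond to a failing null check for that field.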

@jtcohen6 jtcohen6 added repo ci/cd Testing and continuous integration for dbt-core + adapter plugins tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality and removed dbt tests Issues related to built-in dbt testing functionality labels Nov 20, 2022
@jtcohen6
Contributor

Relabeling this from tests to ci/cd, since this is about our integration/functional tests (pytest) that run in CI for dbt-core, rather than the built-in feature for data testing (dbt test).

@nitinbhojwani For what you're asking, you might be interested in:

@max-sixty
Contributor

max-sixty commented Nov 28, 2022

At risk of commenting from the peanut gallery, a couple of things that might be helpful:

  • pytest has a --durations option which will show the longest running tests
  • There's also a -n pytest option for running in parallel (although GH runners are fairly small)
  • Something to consider is running some set of slow tests either after merging or when specifically requested with a label. That way, they don't get in the way of most small PRs. If a break does mistakenly get through, it's always easy to revert. Here's how we do that in PRQL: https://github.com/prql/prql/blob/0.2.11/.github/workflows/pull-request.yaml#L58-L69
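For the last point, one way to keep slow tests out of the default run is a `conftest.py` that deselects anything marked `slow` unless a flag is passed. The sketch below is an assumption about how this could be wired up, not what dbt-core or PRQL actually do: the `slow` marker name and the `--run-slow` flag are hypothetical, while the hooks themselves (`pytest_addoption`, `pytest_configure`, `pytest_collection_modifyitems`) are standard pytest hooks.

```python
# conftest.py (sketch): skip tests marked "slow" unless --run-slow is given.


def pytest_addoption(parser):
    # Hypothetical flag; CI could pass it after merge or when a PR has a label.
    parser.addoption(
        "--run-slow",
        action="store_true",
        default=False,
        help="also run tests marked as slow",
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "slow: long-running test, opt-in only")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-slow"):
        return  # Run everything, including slow tests.
    kept, deselected = [], []
    for item in items:
        if item.get_closest_marker("slow") is not None:
            deselected.append(item)
        else:
            kept.append(item)
    if deselected:
        # Report the removed tests as deselected and shrink the run in place.
        config.hook.pytest_deselected(items=deselected)
        items[:] = kept
```

A test would opt in with `@pytest.mark.slow`. The candidates for that marker can be found with `pytest --durations=20`, and the remaining fast tests can still be spread across workers with `-n` from pytest-xdist.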

@jtcohen6
Contributor

Two next steps here:

Going to close this issue for now, given there are a few threads to pull on being tracked in other places.
