From a582ff67859e8ca34124fb3a2e05b5716f27ffeb Mon Sep 17 00:00:00 2001 From: Ly Nguyen Date: Fri, 15 Mar 2024 09:02:29 -0700 Subject: [PATCH 01/11] Improvements to applied state metadata --- .../docs/docs/dbt-cloud-apis/project-state.md | 24 ++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/website/docs/docs/dbt-cloud-apis/project-state.md b/website/docs/docs/dbt-cloud-apis/project-state.md index 78dff4309db..e649d88e4dd 100644 --- a/website/docs/docs/dbt-cloud-apis/project-state.md +++ b/website/docs/docs/dbt-cloud-apis/project-state.md @@ -14,7 +14,7 @@ There are two states that can be queried in dbt Cloud: - **Definition state** depends on what exists in the project given the code defined in it (for example, manifest state), which hasn’t necessarily been executed in the data platform (maybe just the result of `dbt compile`). -### Definition (logical) vs. applied state of dbt nodes +## Definition (logical) vs. applied state of dbt nodes In a dbt project, the state of a node _definition_ represents the configuration, transformations, and dependencies defined in the SQL and YAML files. It captures how the node should be processed in relation to other nodes and tables in the data warehouse and may be produced by a `dbt build`, `run`, `parse`, or `compile`. It changes whenever the project code changes. @@ -57,7 +57,7 @@ query Compare($environmentId: Int!, $first: Int!) { Most Discovery API use cases will favor the _applied state_ since it pertains to what has actually been run and can be analyzed. -### Affected states by node type +## Affected states by node type | Node | Executed in DAG | Created by execution | Exists in database | Lineage | States | |-----------|------------------|----------------------|--------------------|-----------------------|----------------------| @@ -72,7 +72,7 @@ Most Discovery API use cases will favor the _applied state_ since it pertains to | Group | No | No | No | Downstream | Definition | | Macro | Yes | No | No | N/A | Definition | - ### Caveats about state/metadata updates +## Caveats about state/metadata updates Over time, Cloud Artifacts will provide information to maintain state for features/services in dbt Cloud and enable you to access state in dbt Cloud and its downstream ecosystem. Cloud Artifacts is currently focused on the latest production state, but this focus will evolve. @@ -83,3 +83,21 @@ Here are some limitations of the state representation in the Discovery API: - Compiled code results may be outdated depending on dbt Cloud run step order and failures. - Catalog info can be outdated, or incomplete (in the applied state), based on if/when `docs generate` was last run. - Source freshness checks can be out of date (in the applied state) depending on when the command was last run, and it’s not included in `build`. + + +## Adapter features for applied state + +The following lists the features available for adapters: + +| Adapter | Catalog | Source freshness | +|---------|---------|------------------| +| `dbt-snowflake` | incremental | metadata-based | +| `dbt-spark` | manual run | `loaded_at` field | + +### Catalog + +You can build the catalog incrementally for adapters that support it. This allows for the catalog to be built along with the model, which eliminates the need to run a lengthy `dbt docs generate` at the end of a dbt run. For adapters that don't support incremental catalog generation, you must run `dbt docs generate` to build the catalog. 
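For illustration, a run on an adapter without incremental catalog support might look like the following sketch (both commands are standard dbt CLI; the ordering is the point):

```shell
dbt run            # build or refresh the models themselves
dbt docs generate  # separate, potentially lengthy catalog build at the end
```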
+ +### Source freshness + +You can measure source freshness using the metadata when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. \ No newline at end of file From 667c6e92fe546a4ad2e7d8e7cb5c3fd7e4ad9b28 Mon Sep 17 00:00:00 2001 From: Ly Nguyen Date: Fri, 15 Mar 2024 09:24:00 -0700 Subject: [PATCH 02/11] Minor nit --- website/docs/docs/dbt-cloud-apis/project-state.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/dbt-cloud-apis/project-state.md b/website/docs/docs/dbt-cloud-apis/project-state.md index e649d88e4dd..54ec5d830b7 100644 --- a/website/docs/docs/dbt-cloud-apis/project-state.md +++ b/website/docs/docs/dbt-cloud-apis/project-state.md @@ -85,7 +85,7 @@ Here are some limitations of the state representation in the Discovery API: - Source freshness checks can be out of date (in the applied state) depending on when the command was last run, and it’s not included in `build`. -## Adapter features for applied state +## Supported features for applied state The following lists the features available for adapters: From 113761ab403e1cdec2460c8e47435409b9abc46e Mon Sep 17 00:00:00 2001 From: Ly Nguyen Date: Wed, 20 Mar 2024 17:05:17 -0700 Subject: [PATCH 03/11] Feedback --- website/docs/docs/dbt-cloud-apis/project-state.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/dbt-cloud-apis/project-state.md b/website/docs/docs/dbt-cloud-apis/project-state.md index 54ec5d830b7..9320d4d4fcb 100644 --- a/website/docs/docs/dbt-cloud-apis/project-state.md +++ b/website/docs/docs/dbt-cloud-apis/project-state.md @@ -96,7 +96,7 @@ The following lists the features available for adapters: ### Catalog -You can build the catalog incrementally for adapters that support it. This allows for the catalog to be built along with the model, which eliminates the need to run a lengthy `dbt docs generate` at the end of a dbt run. For adapters that don't support incremental catalog generation, you must run `dbt docs generate` to build the catalog. +You can build the catalog incrementally for adapters that support it. This allows for the catalog to be built along with the model, which eliminates the need to run a lengthy `dbt docs generate` at the end of a dbt run. For adapters that don't support incremental catalog generation, you must run `dbt docs generate --select ...` to build the catalog. 
### Source freshness From 8b98e71bd93aff6e670d712f5bc17bc90ca484f7 Mon Sep 17 00:00:00 2001 From: Ly Nguyen Date: Mon, 25 Mar 2024 09:18:49 -0700 Subject: [PATCH 04/11] Feedback --- .../about-core-connections.md | 24 +++++++++++++++++++ .../docs/docs/dbt-cloud-apis/project-state.md | 18 -------------- 2 files changed, 24 insertions(+), 18 deletions(-) diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index 097d0e4edb7..b40477f0dfb 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -30,3 +30,27 @@ These connection instructions provide the basic fields required for configuring If you're using dbt from the command line (CLI), you'll need a profiles.yml file that contains the connection details for your data platform. When you run dbt from the CLI, it reads your dbt_project.yml file to find the profile name, and then looks for a profile with the same name in your profiles.yml file. This profile contains all the information dbt needs to connect to your data platform. For detailed info, you can refer to the [Connection profiles](/docs/core/connect-data-platform/connection-profiles). + + +## Adapter features + +The following table lists the features available for adapters: + +| Adapter | Catalog | Source freshness | +|---------|---------|------------------| +| dbt default configuration | manual run | `loaded_at` field | +| `dbt-bigquery` | incremental | metadata-based | +| `dbt-databricks` | manual run | metadata-based | +| `dbt-postgres` | incremental | `loaded_at` field | +| `dbt-redshift` | incremental | metadata-based | +| `dbt-snowflake` | incremental | metadata-based | +| `dbt-spark` | manual run | `loaded_at` field | + + +### Catalog + +You can build the catalog incrementally for adapters that support it. This allows for the catalog to be built along with the model, which eliminates the need to run a lengthy `dbt docs generate --select ...` at the end of a dbt run. For adapters that don't support incremental catalog generation, you must run `dbt docs generate --select ...` to build the catalog. + +### Source freshness + +You can measure source freshness using the metadata when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. \ No newline at end of file diff --git a/website/docs/docs/dbt-cloud-apis/project-state.md b/website/docs/docs/dbt-cloud-apis/project-state.md index 9320d4d4fcb..b424c6c29c8 100644 --- a/website/docs/docs/dbt-cloud-apis/project-state.md +++ b/website/docs/docs/dbt-cloud-apis/project-state.md @@ -83,21 +83,3 @@ Here are some limitations of the state representation in the Discovery API: - Compiled code results may be outdated depending on dbt Cloud run step order and failures. - Catalog info can be outdated, or incomplete (in the applied state), based on if/when `docs generate` was last run. - Source freshness checks can be out of date (in the applied state) depending on when the command was last run, and it’s not included in `build`. 
- - -## Supported features for applied state - -The following lists the features available for adapters: - -| Adapter | Catalog | Source freshness | -|---------|---------|------------------| -| `dbt-snowflake` | incremental | metadata-based | -| `dbt-spark` | manual run | `loaded_at` field | - -### Catalog - -You can build the catalog incrementally for adapters that support it. This allows for the catalog to be built along with the model, which eliminates the need to run a lengthy `dbt docs generate` at the end of a dbt run. For adapters that don't support incremental catalog generation, you must run `dbt docs generate --select ...` to build the catalog. - -### Source freshness - -You can measure source freshness using the metadata when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. \ No newline at end of file From 740ee75441d9224c474b8f591cf47fefb786847b Mon Sep 17 00:00:00 2001 From: Ly Nguyen Date: Wed, 27 Mar 2024 10:08:58 -0700 Subject: [PATCH 05/11] PM feedback --- .../about-core-connections.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index b40477f0dfb..12c8f4046c1 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -31,6 +31,7 @@ If you're using dbt from the command line (CLI), you'll need a profiles.yml file For detailed info, you can refer to the [Connection profiles](/docs/core/connect-data-platform/connection-profiles). + ## Adapter features @@ -39,18 +40,20 @@ The following table lists the features available for adapters: | Adapter | Catalog | Source freshness | |---------|---------|------------------| | dbt default configuration | manual run | `loaded_at` field | -| `dbt-bigquery` | incremental | metadata-based | -| `dbt-databricks` | manual run | metadata-based | +| `dbt-bigquery` | incremental | metadata-based and `loaded_at` field | +| `dbt-databricks` | manual run | metadata-based and `loaded_at` field | | `dbt-postgres` | incremental | `loaded_at` field | -| `dbt-redshift` | incremental | metadata-based | -| `dbt-snowflake` | incremental | metadata-based | +| `dbt-redshift` | incremental | metadata-based and `loaded_at` field | +| `dbt-snowflake` | incremental | metadata-based and `loaded_at` field | | `dbt-spark` | manual run | `loaded_at` field | ### Catalog -You can build the catalog incrementally for adapters that support it. This allows for the catalog to be built along with the model, which eliminates the need to run a lengthy `dbt docs generate --select ...` at the end of a dbt run. For adapters that don't support incremental catalog generation, you must run `dbt docs generate --select ...` to build the catalog. +For adapters that support it, you can partially build the catalog. This allows for the catalog to be built along with the model, eliminating the need to run a lengthy `dbt docs generate --select ...` at the end of a dbt run. For adapters that don't support incremental catalog generation, you must run `dbt docs generate --select ...` to build the catalog. 
### Source freshness +You can measure source freshness using the warehouse metadata tables when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. +You can measure source freshness using the metadata when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. -You can measure source freshness using the metadata when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. \ No newline at end of file + \ No newline at end of file From 8868660f412657fd3cb7057cb3d3f1e2aec2d3d4 Mon Sep 17 00:00:00 2001 From: Ly Nguyen Date: Wed, 27 Mar 2024 10:12:23 -0700 Subject: [PATCH 06/11] Feedback --- .../docs/core/connect-data-platform/about-core-connections.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index 12c8f4046c1..9692b78bcc1 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -54,6 +54,6 @@ For adapters that support it, you can partially build the catalog. This allows f ### Source freshness You can measure source freshness using the warehouse metadata tables when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. -You can measure source freshness using the metadata when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. +You can measure source freshness using the metadata when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible (though it might be inaccurate at times). You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. 
\ No newline at end of file From 2688decd3ba6f1a98fbc641773ce744035af1457 Mon Sep 17 00:00:00 2001 From: Ly Nguyen <107218380+nghi-ly@users.noreply.github.com> Date: Wed, 27 Mar 2024 10:43:32 -0700 Subject: [PATCH 07/11] Update website/docs/docs/core/connect-data-platform/about-core-connections.md Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com> --- .../about-core-connections.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index 9692b78bcc1..0a4b7340b2a 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -39,13 +39,13 @@ The following table lists the features available for adapters: | Adapter | Catalog | Source freshness | |---------|---------|------------------| -| dbt default configuration | manual run | `loaded_at` field | -| `dbt-bigquery` | incremental | metadata-based and `loaded_at` field | -| `dbt-databricks` | manual run | metadata-based and `loaded_at` field | -| `dbt-postgres` | incremental | `loaded_at` field | -| `dbt-redshift` | incremental | metadata-based and `loaded_at` field | -| `dbt-snowflake` | incremental | metadata-based and `loaded_at` field | -| `dbt-spark` | manual run | `loaded_at` field | +| dbt default configuration | full | `loaded_at` field | +| `dbt-bigquery` | partial and full | metadata-based and `loaded_at` field | +| `dbt-databricks` | full | metadata-based and `loaded_at` field | +| `dbt-postgres` | partial and full | `loaded_at` field | +| `dbt-redshift` | partial and full | metadata-based and `loaded_at` field | +| `dbt-snowflake` | partial and full | metadata-based and `loaded_at` field | +| `dbt-spark` | full | `loaded_at` field | ### Catalog From 276fd06a649cf4cf8a8e3828b4b4b3908ece282c Mon Sep 17 00:00:00 2001 From: Ly Nguyen <107218380+nghi-ly@users.noreply.github.com> Date: Wed, 27 Mar 2024 10:44:22 -0700 Subject: [PATCH 08/11] Update website/docs/docs/core/connect-data-platform/about-core-connections.md Co-authored-by: Grace Goheen <53586774+graciegoheen@users.noreply.github.com> --- .../docs/core/connect-data-platform/about-core-connections.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index 0a4b7340b2a..4e050aa5bfc 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -50,7 +50,7 @@ The following table lists the features available for adapters: ### Catalog -For adapters that support it, you can partially build the catalog. This allows for the catalog to be built along with the model, eliminating the need to run a lengthy `dbt docs generate --select ...` at the end of a dbt run. For adapters that don't support incremental catalog generation, you must run `dbt docs generate --select ...` to build the catalog. +For adapters that support it, you can partially build the catalog. This allows for the catalog to be built for only a select number of models, by running `dbt docs generate --select ...`. For adapters that don't support partial catalog generation, you must run the full `dbt docs generate` to build the entire catalog. 
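As a sketch of the two modes (the `orders` model name and the `+` graph selector are illustrative):

```shell
# Partial catalog, on adapters that support it: only the selected models
dbt docs generate --select orders+

# Full catalog, required on adapters without partial support
dbt docs generate
```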
### Source freshness You can measure source freshness using the warehouse metadata tables when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. From 6a5f96c94a0f88d48eed4e85b3fef41e787ec159 Mon Sep 17 00:00:00 2001 From: Ly Nguyen Date: Wed, 27 Mar 2024 11:04:33 -0700 Subject: [PATCH 09/11] Feedback --- .../core/connect-data-platform/about-core-connections.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index 4e050aa5bfc..4f84c576f31 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -50,10 +50,9 @@ The following table lists the features available for adapters: ### Catalog -For adapters that support it, you can partially build the catalog. This allows for the catalog to be built for only a select number of models, by running `dbt docs generate --select ...`. For adapters that don't support partial catalog generation, you must run the full `dbt docs generate` to build the entire catalog. +For adapters that support it, you can partially build the catalog. This allows the catalog to be built for only a select number of models via `dbt docs generate --select ...`. For adapters that don't support partial catalog generation, you must run `dbt docs generate` to build the entire (full) catalog. ### Source freshness -You can measure source freshness using the warehouse metadata tables when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible. You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. -You can measure source freshness using the metadata when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible (though it might be inaccurate at times). You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. +You can measure source freshness using the warehouse metadata tables when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible (though it might be inaccurate at times, depending on how the warehouse tracks altered tables). You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. 
\ No newline at end of file From a3d2c976cc91f716d203de88f7b1f36080ff9976 Mon Sep 17 00:00:00 2001 From: Ly Nguyen <107218380+nghi-ly@users.noreply.github.com> Date: Wed, 27 Mar 2024 12:36:27 -0700 Subject: [PATCH 10/11] Update website/docs/docs/core/connect-data-platform/about-core-connections.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- .../docs/core/connect-data-platform/about-core-connections.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index 4f84c576f31..56c9e7d1fdc 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -50,7 +50,7 @@ The following table lists the features available for adapters: ### Catalog -For adapters that support it, you can partially build the catalog. This allows the catalog to be built for only a select number of models via `dbt docs generate --select ...`. For adapters that don't support partial catalog generation, you must run `dbt docs generate` to build the entire (full) catalog. +For adapters that support it, you can partially build the catalog. This allows the catalog to be built for only a select number of models via `dbt docs generate --select ...`. For adapters that don't support partial catalog generation, you must run `dbt docs generate` to build the full catalog. ### Source freshness You can measure source freshness using the warehouse metadata tables when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. This is faster and more flexible (though it might be inaccurate at times, depending on how the warehouse tracks altered tables). You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field. From 9a318a136368e61d172e207a1450b7d98e33eee2 Mon Sep 17 00:00:00 2001 From: Ly Nguyen <107218380+nghi-ly@users.noreply.github.com> Date: Wed, 27 Mar 2024 12:37:38 -0700 Subject: [PATCH 11/11] Update website/docs/docs/core/connect-data-platform/about-core-connections.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- .../docs/core/connect-data-platform/about-core-connections.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index 56c9e7d1fdc..726ace37d88 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -53,6 +53,6 @@ The following table lists the features available for adapters: For adapters that support it, you can partially build the catalog. This allows the catalog to be built for only a select number of models via `dbt docs generate --select ...`. For adapters that don't support partial catalog generation, you must run `dbt docs generate` to build the full catalog. ### Source freshness -You can measure source freshness using the warehouse metadata tables when the adapter supports it. This allows for calculating source freshness without using the `loaded_at` field and without querying the table directly. 
This is faster and more flexible (though it might be inaccurate at times, depending on how the warehouse tracks altered tables). You can override this with the `loaded_at` field in the model config. If the adapter doesn't support this, you can still use the `loaded_at` field.
+You can measure source freshness using the warehouse metadata tables on supported adapters. This allows for calculating source freshness without using the `loaded_at_field` and without querying the table directly. This is faster and more flexible (though it might sometimes be inaccurate, depending on how the warehouse tracks altered tables). You can override the metadata-based check by setting the `loaded_at_field` in the source config. If the adapter doesn't support metadata-based freshness, you can still use the `loaded_at_field`.
\ No newline at end of file
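To make the final behavior concrete, here is a minimal sources YAML sketch; the source, table, and column names are hypothetical:

```yaml
version: 2

sources:
  - name: raw_shop
    schema: raw
    freshness:                                # applies to all tables in the source
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: customers                       # no loaded_at_field: supported adapters
                                              # fall back to warehouse metadata
      - name: orders
        loaded_at_field: _etl_loaded_at       # override: freshness is computed by
                                              # querying this column directly
```

Running `dbt source freshness` would then use warehouse metadata for `customers` and query `_etl_loaded_at` for `orders`.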