Update/rename package #81

Merged: fivetran-joemarkiewicz merged 28 commits into main from update/rename-package on Aug 3, 2023


@fivetran-jamie fivetran-jamie commented May 4, 2023

PR Overview

This PR will address the following Issue/Feature:

Height task T-390568
#83

This PR will result in the following new package version:

considering we are renaming the package, v1.0.0?

Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:

  • updates the fivetran_log prefix on all variables and models to fivetran_platform
  • updates the default build schema suffixes to be _stg_fivetran_platform and _fivetran_platform
  • changes the actual source name to fivetran_platform
  • updates README accordingly
  • adds missing documentation for _fivetran_synced in some sources
  • updates incremental strategy for the audit table model
    • where should we document this and tell people to run full refreshes every so often?
  • adds incremental runs to Buildkite
  • removes freshness tests from tables that probably don't update that often -- went somewhat rogue here but our tests are probably too strict
  • totally removes anything related to the account_membership table which was deprecated - [Feature] totally deprecate account_membership table #83

what this does NOT do:

  • change docs links
  • change the actual package project name (as this is what dbt hub looks at)
  • change the default schema
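For anyone upgrading, the prefix change listed above means any project variables that used the old prefix need the new one. A minimal sketch of that rename, assuming the variables live in a plain mapping; the variable names in `example_vars` are hypothetical and only illustrate the pattern, they are not taken from this PR:

```python
OLD_PREFIX = "fivetran_log"
NEW_PREFIX = "fivetran_platform"


def rename_prefixed_vars(project_vars: dict) -> dict:
    """Return a copy of project_vars with the old prefix swapped for the new one.

    Keys that do not start with the old prefix are left untouched.
    """
    renamed = {}
    for key, value in project_vars.items():
        if key.startswith(OLD_PREFIX):
            key = NEW_PREFIX + key[len(OLD_PREFIX):]
        renamed[key] = value
    return renamed


# Hypothetical variable names, for illustration only.
example_vars = {
    "fivetran_log_database": "analytics",
    "fivetran_log_schema": "fivetran_metadata",
    "some_other_var": True,
}
migrated_vars = rename_prefixed_vars(example_vars)
```

The helper only rewrites keys, so values and unrelated variables pass through unchanged.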

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt compile
  • dbt run --full-refresh
  • dbt run
  • dbt test
  • dbt run --vars (if applicable)

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked and tagged
  • You are assigned to the corresponding issue and this PR
  • BuildKite integration tests are passing

Detailed Validation

Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":

  • You have validated these changes and assure this PR will address the respective Issue/Feature.
  • You are reasonably confident these changes will not impact any other components of this package or any dependent packages.
  • You have provided details below around the validation steps performed to gain confidence in these changes.

ran the two versions of the package simultaneously (since they have different project + model names). I looked at various models, but here I will focus on the audit table model, since I updated the BigQuery + Spark incremental strategy for that model

  1. ran a full refresh and spot checked different tables + connectors, but here are the total aggregates compared

image

  2. ran an incremental run

image

everything checks out there -- should i validate any other way.. ?
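One way to make the aggregate comparison above repeatable is to total a metric per table for each package version and report any differences. This is only a rough sketch: the rows are made up, and `table_name` and `rows_written` are hypothetical column names standing in for whatever the audit table model actually exposes.

```python
from collections import defaultdict


def total_by_table(rows, value_key):
    """Sum value_key per table_name across a list of row dicts."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["table_name"]] += row[value_key]
    return dict(totals)


def diff_totals(old_totals, new_totals):
    """Return {table: (old, new)} for tables whose totals differ or exist on one side only."""
    mismatches = {}
    for table in sorted(set(old_totals) | set(new_totals)):
        old, new = old_totals.get(table), new_totals.get(table)
        if old != new:
            mismatches[table] = (old, new)
    return mismatches


# Made-up rows standing in for query results from each version of the model.
old_version_rows = [
    {"table_name": "media_insights", "rows_written": 120},
    {"table_name": "media_insights", "rows_written": 30},
    {"table_name": "users", "rows_written": 45},
]
new_version_rows = [
    {"table_name": "users", "rows_written": 45},
    {"table_name": "media_insights", "rows_written": 150},
]

mismatches = diff_totals(
    total_by_table(old_version_rows, "rows_written"),
    total_by_table(new_version_rows, "rows_written"),
)
```

An empty `mismatches` mapping means the two versions agree on every per-table total; any entry shows the old and new values side by side.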

Standard Updates

Please acknowledge that your PR contains the following standard updates:

  • Package versioning has been appropriately indexed in the following locations:
    • indexed within dbt_project.yml
    • indexed within integration_tests/dbt_project.yml
  • CHANGELOG has individual entries for each respective change in this PR
  • README updates have been applied (if applicable)
  • DECISIONLOG updates have been updated (if applicable)
  • Appropriate yml documentation has been added (if applicable)

dbt Docs

Please acknowledge that after the above were all completed the below were applied to your branch:

  • docs were regenerated (unless this PR does not include any code or yml updates)

If you had to summarize this PR in an emoji, which would it be?

📛

@fivetran-jamie fivetran-jamie self-assigned this May 8, 2023
@fivetran-jamie fivetran-jamie marked this pull request as ready for review May 8, 2023 23:51

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-jamie this looks great! I do have a few minor comments and suggestions inline before this is fully good to go.

In addition to the inline comments, would you be able to show a validation example of the incremental strategy loading only new rows on an incremental run? This will help us ensure the strategy is sound. You can do this by artificially filtering the source data on the full refresh, and then removing the filter on the next incremental run.

Finally, I feel this section (below) in the README would be the best place to inform Snowflake, Redshift, and Postgres users to periodically run a full refresh to ensure their incremental models capture updates. Would you be able to add a subsection there detailing this for customers?
image
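The validation idea suggested above (filter the source on the full refresh, then remove the filter for the incremental run) can be modeled in a few lines. This is a generic high-water-mark sketch, not the package's actual incremental strategy; the row shape, the `sync_day` field, and the `users` table are assumptions made for illustration.

```python
from datetime import date


def full_refresh(source_rows, cutoff=None):
    """Rebuild the target from scratch, optionally keeping only rows on or before cutoff."""
    return [r for r in source_rows if cutoff is None or r["sync_day"] <= cutoff]


def incremental_run(target_rows, source_rows):
    """Append source rows newer than the latest day already in the target.

    Returns (new_target, inserted_rows). An empty target loads everything.
    """
    high_water_mark = max((r["sync_day"] for r in target_rows), default=None)
    inserted = [
        r for r in source_rows
        if high_water_mark is None or r["sync_day"] > high_water_mark
    ]
    return target_rows + inserted, inserted


# Illustrative source data: two syncs on one day, one sync the next day.
source = [
    {"table_name": "media_insights", "sync_day": date(2021, 12, 9)},
    {"table_name": "users", "sync_day": date(2021, 12, 9)},
    {"table_name": "media_insights", "sync_day": date(2021, 12, 10)},
]

# Step 1: full refresh with an artificial filter that hides the latest day.
target = full_refresh(source, cutoff=date(2021, 12, 9))

# Step 2: remove the filter and run incrementally; only the hidden day should load.
target, inserted = incremental_run(target, source)
```

If the strategy is sound, `inserted` holds only the rows that the filter hid, and running the incremental step again inserts nothing.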

CHANGELOG.md (outdated, resolved)
README.md (outdated, resolved)
dbt_project.yml (outdated, resolved)
models/staging/src_fivetran.yml (outdated, resolved)
models/staging/stg_fivetran__log.sql (outdated, resolved)
@fivetran-jamie

@fivetran-joemarkiewicz added your changes! also for validation, i added some extra lines to the seed file

So, running before adding new dates:
image

I then add these new rows to the log seed file to simulate a sync of the media_insights table the next day (2021-12-10)
image

i then run dbt seed --full-refresh

i perform an incremental run, resulting in the following expected output (a new record for media_insights)
image


@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

lgtm thanks for applying the updates!


@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-jamie thanks for working through this! I left a few comments and code change suggestions. Let me know if you have any questions following my review.

CHANGELOG.md (outdated, resolved)
CHANGELOG.md (outdated, resolved)
README.md (outdated, resolved)
README.md (resolved)
README.md (outdated, resolved)
README.md (outdated, resolved)
README.md (outdated, resolved)
dbt_project.yml (resolved)
dbt_project.yml (resolved)
models/fivetran_platform__audit_table.sql (outdated, resolved)
fivetran-jamie and others added 9 commits July 5, 2023 15:05
Co-authored-by: Joe Markiewicz <[email protected]>
Co-authored-by: Joe Markiewicz <[email protected]>
Co-authored-by: Joe Markiewicz <[email protected]>
Co-authored-by: Joe Markiewicz <[email protected]>

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

One small comment in the README, but other than that - this looks good to go!

README.md (outdated, resolved)
@fivetran-joemarkiewicz fivetran-joemarkiewicz merged commit 9655896 into main Aug 3, 2023
@fivetran-joemarkiewicz fivetran-joemarkiewicz deleted the update/rename-package branch August 3, 2023 13:54