Docs: update schema change capability #28200
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,27 @@ | ||
# Manage schema changes | ||
|
||
Once every 24 hours, Airbyte checks for changes in your source schema and allows you to review the changes and fix breaking changes. This process helps ensure accurate and efficient data syncs, minimizing errors and saving you time and effort in managing your data pipelines. | ||
You can specify for each connection how Airbyte should handle any change of schema in the source. This process helps ensure accurate and efficient data syncs, minimizing errors and saving you time and effort in managing your data pipelines. | ||
|
||
:::note | ||
Airbyte checks for any changes in your source schema before every sync or once every 24 hours, whichever is more frequent. | ||
|
||
Schema changes are flagged in your connection but are not propagated to your destination. | ||
|
||
::: | ||
Based on your configured settings for "Detect and propagate schema changes", Airbyte can automatically sync those changes or ignore them: | ||
* **Propagate all changes** automatically propagates stream changes (additions or deletions) or column changes (additions or deletions) detected in the source | ||
* **Propagate column changes only** automatically propagates column changes detected in the source | ||
* **Ignore** any schema change, in which case the schema you’ve set up stays the same, even if the source schema changes, until you approve the changes manually | ||
* **Pause connection** disables the connection from syncing further once a change is detected | ||
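As a rough illustration of how the four settings above differ, the following sketch maps each setting to the set of detected changes that would be propagated. It is not Airbyte's implementation: the `Policy` enum, the `(kind, action, name)` tuples, and both function names are hypothetical and exist only for this example.

```python
from enum import Enum


class Policy(Enum):
    # Hypothetical names mirroring the four options listed above.
    PROPAGATE_ALL = "propagate_all"
    PROPAGATE_COLUMNS = "propagate_columns"
    IGNORE = "ignore"
    PAUSE = "pause"


def changes_to_apply(policy, changes):
    """Return the detected changes that would be propagated automatically.

    Each change is a (kind, action, name) tuple, where kind is
    "stream" or "column" (an assumed shape for this sketch).
    """
    if policy is Policy.PROPAGATE_ALL:
        return list(changes)
    if policy is Policy.PROPAGATE_COLUMNS:
        return [change for change in changes if change[0] == "column"]
    # IGNORE and PAUSE propagate nothing automatically.
    return []


def connection_stays_enabled(policy, changes):
    """Only the pause setting disables the connection when a change is detected."""
    return not (policy is Policy.PAUSE and len(changes) > 0)


detected = [
    ("stream", "added", "orders"),
    ("column", "added", "users.email"),
    ("column", "removed", "users.fax"),
]
columns_only = changes_to_apply(Policy.PROPAGATE_COLUMNS, detected)
```

In this sketch, `columns_only` holds just the two column changes, while the new `orders` stream is left for manual review.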
|
||
When a new column is detected and propagated, values for that column are filled in for the updated rows. To populate the column for rows that were not updated, run a full refresh to backfill the missing values. | ||
|
||
When a column is deleted, its values stop updating, and the column is filled with Null values for the updated rows. | ||
I think that we will actually now delete these columns immediately, based on the related convo in Normalization.
@alex-gron Was thinking about this more today after our chat - I agree that's the behavior we decided on. Does that mean the proposed changes will come with Destinations V2, which I believe is coming at the end of Q3? On the one hand, we could publish these with the foresight that those changes will be coming, or re-publish with the updates when V2 is officially released. I'm leaning towards the latter but let me know if you think it's better to just update now!
Sorry for the late reply! I'm good with that option :) |
||
|
||
When a new stream is detected and propagated, the first sync will fill all data in as if it is a historical sync. When a stream is deleted from the source, the stream will stop updating, and we leave any existing data in the destination. The rest of the enabled streams will continue syncing. | ||
|
||
In all cases, if a breaking change is detected, the connection will be paused for manual review to prevent future syncs from failing. Breaking schema changes occur when: | ||
* The data type of a field from the source changes | ||
* An existing primary key is removed from the source | ||
* An existing cursor is removed from the source | ||
|
||
See "Fix breaking schema changes" to understand how to resolve these types of changes. | ||
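The three breaking-change conditions listed above can be expressed as a small check that compares the previous and current source fields. This is only a sketch under stated assumptions: schemas are represented as plain `name -> type` dictionaries, and `find_breaking_changes` is a hypothetical helper, not part of Airbyte.

```python
def find_breaking_changes(old_fields, new_fields, primary_key, cursor):
    """Return a list of reasons why a schema change is breaking.

    old_fields / new_fields: dict mapping field name to a type string
    (an assumed representation for this sketch).
    primary_key: list of field names; cursor: field name or None.
    """
    reasons = []

    # 1. The data type of a field from the source changes.
    for name, old_type in old_fields.items():
        if name in new_fields and new_fields[name] != old_type:
            reasons.append(
                f"data type of '{name}' changed from {old_type} to {new_fields[name]}"
            )

    # 2. An existing primary key is removed from the source.
    for name in primary_key:
        if name in old_fields and name not in new_fields:
            reasons.append(f"primary key '{name}' was removed")

    # 3. An existing cursor is removed from the source.
    if cursor is not None and cursor in old_fields and cursor not in new_fields:
        reasons.append(f"cursor '{cursor}' was removed")

    return reasons


old = {"id": "integer", "updated_at": "timestamp", "email": "string"}
new = {"id": "string", "email": "string"}
reasons = find_breaking_changes(old, new, primary_key=["id"], cursor="updated_at")
```

Here `reasons` contains two entries, one for the type change on `id` and one for the removed `updated_at` cursor. An empty list would mean the change is non-breaking, for example when a column is only added.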
|
||
## Review non-breaking schema changes | ||
|
||
|
@@ -29,11 +44,10 @@ To review non-breaking schema changes: | |
|
||
## Fix breaking schema changes | ||
|
||
:::note | ||
|
||
Breaking changes can only occur in the **Cursor** or **Primary key** fields. | ||
|
||
::: | ||
Breaking schema changes occur when: | ||
* The data type of a field from the source changes | ||
* An existing primary key is removed from the source | ||
* An existing cursor is removed from the source | ||
|
||
To review and fix breaking schema changes: | ||
What is the fix is the data type changes?
From our chat I believe we decided to make any data type changes breaking changes. Not sure if that answers your question though - lmk if I missed something!
Ah, I had a typo in my original question. I meant to ask - what should a user do when they encounter a data type change? Is the only option in that case to run a reset? |
||
1. On the [Airbyte Cloud](http://cloud.airbyte.com/) dashboard, click **Connections** and select the connection with breaking changes (indicated by a **red exclamation mark** icon). | ||
|
@mfsiega-airbyte I think I recall this being the case, but am not sure now if this is true. Is it still once every 24 hours or does it check on sync start as well?
Updated to say every 24 hours
I would phrase it as:
Airbyte checks for any changes in your source schema before syncing, at most once every 24 hours.
Great, updated 👍