187: Adding apache hudi support to dbt #210
Merged
24 commits
3b781cb
initial working version
c3d11fe
Rebased and resolve all the merge conflicts.
022edba
Rebased and resolved merge conflicts.
cd22177
Removed hudi dep jar and used the released version via packages option
59b1370
Added insert overwrite unit tests for hudi
b0e45fd
Used unique_key as default value for hudi primaryKey option
10a50ca
Updated changelog.md with this new update.
705a777
Final round of testing and few minor fixes
9616bb0
Fixed lint issues
283c7d1
Fixed the integration tests
8f49b09
Fixed the circle ci env to add hudi packages
a4f0699
Updated hudi spark bundle to use scala 2.11
f521ca9
Fixed Hudi incremental strategy integration tests and other integrati…
7ba9b1b
Fixed the hudi hive sync hms integration test issues
46be053
Added sql HMS config to fix the integration tests.
d9e15a0
Added hudi hive sync mode conf to CI
ca588b2
Set the hms schema verification to false
2d5ba2e
Removed the merge update columns hence its not supported.
4b43b46
Passed the correct hiveconf to the circle ci build script
vingov aab2160
Disabled few incremental tests for spark2 and reverted to spark2 config
vingov ae3bfe3
Added hudi configs to the circle ci build script
vingov 0723de9
Commented out the Hudi integration test until we have the hudi 0.10.0…
202e88a
Fixed the macro which checks the table type.
22a2025
Disabled this model since hudi is not supported in databricks runtime…
@@ -0,0 +1,7 @@
spark.hadoop.datanucleus.autoCreateTables true
spark.hadoop.datanucleus.schema.autoCreateTables true
spark.hadoop.datanucleus.fixedDatastore false
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.jars.packages org.apache.hudi:hudi-spark3-bundle_2.12:0.9.0
spark.sql.extensions org.apache.spark.sql.hudi.HoodieSparkSessionExtension
spark.driver.userClassPathFirst true
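The properties in the hunk above use the whitespace-separated `key value` layout of a Spark properties file. As a rough sketch (not part of this PR), the same settings could be turned into `--conf key=value` arguments for `spark-submit`. The helper names `parse_spark_defaults` and `to_conf_args` are made up for illustration.

```python
# Hypothetical helpers, not part of the PR: read Spark properties text
# (one "key value" pair per line) and build spark-submit --conf arguments.
SPARK_DEFAULTS = """\
spark.hadoop.datanucleus.autoCreateTables true
spark.hadoop.datanucleus.schema.autoCreateTables true
spark.hadoop.datanucleus.fixedDatastore false
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.jars.packages org.apache.hudi:hudi-spark3-bundle_2.12:0.9.0
spark.sql.extensions org.apache.spark.sql.hudi.HoodieSparkSessionExtension
spark.driver.userClassPathFirst true
"""


def parse_spark_defaults(text):
    """Return a dict of property name -> value, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        props[key] = value
    return props


def to_conf_args(props):
    """Flatten the properties into ['--conf', 'key=value', ...]."""
    args = []
    for key, value in props.items():
        args.extend(["--conf", "{}={}".format(key, value)])
    return args


if __name__ == "__main__":
    print(" ".join(to_conf_args(parse_spark_defaults(SPARK_DEFAULTS))))
```

Passing the settings on the command line is only one option; the PR itself keeps them in a properties file.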
19 changes: 19 additions & 0 deletions
tests/integration/incremental_strategies/models_hudi/append.sql
@@ -0,0 +1,19 @@
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'append',
    file_format = 'hudi',
) }}

{% if not is_incremental() %}

select cast(1 as bigint) as id, 'hello' as msg
union all
select cast(2 as bigint) as id, 'goodbye' as msg

{% else %}

select cast(2 as bigint) as id, 'yo' as msg
union all
select cast(3 as bigint) as id, 'anyway' as msg

{% endif %}
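To make the intent of this model concrete, here is a small Python sketch of what an `append` strategy is generally understood to do: new rows are added without touching existing ones. It only simulates the two runs with in-memory lists; it does not call dbt, Spark, or Hudi, and the function name `append` is illustrative.

```python
# Rows produced by the model on the first (non-incremental) run
# and on the second (incremental) run.
FIRST_RUN = [{"id": 1, "msg": "hello"}, {"id": 2, "msg": "goodbye"}]
SECOND_RUN = [{"id": 2, "msg": "yo"}, {"id": 3, "msg": "anyway"}]


def append(existing, new):
    """Simulated append: keep every existing row and add every new row."""
    return list(existing) + list(new)


table = append(FIRST_RUN, SECOND_RUN)
ids = [row["id"] for row in table]
print(ids)  # [1, 2, 2, 3]: id 2 appears twice because nothing is deduplicated
```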
19 changes: 19 additions & 0 deletions
tests/integration/incremental_strategies/models_hudi/insert_overwrite_no_partitions.sql
@@ -0,0 +1,19 @@
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    file_format = 'hudi',
) }}

{% if not is_incremental() %}

select cast(1 as bigint) as id, 'hello' as msg
union all
select cast(2 as bigint) as id, 'goodbye' as msg

{% else %}

select cast(2 as bigint) as id, 'yo' as msg
union all
select cast(3 as bigint) as id, 'anyway' as msg

{% endif %}
20 changes: 20 additions & 0 deletions
tests/integration/incremental_strategies/models_hudi/insert_overwrite_partitions.sql
@@ -0,0 +1,20 @@
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = 'id',
    file_format = 'hudi',
) }}

{% if not is_incremental() %}

select cast(1 as bigint) as id, 'hello' as msg
union all
select cast(2 as bigint) as id, 'goodbye' as msg

{% else %}

select cast(2 as bigint) as id, 'yo' as msg
union all
select cast(3 as bigint) as id, 'anyway' as msg

{% endif %}
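The two `insert_overwrite` models differ only in `partition_by`. The sketch below simulates the difference in plain Python, assuming dynamic partition overwrite (only partitions present in the new data are replaced) when `partition_by` is set, and a full table replacement when it is not. This is an illustration of the general strategy, not a statement about what Hudi does internally; `insert_overwrite` here is a made-up function.

```python
FIRST_RUN = [{"id": 1, "msg": "hello"}, {"id": 2, "msg": "goodbye"}]
SECOND_RUN = [{"id": 2, "msg": "yo"}, {"id": 3, "msg": "anyway"}]


def insert_overwrite(existing, new, partition_by=None):
    """Simulated insert_overwrite over lists of row dicts."""
    if partition_by is None:
        # No partitions: the new rows replace the whole table.
        return list(new)
    # Assumed dynamic overwrite: drop only partitions that the new rows touch.
    touched = {row[partition_by] for row in new}
    kept = [row for row in existing if row[partition_by] not in touched]
    return kept + list(new)


no_partitions = insert_overwrite(FIRST_RUN, SECOND_RUN)
partitioned = insert_overwrite(FIRST_RUN, SECOND_RUN, partition_by="id")

print([r["msg"] for r in no_partitions])  # ['yo', 'anyway']
print([r["msg"] for r in partitioned])    # ['hello', 'yo', 'anyway']
```

Under these assumptions the partitioned model keeps the untouched `id = 1` row, while the unpartitioned model loses it.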
19 changes: 19 additions & 0 deletions
tests/integration/incremental_strategies/models_hudi/merge_no_key.sql
@@ -0,0 +1,19 @@
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'merge',
    file_format = 'hudi',
) }}

{% if not is_incremental() %}

select cast(1 as bigint) as id, 'hello' as msg
union all
select cast(2 as bigint) as id, 'goodbye' as msg

{% else %}

select cast(2 as bigint) as id, 'yo' as msg
union all
select cast(3 as bigint) as id, 'anyway' as msg

{% endif %}
20 changes: 20 additions & 0 deletions
tests/integration/incremental_strategies/models_hudi/merge_unique_key.sql
@@ -0,0 +1,20 @@
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'merge',
    file_format = 'hudi',
    unique_key = 'id',
) }}

{% if not is_incremental() %}

select cast(1 as bigint) as id, 'hello' as msg
union all
select cast(2 as bigint) as id, 'goodbye' as msg

{% else %}

select cast(2 as bigint) as id, 'yo' as msg
union all
select cast(3 as bigint) as id, 'anyway' as msg

{% endif %}
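The `merge_no_key` and `merge_unique_key` models differ only in `unique_key`. A minimal Python simulation of the usual merge semantics follows: with a key, matching rows are replaced and unmatched rows are inserted; without a key, nothing can match, so every new row is inserted. Whether Hudi accepts the no-key case is not something this diff shows, so treat that branch as an assumption. The function `merge` is illustrative only.

```python
FIRST_RUN = [{"id": 1, "msg": "hello"}, {"id": 2, "msg": "goodbye"}]
SECOND_RUN = [{"id": 2, "msg": "yo"}, {"id": 3, "msg": "anyway"}]


def merge(existing, new, unique_key=None):
    """Simulated merge (upsert) over lists of row dicts."""
    if unique_key is None:
        # Assumption: with no key there is nothing to match on, so all rows insert.
        return list(existing) + list(new)
    by_key = {row[unique_key]: dict(row) for row in existing}
    for row in new:
        by_key[row[unique_key]] = dict(row)  # replace on match, insert otherwise
    return list(by_key.values())


with_key = merge(FIRST_RUN, SECOND_RUN, unique_key="id")
without_key = merge(FIRST_RUN, SECOND_RUN)

print([(r["id"], r["msg"]) for r in with_key])
# [(1, 'hello'), (2, 'yo'), (3, 'anyway')]
print(len(without_key))  # 4
```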
22 changes: 22 additions & 0 deletions
tests/integration/incremental_strategies/models_hudi/merge_update_columns.sql
@@ -0,0 +1,22 @@
{{ config(
    materialized = 'incremental',
    incremental_strategy = 'merge',
    file_format = 'hudi',
    unique_key = 'id',
    merge_update_columns = ['msg'],
) }}

{% if not is_incremental() %}

select cast(1 as bigint) as id, 'hello' as msg, 'blue' as color
union all
select cast(2 as bigint) as id, 'goodbye' as msg, 'red' as color

{% else %}

-- msg will be updated, color will be ignored
select cast(2 as bigint) as id, 'yo' as msg, 'green' as color
union all
select cast(3 as bigint) as id, 'anyway' as msg, 'purple' as color

{% endif %}
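The inline comment in this model ("msg will be updated, color will be ignored") describes a partial update. The Python sketch below simulates that intent: on a key match only the listed columns are overwritten, and unmatched rows are inserted whole. Note that the commit list for this PR says merge update columns was removed as unsupported, so this shows what the model is meant to express, not confirmed Hudi behavior. The function name is hypothetical.

```python
FIRST_RUN = [
    {"id": 1, "msg": "hello", "color": "blue"},
    {"id": 2, "msg": "goodbye", "color": "red"},
]
SECOND_RUN = [
    {"id": 2, "msg": "yo", "color": "green"},
    {"id": 3, "msg": "anyway", "color": "purple"},
]


def merge_update_columns(existing, new, unique_key, update_columns):
    """Simulated merge that only overwrites `update_columns` on a key match."""
    by_key = {row[unique_key]: dict(row) for row in existing}
    for row in new:
        key = row[unique_key]
        if key in by_key:
            for col in update_columns:
                by_key[key][col] = row[col]  # other columns keep old values
        else:
            by_key[key] = dict(row)  # unmatched rows are inserted as-is
    return list(by_key.values())


result = merge_update_columns(FIRST_RUN, SECOND_RUN, "id", ["msg"])
for row in result:
    print(row["id"], row["msg"], row["color"])
# 1 hello blue
# 2 yo red
# 3 anyway purple
```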
Neat! Out of curiosity, what's the change coming in v0.10 that will make this sail smoothly?
Spark SQL DML support was added to Apache Hudi recently, in the 0.9.0 release, but it had a few gaps. Those gaps were fixed after 0.9.0 shipped, and the fixes are scheduled for the next release, due in a few weeks.
Specifically, these commits are the ones relevant to making these tests run smoothly.