-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Structs evolution in MapType keys and values #1868
Add support for Structs evolution in MapType keys and values #1868
Conversation
spark/src/main/scala/org/apache/spark/sql/delta/UpdateExpressionsSupport.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice fix! I had some minor comments. From what I can tell, this also fixes schema evolution in arrays nested in maps. Could we have one additional test for this for completeness?
@@ -123,6 +123,56 @@ class MergeIntoSQLSuite extends MergeIntoSuiteBase with MergeIntoNotMatchedBySo | |||
update = "value.a.x = 2" :: "value.a.x = 3" :: Nil, | |||
errorStrs = "There is a conflict from these SET columns" :: Nil) | |||
|
|||
test("Complex Data Type - Map with Struct Values") { | |||
withTable("source") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: merge onto one line
withTable("source", "target") {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both tests in MergeIntoSQLSuite
are removed after I also implemented struct evolution inside of map keys and not just in map values. There actually was not good reason to limit it to map values only.
spark/src/test/scala/org/apache/spark/sql/delta/MergeIntoSQLSuite.scala
Outdated
Show resolved
Hide resolved
spark/src/main/scala/org/apache/spark/sql/delta/UpdateExpressionsSupport.scala
Outdated
Show resolved
Hide resolved
I added a number of tests to cover map keys evolution and two tests to cover arrays nested in maps. |
spark/src/main/scala/org/apache/spark/sql/delta/UpdateExpressionsSupport.scala
Outdated
Show resolved
Hide resolved
// scalastyle:on line.size.limit | ||
|
||
// Struct evolution inside of map keys. | ||
testEvolution("new source column in map struct key")( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use the testNestedStructsEvolution
helper method? It makes the tests super readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, testNestedStructsEvolution
relies on parsing JSON data which is what makes it more readable but JSON requires the keys in maps to be strings which is too limiting for these tests.
spark/src/test/scala/org/apache/spark/sql/delta/MergeIntoSuiteBase.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Description
This PR resolves issue #1641 to allow automatic schema evolution in structs that are inside maps.
Assuming the target and source tables have the following schemas:
target:
id string, map map<int, struct<a: int, b: int>>
source:
id string, map map<int, struct<a: int, b: int, c: int>>
returns an analysis error today:
With this change, the merge command succeeds and the target table schema evolves to include field
c
inside the map value. The same also works for map keys.How was this patch tested?
MergeIntoSuiteBase
andMergeIntoSQLSuite
to cover struct evolution inside of maps values and keys.Does this PR introduce any user-facing changes?
Yes, struct evolution inside of maps now succeeds instead of failing with an analysis error, see previous example.