-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32762][SQL][TEST] Enhance the verification of ExpressionsSchemaSuite to sql-expression-schema.md #29608
Changes from 1 commit
6d36603
5c8e29e
bf0ceea
cbb0fcd
85299b1
5a574c3
6e6489b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -152,7 +152,7 @@ class ExpressionsSchemaSuite extends QueryTest with SharedSparkSession { | |
|
||
val outputSize = outputs.size | ||
val headerSize = header.size | ||
val expectedOutputs: Seq[QueryOutput] = { | ||
val (expectedMissExamples, expectedOutputs): (Array[String], Seq[QueryOutput]) = { | ||
val expectedGoldenOutput = fileToString(resultFile) | ||
val lines = expectedGoldenOutput.split("\n") | ||
val expectedSize = lines.size | ||
|
@@ -161,14 +161,28 @@ class ExpressionsSchemaSuite extends QueryTest with SharedSparkSession { | |
s"Expected $expectedSize blocks in result file but got " + | ||
s"${outputSize + headerSize}. Try regenerate the result files.") | ||
|
||
Seq.tabulate(outputSize) { i => | ||
val numberOfQueries = lines(2).split(":")(1).trim.toInt | ||
val numberOfMissExample = lines(3).split(":")(1).trim.toInt | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. miss -> missed or missing, in most cases in this change There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done ~ |
||
val missExamples = lines(4).split(":")(1).trim.split(",") | ||
val expectedOutputs = Seq.tabulate(outputSize) { i => | ||
val segments = lines(i + headerSize).split('|') | ||
QueryOutput( | ||
className = segments(1).trim, | ||
funcName = segments(2).trim, | ||
sql = segments(3).trim, | ||
schema = segments(4).trim) | ||
} | ||
|
||
// Ensure consistency of the result file. | ||
assert(numberOfQueries == expectedOutputs.size, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think should put these assert on line 163 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @beliefer This assert place here to verify the consistency of the file contents to avoid inconsistency caused by manual modification, and I think line 189 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can try to mock a new function with comments and test this suite. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
You are right, Spark-24884 already trigger this problem cause by incomplete manual modification. With the new function add scene, I think the right way is run this case with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Therefore, if the file is manually modified instead of automatically generated, I think the assertion failure caused by incorrect modification should be expected. Do we need to tell users clearly with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I got it. Thanks! Could you put these assert on line 163, so that it looks clear There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Address 6e6489b reorder the assertions. |
||
s"outputs size: ${expectedOutputs.size} not same as numberOfQueries: $numberOfQueries " + | ||
"record in result file. Try regenerate the result files.") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. regenerate -> regenerating There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok ~ There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done ~ |
||
assert(numberOfMissExample == missExamples.size, | ||
s"miss examples size: ${missExamples.size} not same as " + | ||
s"numberOfMissExample: $numberOfMissExample " + | ||
"record in result file. Try regenerate the result files.") | ||
|
||
(missExamples, expectedOutputs) | ||
} | ||
|
||
// Compare results. | ||
|
@@ -179,5 +193,13 @@ class ExpressionsSchemaSuite extends QueryTest with SharedSparkSession { | |
assert(expected.sql == output.sql, "SQL query did not match") | ||
assert(expected.schema == output.schema, s"Schema did not match for query ${expected.sql}") | ||
} | ||
|
||
// Compare expressions missing examples | ||
assert(expectedMissExamples.length == missingExamples.size, | ||
"The number of missing examples not equals the number of expected missing examples.") | ||
|
||
missingExamples.zip(expectedMissExamples).foreach { case (output, expected) => | ||
assert(expected == output, "Missing example expression not match") | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No big deal, but the types can just be added to the two tuple elements, instead of declaring them separately as a type for the whole tuple
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First changed to
val (expectedMissingExamples: Array[String], expectedOutputs: Seq[QueryOutput])
, but I feel the type declaration here is redundant, so changed toval (expectedMissingExamples, expectedOutputs)
. Is this acceptable?