Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO-NOT-MERGE][WIP][MINOR][SQL][TESTS] show tests in SQLQuerySuite correctly in Jenkins #25923

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ import org.apache.spark.tags.ExtendedSQLTest
*
* Each case is loaded from a file in "spark/sql/core/src/test/resources/sql-tests/inputs".
* Each case has a golden result file in "spark/sql/core/src/test/resources/sql-tests/results".
* Please note that the test name is constructed via actual file name with excluding '.sql' suffix.
*
* To run the entire test suite:
* {{{
Expand All @@ -49,7 +50,7 @@ import org.apache.spark.tags.ExtendedSQLTest
*
* To run a single test file upon change:
* {{{
* build/sbt "~sql/test-only *SQLQueryTestSuite -- -z inline-table.sql"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this because practically it was good since file names can be matched. Now it cannot be for cosmetic reasons.

I added a clue in the PR I pointed out in order to make debugging easier with the Jenkins output in Github. So currently people at least can identify which file was failed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huge win of this workaround is availability of looking into history of individual test. It would provide context which cannot be retrieve from test log, which is really helpful to track down test flakiness.
e.g. https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/111332/testReport/org.apache.spark.sql/SQLQueryTestSuite/typeCoercion_native_mapconcat/history/

Full tests in SQLQueryTestSuite:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/111332/testReport/org.apache.spark.sql/SQLQueryTestSuite/

Full tests in ThriftServerQueryTestSuite (same test can be matched with SQLQueryTestSuite):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/111332/testReport/org.apache.spark.sql.hive.thriftserver/ThriftServerQueryTestSuite/

So I'm afraid I'm not sure it's just only cosmetic issue. This workaround brings actual benefits, though it has pros and cons, unfortunately. I guess someone would be familiar with file path instead of file path excluding ".sql", so please treat this as a proposal and weigh on current state vs "workaround applied" state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this affects existing test way. Can't we find a way to override SBT's XML reporter or to use ScalaTest's XML reporter?

Fixing this in SBT, and upgrading the version should be the standard approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree that's the way to go, but if no one would like to go forward, that's just an ideal approach. (And that doesn't seem easy to fix the sbt issue with ensuring it doesn't break anything.) sbt/sbt#2949 is filed at Feb. 2017, which is over 2 years ago.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. For the current approach in this PR, I am still not sure due to the downside of changing the existing test way.

I don't think many people actually know build/sbt "~sql/test-only *SQLQueryTestSuite -- -z xxx.sql" is based upon its test name and matching it to the exact file name is pretty good one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think many people actually know build/sbt "~sql/test-only *SQLQueryTestSuite -- -z xxx.sql" is based upon its test name

I'm not sure I could agree with this, as that means this feature is blindly used at all which it shouldn't (as it only applies to *SQLQueryTestSuite). This is even documented in Spark website. Please refer "Testing with SBT" in http://spark.apache.org/developer-tools.html

matching it to the exact file name is pretty good one.

Yes I totally agree it is more intuitive and better. So that's "trade-off". Test history of *SQLQueryTestSuite in Jenkins is just unusable as of now, so personally I think it's worth to weigh on both. Not a strong opinion as I commented before - it's just a proposal.

* build/sbt "~sql/test-only *SQLQueryTestSuite -- -z inline-table"
* }}}
*
* To re-generate golden files for entire suite, run:
Expand All @@ -59,7 +60,7 @@ import org.apache.spark.tags.ExtendedSQLTest
*
* To re-generate golden file for a single test, run:
* {{{
* SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/test-only *SQLQueryTestSuite -- -z describe.sql"
* SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/test-only *SQLQueryTestSuite -- -z describe"
* }}}
*
* The format for input files is simple:
Expand Down Expand Up @@ -141,7 +142,7 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession {

/** List of test cases to ignore, in lower cases. */
protected def blackList: Set[String] = Set(
"blacklist.sql" // Do NOT remove this one. It is here to test the blacklist functionality.
"blacklist" // Do NOT remove this one. It is here to test the blacklist functionality.
)

// Create all the test cases.
Expand Down Expand Up @@ -444,11 +445,19 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession {
.replaceAll("\\*\\(\\d+\\) ", "*") // remove the WholeStageCodegen codegenStageIds
}

protected def extractTestCaseName(absolutePath: String): String = {
// Removing last '.sql' is an workaround of sbt bug which removes test name
// prior to the last dot in JUnitXmlReportPlugin.
// Please refer https://github.com/sbt/sbt/issues/2949
absolutePath.stripPrefix(inputFilePath).stripPrefix(File.separator)
.stripSuffix(validFileExtensions)
}

protected def listTestCases(): Seq[TestCase] = {
listFilesRecursively(new File(inputFilePath)).flatMap { file =>
val resultFile = file.getAbsolutePath.replace(inputFilePath, goldenFilePath) + ".out"
val absPath = file.getAbsolutePath
val testCaseName = absPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
val testCaseName = extractTestCaseName(absPath)

if (file.getAbsolutePath.startsWith(
s"$inputFilePath${File.separator}udf${File.separator}pgSQL")) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -84,23 +84,23 @@ class ThriftServerQueryTestSuite extends SQLQueryTestSuite {

/** List of test cases to ignore, in lower cases. */
override def blackList: Set[String] = Set(
"blacklist.sql", // Do NOT remove this one. It is here to test the blacklist functionality.
"blacklist", // Do NOT remove this one. It is here to test the blacklist functionality.
// Missing UDF
"pgSQL/boolean.sql",
"pgSQL/case.sql",
"pgSQL/boolean",
"pgSQL/case",
// SPARK-28624
"date.sql",
"date",
// SPARK-28620
"pgSQL/float4.sql",
"pgSQL/float4",
// SPARK-28636
"decimalArithmeticOperations.sql",
"literals.sql",
"subquery/scalar-subquery/scalar-subquery-predicate.sql",
"subquery/in-subquery/in-limit.sql",
"subquery/in-subquery/in-group-by.sql",
"subquery/in-subquery/simple-in.sql",
"subquery/in-subquery/in-order-by.sql",
"subquery/in-subquery/in-set-operations.sql"
"decimalArithmeticOperations",
"literals",
"subquery/scalar-subquery/scalar-subquery-predicate",
"subquery/in-subquery/in-limit",
"subquery/in-subquery/in-group-by",
"subquery/in-subquery/simple-in",
"subquery/in-subquery/in-order-by",
"subquery/in-subquery/in-set-operations"
)

override def runQueries(
Expand Down Expand Up @@ -234,7 +234,7 @@ class ThriftServerQueryTestSuite extends SQLQueryTestSuite {
listFilesRecursively(new File(inputFilePath)).flatMap { file =>
val resultFile = file.getAbsolutePath.replace(inputFilePath, goldenFilePath) + ".out"
val absPath = file.getAbsolutePath
val testCaseName = absPath.stripPrefix(inputFilePath).stripPrefix(File.separator)
val testCaseName = extractTestCaseName(absPath)

if (file.getAbsolutePath.startsWith(s"$inputFilePath${File.separator}udf")) {
Seq.empty
Expand Down