-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-12971] Fix Hive tests which fail in Hadoop-2.3 SBT build #10884
[SPARK-12971] Fix Hive tests which fail in Hadoop-2.3 SBT build #10884
Conversation
if (sqlContext.tableNames().contains("src")) { | ||
sqlContext.dropTempTable("src") | ||
} | ||
Seq((1, "")).toDF("key", "value").registerTempTable("src") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this suite was first added, I believe that src
referred to a temporary table loaded in TestHive, which is why I used its schema here.
Interestingly, all of the tests in this suite pass even if src
doesn't exist, which suggests to me that the tests aren't strong enough to detect certain classes of bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are testing query positionering errors here. The query is probably deemed invalid before we check that the src
table exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. The src
attributes are also resolved in some tests...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about the following tests:
positionTest("unresolved attribute 1",
"SELECT x FROM src", "x")
and
positionTest("unresolved attribute 3",
"SELECT key, x FROM src", "x")
and
positionTest("unresolved attribute 4",
"""SELECT key,
|x FROM src
""".stripMargin, "x")
In the second and third tests, it looks like the assert is trying to check that "x" is used when positioning the error message and not "key". Therefore, I think it matters that the table has a column named "key" in order for this test to be meaningful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an experiment, I just tried adding a new test which drops the src
table, then runs
positionTest("unresolved attribute new test",
"""SELECT x,
|key FROM src
""".stripMargin, "key")
and it looks like this actually does fail with
[info] - unresolved attribute 4 *** FAILED *** (47 milliseconds)
[info] 1 did not equal 2 wrong line (ErrorPositionSuite.scala:153)
I guess that the attribute resolution order doesn't relate to the order in which the attributes appear in the query: I think x
is always resolved before key
, which is why this test always passes irrespective of whether key
exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, this limitation in the existing test is out-of-scope for the purposes of this patch, which is just to fix Jenkins so my test failure notifications stop blowing up with notifications about this suite (I get a chat message for every test failure).
Test build #49942 has finished for PR 10884 at commit
|
Jenkins retest this please |
(Retesting to see whether this also addresses a SBT Hadoop 2.4 failure, which I suspect has the same root cause) |
jenkins retest this please |
Test build #49946 has finished for PR 10884 at commit
|
Test build #49947 has finished for PR 10884 at commit
|
ErrorPositionSuite and one of the HiveComparisonTest tests have been consistently failing on the Hadoop 2.3 SBT build (but on no other builds). I believe that this is due to test isolation issues (e.g. tests sharing state via the sets of temporary tables that are registered to TestHive).
This patch attempts to improve the isolation of these tests in order to address this issue.