-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-3529] [SQL] Delete the temp files after test exit #2393
Conversation
QA tests have started for PR 2393 at commit
|
QA tests have finished for PR 2393 at commit
|
extends TestHiveContext(new SparkContext("local[2]", "TestSQLContext", new SparkConf())) | ||
extends TestHiveContext(new SparkContext("local[2]", "TestSQLContext", new SparkConf())) { | ||
|
||
Signal.handle(new Signal("INT"), new SignalHandler() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes, this seems a whole lot more heavy handed than just implementing test lifecycle methods with annotations. Elsewhere in the test framework, temp files are reliably delete by:
- Invoking the standard method to get a temp dir
- ... which calls
deleteOnExit()
- ... which also cleans up the declared test dir in an annotated cleanup method
I would really avoid use of Signal
! It does not seem required and is inconsistent with other tests.
Thank you @srowen I think you're right, we should provide a mechanism to delete all of temp files in the test framework, not just for SQL on exit. I will investigate that.
|
@chenghao-intel For example, in
This makes a temp dir in the right place, asks the JVM to delete it on shutdown, but also tries to delete it explicitly after the test finishes. (Looks like I was not sure if The risk with |
+1 @srowen, using signal is too heavy handed i'm skeptical that the jvm can guarantee dir removal on failure (say kill -9 or a jvm segv). those cases are hopefully less frequent than the current leaking of tmp files. |
Thank you all, I've removed the |
+1 lgtm fyi, i checked, deleteOnExit isn't an option because it cannot recursively delete |
@@ -69,11 +82,22 @@ class TestHiveContext(sc: SparkContext) extends HiveContext(sc) { | |||
setConf("javax.jdo.option.ConnectionURL", | |||
s"jdbc:derby:;databaseName=$metastorePath;create=true") | |||
setConf("hive.metastore.warehouse.dir", warehousePath) | |||
HiveInterruptUtils.add(new HiveInterruptCallback() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really what InterruptUtils
is for? It looks likes its for interrupting a running query. Why aren't these just part of the shutdown hook?
@chenghao-intel any thoughts on avoiding the use of InterruptUtils? I'd like to avoid relying on more Hive APIs that we have to. |
e790423
to
63cffd5
Compare
Sorry for late reply. Thank you @marmbrus , it was quite error-pone by using the |
QA tests have started for PR 2393 at commit
|
Test FAILed. |
Sorry for the late reply as well, but -1 since this introduces again Commons IO Why a new callback trait instead of just a |
QA tests have finished for PR 2393 at commit
|
Test PASSed. |
ba1797a
to
3a6511f
Compare
@srowen thank you very much, this is quite informative. |
Perhaps you can debug a bit first to see if the shutdown hook is called and it attempts to delete the dir? is there are error while deleting it? this mechanism appears to work for unit test temp files. |
Test FAILed. |
Actually, I searched the code, seems we haven't set the shutdown hook for |
Ah, right. This is the shutdown hook: ... but it requires you to have used I have an outstanding PR that improves a few things here, including, just directly deleting everything seen by If that PR looks OK then, it would be good to get it committed, since it would make @chenghao-intel 's simpler solution here work. |
|
I'm not super tied to the temp dirs with prefix/suffix (though that is kind of nice). I'm find changing that code to use whatever standard mechanisms the rest of spark uses. |
Also I merged #2670. |
Thank you @marmbrus , you're right! So, I think this is ready to be merged too. |
QA tests have started for PR 2393 at commit
|
QA tests have finished for PR 2393 at commit
|
Thanks! Merged to master. |
There are lots of temporal files created by TestHive under the /tmp by default, which may cause potential performance issue for testing. This PR will automatically delete them after test exit.