
[SPARK-3529] [SQL] Delete the temp files after test exit #2393

Closed

Conversation

chenghao-intel
Contributor

TestHive creates lots of temporary files under /tmp by default, which may cause performance issues for testing. This PR deletes them automatically on test exit.

@SparkQA

SparkQA commented Sep 15, 2014

QA tests have started for PR 2393 at commit 4ecc9d4.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Sep 15, 2014

QA tests have finished for PR 2393 at commit 4ecc9d4.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

extends TestHiveContext(new SparkContext("local[2]", "TestSQLContext", new SparkConf())) {

Signal.handle(new Signal("INT"), new SignalHandler() {
Member


Yikes, this seems a whole lot more heavy-handed than just implementing test lifecycle methods with annotations. Elsewhere in the test framework, temp files are reliably deleted by:

  • Invoking the standard method to get a temp dir
  • ... which calls deleteOnExit()
  • ... which also cleans up the declared test dir in an annotated cleanup method

I would really avoid use of Signal! It does not seem required and is inconsistent with other tests.

@chenghao-intel
Contributor Author

Thank you @srowen, I think you're right: we should provide a mechanism in the test framework to delete all temp files on exit, not just for SQL. I will investigate that.

The Signal here is for handling a manual kill of the test via Ctrl-C. Sorry, I'm not sure what you mean by "inconsistent with other tests".

@srowen
Member

srowen commented Sep 16, 2014

@chenghao-intel For example, in FileServerSuite:

  override def beforeAll() {
    super.beforeAll()
    tmpDir = Files.createTempDir()
    tmpDir.deleteOnExit()
    ...
  }
  ...

  override def afterAll() {
    super.afterAll()
    Utils.deleteRecursively(tmpDir)
  }

This makes a temp dir in the right place, asks the JVM to delete it on shutdown, but also tries to delete it explicitly after the test finishes.

(Looks like FileUtils.deleteDirectory duplicates Utils.deleteRecursively?)

I was not sure if deleteOnExit() works in the case of Ctrl-C, but a quick test here suggests it does. It won't work in the case of hard failures, but nothing will really.

The risk with Signal is that by trapping Ctrl-C (and this requires a non-public Sun API anyway) you become unkillable if the handler blocks for some reason.
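The shutdown-hook approach described above can be sketched as follows (a minimal illustration with made-up names, not Spark's actual API): a JVM shutdown hook runs on normal exit and on Ctrl-C without trapping the signal itself, so the process stays killable.

```scala
import java.io.File

// Sketch only: object and method names are illustrative.
object ShutdownCleanupSketch {
  // Build the cleanup hook separately so it can also be exercised directly.
  def cleanupHook(dir: File): Thread = new Thread {
    override def run(): Unit = {
      // Best-effort: runs on normal exit and SIGINT, but not on kill -9.
      if (dir.exists()) dir.delete()
    }
  }

  // Registering a JVM shutdown hook covers Ctrl-C without installing a
  // signal handler, unlike the sun.misc.Signal approach in this PR.
  def registerCleanup(dir: File): Unit =
    Runtime.getRuntime.addShutdownHook(cleanupHook(dir))
}
```

Unlike a Signal handler, a shutdown hook cannot block the termination sequence indefinitely without the JVM still being killable by a harder signal.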

@mattf
Contributor

mattf commented Sep 16, 2014

+1 @srowen, using Signal is too heavy-handed

i'm skeptical that the jvm can guarantee dir removal on failure (say kill -9 or a jvm segv). those cases are hopefully less frequent than the current leaking of tmp files.

@liancheng
Contributor

+1 for the deleteOnExit/deleteRecursively pattern.

@mattf According to its Javadoc, deleteOnExit only works for normal JVM termination.

@chenghao-intel
Contributor Author

Thank you all. I've removed the Signal handler and used Utils.deleteRecursively instead.

@mattf
Contributor

mattf commented Sep 19, 2014

+1 lgtm

fyi, i checked, deleteOnExit isn't an option because it cannot recursively delete
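The limitation noted above — File.delete (and hence deleteOnExit) only removes empty directories — is why a recursive helper is needed. A minimal sketch in the spirit of Utils.deleteRecursively (illustrative, not Spark's actual implementation):

```scala
import java.io.File

object RecursiveDeleteSketch {
  // Delete children first, then the directory itself, since File.delete
  // fails on non-empty directories. listFiles may return null (e.g. for
  // a plain file), hence the Option wrapper.
  def deleteRecursively(f: File): Unit = {
    if (f.isDirectory) {
      Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }
}
```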

@@ -69,11 +82,22 @@ class TestHiveContext(sc: SparkContext) extends HiveContext(sc) {
setConf("javax.jdo.option.ConnectionURL",
s"jdbc:derby:;databaseName=$metastorePath;create=true")
setConf("hive.metastore.warehouse.dir", warehousePath)
HiveInterruptUtils.add(new HiveInterruptCallback() {
Contributor


Is this really what InterruptUtils is for? It looks like it's for interrupting a running query. Why aren't these just part of the shutdown hook?

@marmbrus
Contributor

@chenghao-intel any thoughts on avoiding the use of InterruptUtils? I'd like to avoid relying on more Hive APIs than we have to.

@chenghao-intel
Contributor Author

Sorry for the late reply. Thank you @marmbrus; using HiveInterruptUtils was quite error-prone. I've updated the code with our own callback function.

@SparkQA

SparkQA commented Oct 9, 2014

QA tests have started for PR 2393 at commit ba1797a.

  • This patch merges cleanly.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21521/

@srowen
Member

srowen commented Oct 9, 2014

Sorry for the late reply as well, but -1, since this reintroduces Commons IO FileUtils, which broke the build recently. Use Utils.deleteRecursively.

Why a new callback trait instead of just a Runnable or Callable?
Also, Utils already has a mechanism for deleting temp dirs at shutdown. It would be good to avoid adding another shutdown hook.
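The idea of one shared mechanism instead of per-suite hooks can be sketched like this (illustrative names only, not Spark's actual Utils API): a single registry of paths plus one shutdown hook that drains it.

```scala
import java.io.File
import scala.collection.mutable

// Sketch only: one shared registry plus a single shutdown hook,
// instead of each test suite installing its own hook.
object ShutdownDeleteRegistrySketch {
  private val paths = mutable.HashSet[File]()

  def register(dir: File): Unit = synchronized { paths += dir }

  // Public so it can also be invoked directly; the hook below calls it on exit.
  def deleteAll(): Unit = synchronized {
    paths.foreach(deleteRecursively)
    paths.clear()
  }

  private def deleteRecursively(f: File): Unit = {
    if (f.isDirectory) {
      Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }

  // Installed once, when the object is first initialized.
  Runtime.getRuntime.addShutdownHook(new Thread {
    override def run(): Unit = deleteAll()
  })
}
```

A single hook avoids unbounded growth of the JVM's shutdown-hook set and makes deletion order and error handling easier to reason about.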

@SparkQA

SparkQA commented Oct 9, 2014

QA tests have finished for PR 2393 at commit ba1797a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21522/

@chenghao-intel
Contributor Author

@srowen thank you very much, this is quite informative.
I've updated the code with Utils.registerShutdownDeleteDir, and the code is very clean now! However, the temp directory is still there when the test exits; it seems the registration doesn't take effect as intended. Should I file another JIRA issue for this?

@srowen
Member

srowen commented Oct 9, 2014

Perhaps you can debug a bit first to see whether the shutdown hook is called and whether it attempts to delete the dir? Is there an error while deleting it? This mechanism appears to work for unit test temp files.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21526/

@chenghao-intel
Contributor Author

Actually, I searched the code, and it seems we haven't set the shutdown hook for Utils. Do you mean I have to do that myself? I was thinking it should be set somewhere in Spark core.

@srowen
Member

srowen commented Oct 9, 2014

Ah, right. This is the shutdown hook:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L257

... but it requires you to have used createTempDir(). Is this a temp dir?

I have an outstanding PR that improves a few things here, including, just directly deleting everything seen by registerShutdownDeleteDir in a single shutdown hook:
#2670

If that PR looks OK, it would be good to get it committed, since it would make @chenghao-intel's simpler solution here work.

@chenghao-intel
Contributor Author

TestHive creates temporary files/directories with a specified prefix/suffix; it would be great if Utils supported that as well, then the code would be even cleaner. For now I prefer to wait for #2670 to be merged.

@marmbrus
Contributor

I'm not super tied to the temp dirs with prefix/suffix (though that is kind of nice). I'm fine with changing that code to use whatever standard mechanisms the rest of Spark uses.

@marmbrus
Contributor

Also I merged #2670.

@chenghao-intel
Contributor Author

Thank you @marmbrus, you're right!
I've just rebased onto the latest master, and the temp directory is now deleted after the unit tests exit locally.

So, I think this is ready to be merged too.

@SparkQA

SparkQA commented Oct 10, 2014

QA tests have started for PR 2393 at commit 3a6511f.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 10, 2014

QA tests have finished for PR 2393 at commit 3a6511f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

Thanks! Merged to master.

@asfgit asfgit closed this in d3cdf91 Oct 13, 2014
@chenghao-intel chenghao-intel deleted the delete_temp_on_exit branch December 10, 2014 02:42