Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-2098] All Spark processes should support spark-defaults.conf, config file #2379

Closed
wants to merge 3 commits into from

Conversation

witgo
Copy link
Contributor

@witgo witgo commented Sep 13, 2014

This is another implementation about #1256
cc @andrewor14 @vanzin

@SparkQA
Copy link

SparkQA commented Sep 13, 2014

QA tests have started for PR 2379 at commit 217280a.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 13, 2014

QA tests have finished for PR 2379 at commit 217280a.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class JavaSparkContext(val sc: SparkContext)
    • class JavaStreamingContext(val ssc: StreamingContext) extends Closeable

@vanzin
Copy link
Contributor

vanzin commented Sep 16, 2014

I think I have an opposite view from Andrew in that I dislike using sys.props as an IPC mechanism, but other than that, looks good.

initSecurity()
new HistoryServerArguments(conf, argStrings)
val arg = new HistoryServerArguments(argStrings)
Option(arg.propertiesFile).foreach { file =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block is repeated 3 times, maybe add it to Utils? Also, wonder if this would look better:

Option(arg.propertiesFile)
  .map { f => Utils.getPropertiesFromFile(f) }
  .foreach { props =>
    props.filter { case (k, v) => k.startsWith("spark.") }
      .foreach { case (k, v) => sys.props.getOrElseUpdate(k, v) }
  }

@andrewor14
Copy link
Contributor

@vanzin @witgo Actually my intention was not to put everything you load from the properties file into sys.props. This is what SparkSubmit does, but only because it does not have access to the conf that the application will use. For Worker, Master, HistoryServer, however, this is not true; we create a SparkConf from the beginning and we will use that for the entire process. The way we set the properties through sys.props right now makes the initialization ordering of SparkConf and the respective *Argument class sensitive. Instead, it might be simpler if we do conf.set for those places instead (we already do this elsewhere in HistoryServerArgument).

@vanzin
Copy link
Contributor

vanzin commented Sep 16, 2014

@andrewor14 makes sense. Yeah, that would simplify the change a lot.

@vanzin
Copy link
Contributor

vanzin commented Sep 16, 2014

Just to make sure, a blind conf.set would override anything the use has set using system properties, which is probably not what we want. Probably beter to use conf.setIfMissing.

@andrewor14
Copy link
Contributor

Yes thanks you would want to use that one.

@witgo
Copy link
Contributor Author

witgo commented Sep 17, 2014

@andrewor14 @vanzin This is a very good idea, we create a SparkConf from the beginning and we will use that for the entire process. But it need to modify a lot of code ,Maybe we need to create a separate jira

@andrewor14
Copy link
Contributor

@witgo Not sure if I understand what you mean. The changes I am proposing actually changes less code (compared to Spark master), not more. Right now a problem in this PR is that the ordering between new SparkConf and new WorkerArgument is changed, but I'm saying it should not be. This is solving the same issue, so I don't see why we should create a new JIRA.

@@ -25,7 +25,7 @@ import org.apache.spark.SparkConf
/**
* Command-line parser for the worker.
*/
private[spark] class WorkerArguments(args: Array[String], conf: SparkConf) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned elsewhere, we shouldn't remove SparkConf as an argument.

@andrewor14
Copy link
Contributor

@witgo I have added a few more in-line comments to explain what I mean. Let me know if you have any more questions.

@SparkQA
Copy link

SparkQA commented Sep 17, 2014

QA tests have started for PR 2379 at commit dae0120.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 17, 2014

QA tests have finished for PR 2379 at commit dae0120.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@witgo
Copy link
Contributor Author

witgo commented Sep 18, 2014

OK, the code has been updated.

@SparkQA
Copy link

SparkQA commented Sep 18, 2014

QA tests have started for PR 2379 at commit c1824d2.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 18, 2014

QA tests have finished for PR 2379 at commit c1824d2.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 18, 2014

QA tests have started for PR 2379 at commit c4a77e9.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 7, 2014

QA tests have started for PR 2379 at commit 80b0b12.

  • This patch merges cleanly.

def getDefaultPropertiesFile(): String = {
sys.env.get("SPARK_CONF_DIR")
.orElse(sys.env.get("SPARK_HOME").map { t => s"$t${File.separator}conf"})
.map { t => new File(s"$t${File.separator}spark-defaults.conf")}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: space before }

@vanzin
Copy link
Contributor

vanzin commented Oct 7, 2014

LGTM, just a couple of minor things.

@SparkQA
Copy link

SparkQA commented Oct 7, 2014

QA tests have finished for PR 2379 at commit 80b0b12.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21404/Test FAILed.

@andrewor14
Copy link
Contributor

Looks like there is a legitimate test failure though, can you resolve this @witgo?

@andrewor14
Copy link
Contributor

I just tested this on the Worker, Master and the HistoryServer and it works as expected. Awesome.

@SparkQA
Copy link

SparkQA commented Oct 8, 2014

QA tests have started for PR 2379 at commit e1bb650.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 8, 2014

QA tests have finished for PR 2379 at commit e1bb650.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21441/Test PASSed.

@witgo
Copy link
Contributor Author

witgo commented Oct 8, 2014

@andrewor14 The code has been updated.

@andrewor14
Copy link
Contributor

Thanks, I will look at this shortly. A minor request though: in the future could you keep around all the commits in your branch? It would help reviewing if I could see what lines have been changed in a particular commit that was pushed after my comments, for example.

@andrewor14
Copy link
Contributor

Hey sorry @witgo there are merge conflicts now. I'll test this again and merge this once you update it.

@SparkQA
Copy link

SparkQA commented Oct 15, 2014

QA tests have started for PR 2379 at commit 4ef1cbd.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 15, 2014

QA tests have finished for PR 2379 at commit 4ef1cbd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21759/
Test PASSed.

@witgo
Copy link
Contributor Author

witgo commented Oct 15, 2014

@andrewor14 The code has been updated.

@andrewor14
Copy link
Contributor

Alright, I'm merging this into master. Thanks @witgo!

@asfgit asfgit closed this in 293a0b5 Oct 15, 2014
@witgo witgo deleted the SPARK-2098-new branch October 15, 2014 05:22
@witgo
Copy link
Contributor Author

witgo commented Oct 15, 2014

Cool! Thanks @andrewor14 @vanzin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants