
[SPARK-4109][CORE] Correctly deserialize Task.stageId #2971

Closed · wants to merge 1 commit

Conversation

luluorta (Contributor)

The two subclasses of Task, ShuffleMapTask and ResultTask, do not correctly deserialize stageId. As a result, TaskContext.stageId always returns zero to the user.
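The failure mode can be sketched as follows. This is a simplified, hypothetical reconstruction (class shapes only approximate the Spark 1.x code, and `BuggyResultTask` is an invented name): an Externalizable Task subclass whose readExternal never assigns the value back to the parent's stageId leaves the field at the no-arg-constructor default of zero.

```scala
import java.io._

// Hypothetical sketch of the bug, not the exact Spark source.
abstract class Task(var stageId: Int) extends Externalizable {
  override def writeExternal(out: ObjectOutput): Unit = out.writeInt(stageId)
}

class BuggyResultTask(stageId: Int) extends Task(stageId) {
  def this() = this(0)  // no-arg constructor required by Externalizable
  override def readExternal(in: ObjectInput): Unit = {
    in.readInt()  // value is read off the wire but never assigned back,
                  // so stageId stays 0 after deserialization.
    // The fix in this PR: rename the constructor parameter to _stageId
    // and assign `stageId = in.readInt()` here.
  }
}

object RoundTripDemo extends App {
  val buf = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(buf)
  oos.writeObject(new BuggyResultTask(5))
  oos.close()
  val ois  = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
  val task = ois.readObject().asInstanceOf[BuggyResultTask]
  println(task.stageId)  // prints 0, not 5
}
```

Serializing a task built with stageId = 5 and reading it back yields stageId == 0, which is exactly the zero value users observed through TaskContext.stageId.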

@luluorta luluorta changed the title [SPARK-4109] Correctly deserialize Task.stageId [SPARK-4109][CORE] Correctly deserialize Task.stageId Oct 28, 2014
@AmplabJenkins

Can one of the admins verify this patch?

@@ -89,13 +89,13 @@ private[spark] object ResultTask {
  * input RDD's partitions).
  */
 private[spark] class ResultTask[T, U](
-    stageId: Int,
+    _stageId: Int,
Member
Question: does this need to be renamed? It's just a constructor parameter. Doesn't the later assignment to stageId already set the parent value?

Contributor

You don't actually need to rename this; Scala will handle it.

Contributor

Renaming is better; otherwise, simply by accessing stageId you could accidentally upgrade it to a field.

Contributor

Scala is actually smart enough not to do that.

Contributor

Actually hold on - I tested it again and the compiler is not doing the right thing. Let me verify.

Contributor

Take a look at here: https://gist.github.com/rxin/6be132f46b72c27d8f89

I think if the parent constructor declares the field as val or var, then it is fine; otherwise it is not. So technically in this case it should be fine not to rename, but it is better to rename explicitly just to avoid a change in compiler behavior in the future (since this might be a corner case).
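The corner case under discussion can be sketched like this (hypothetical class names, not Spark code): once a method body references a subclass constructor parameter that shares its name with a parent field, the compiler captures the parameter as a separate private field, and the two copies can diverge after construction.

```scala
// Hypothetical minimal example of the shadowing behavior, not Spark code.
class Parent(var id: Int)

class Child(id: Int) extends Parent(id) {
  // Referencing `id` here binds to the constructor parameter, so the
  // compiler retains it as a private field of Child, distinct from
  // Parent's `id`.
  def internalId: Int = id
}

object ShadowDemo extends App {
  val c = new Child(1)
  c.id = 42               // writes Parent's field (as deserialization would)
  println(c.id)           // 42: Parent's field
  println(c.internalId)   // 1: the captured constructor parameter, unchanged
}
```

Renaming the parameter (as this PR does with `_stageId`) removes any possibility of this divergence regardless of compiler behavior.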

@andrewor14
Contributor

add to whitelist

@SparkQA

SparkQA commented Oct 28, 2014

Test build #22370 has started for PR 2971 at commit ff35ee6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 28, 2014

Test build #22370 has finished for PR 2971 at commit ff35ee6.

  • This patch fails some tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22370/
Test FAILed.

@@ -128,7 +128,7 @@ private[spark] class ShuffleMapTask(
}

override def readExternal(in: ObjectInput) {
Contributor

As long as you're modifying this code, mind tossing a Utils.tryOrIOException here so that any errors that occur here are reported properly? See #2932 for explanation / context.

Contributor

I guess I missed this case in my PR.
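The Utils.tryOrIOException helper suggested above might look roughly like the following sketch (assumed behavior, not the exact Spark source): the java.io.Externalizable machinery only propagates IOExceptions cleanly, so anything else thrown inside readExternal gets wrapped.

```scala
import java.io.IOException

// Rough sketch of a tryOrIOException-style helper (assumed behavior, not
// the exact Spark source): let IOExceptions through unchanged, and wrap
// any other Throwable so deserialization failures are reported properly.
object Utils {
  def tryOrIOException[T](block: => T): T = {
    try {
      block
    } catch {
      case e: IOException => throw e
      case t: Throwable   => throw new IOException(t)
    }
  }
}
```

A readExternal body would then be wrapped as, for example, `override def readExternal(in: ObjectInput): Unit = Utils.tryOrIOException { stageId = in.readInt() }`.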

@rxin
Contributor

rxin commented Nov 1, 2014

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 1, 2014

Test build #22689 has started for PR 2971 at commit ff35ee6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 1, 2014

Test build #22689 has finished for PR 2971 at commit ff35ee6.

  • This patch fails some tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22689/
Test FAILed.

@rxin
Contributor

rxin commented Nov 1, 2014

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 1, 2014

Test build #22702 has started for PR 2971 at commit ff35ee6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 1, 2014

Test build #22702 has finished for PR 2971 at commit ff35ee6.

  • This patch fails some tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22702/
Test FAILed.

@rxin
Contributor

rxin commented Nov 2, 2014

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 2, 2014

Test build #22740 has started for PR 2971 at commit ff35ee6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 2, 2014

Test build #22740 has finished for PR 2971 at commit ff35ee6.

  • This patch fails some tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22740/
Test FAILed.

@rxin
Contributor

rxin commented Nov 2, 2014

Jenkins, retest this please.

@SparkQA

SparkQA commented Nov 2, 2014

Test build #22751 has started for PR 2971 at commit ff35ee6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Nov 2, 2014

Test build #22751 has finished for PR 2971 at commit ff35ee6.

  • This patch fails some tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22751/
Test FAILed.

@rxin
Contributor

rxin commented Nov 2, 2014

cc @marmbrus @liancheng any idea why the unit test for Python is failing? Are there some unit tests testing the stageId?

@marmbrus
Contributor

marmbrus commented Nov 2, 2014

Hmmm, not sure what is going on here, especially since other SQL PRs seem to be passing at the moment. @davies any idea?

@davies
Contributor

davies commented Nov 3, 2014

@marmbrus This PR is targeted for 1.0, but PySpark SQL 1.0 cannot pass the tests in Python 2.6 (Pyrolite cannot deserialize array.array from Python 2.6).

We should change the default Python to 2.7 for testing the Spark 1.0 branch.

It's not related to this PR.

@rxin
Contributor

rxin commented Nov 3, 2014

OK, I'm going to merge this and patch it for master as well.

asfgit pushed a commit that referenced this pull request Nov 3, 2014
The two subclasses of Task, ShuffleMapTask and ResultTask, do not correctly deserialize stageId. Therefore, the accessing of TaskContext.stageId always returns zero value to the user.

Author: luluorta <[email protected]>

Closes #2971 from luluorta/fix-task-ser and squashes the following commits:

ff35ee6 [luluorta] correctly deserialize Task.stageId
@rxin
Contributor

rxin commented Nov 3, 2014

Actually this is only relevant for the old branches. In master we no longer do special serialization.

@asfgit asfgit closed this in d6e4c59 Nov 3, 2014
10 participants