[SPARK-26682][SQL] Use taskAttemptID instead of attemptNumber for Hadoop.

## What changes were proposed in this pull request?

Updates the attempt ID used by FileFormatWriter. Task attempt numbers restart for each stage attempt, so tasks in different stage attempts can share the same attempt number and conflict. Spark's task attempt ID is unique within a SparkContext, so using it keeps Hadoop TaskAttemptID instances unique.
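
As a rough illustration of the conflict, the sketch below models two tries of the same partition in different stage attempts. `TaskRun`, `oldKey`, `newKey`, and all ID values are hypothetical and only stand in for the fields that FileFormatWriter reads from the task context; this is not Spark's actual code.

```scala
// Hypothetical model of the identifiers involved; not Spark's real classes.
final case class TaskRun(
    stageId: Int,
    stageAttempt: Int,
    partitionId: Int,
    attemptNumber: Int,
    taskAttemptId: Long)

object AttemptIdCollision {
  // Old scheme: the attempt number is only unique within one stage attempt.
  def oldKey(t: TaskRun): (Int, Int, Int) =
    (t.stageId, t.partitionId, t.attemptNumber)

  // New scheme: the task attempt ID differs for every task attempt.
  def newKey(t: TaskRun): (Int, Int, Long) =
    (t.stageId, t.partitionId, t.taskAttemptId)

  // First try of partition 3 in stage attempt 0, and the first try of the
  // same partition after the stage is re-attempted. Values are made up.
  val first = TaskRun(stageId = 7, stageAttempt = 0, partitionId = 3,
    attemptNumber = 0, taskAttemptId = 41L)
  val retry = TaskRun(stageId = 7, stageAttempt = 1, partitionId = 3,
    attemptNumber = 0, taskAttemptId = 96L)

  def main(args: Array[String]): Unit = {
    println(oldKey(first) == oldKey(retry)) // true: the two tasks collide
    println(newKey(first) == newKey(retry)) // false: the two tasks are distinct
  }
}
```

Under the old key both tasks map to the same identifier even though they are different task attempts; under the new key they do not.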

## How was this patch tested?

Existing tests. Also validated that we no longer detect this failure case in our logs after deployment.

Closes apache#23608 from rdblue/SPARK-26682-fix-hadoop-task-attempt-id.

Authored-by: Ryan Blue <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
rdblue authored and jackylee-ch committed Feb 18, 2019
1 parent d9e0e80 commit 4918369
Showing 1 changed file with 1 addition and 1 deletion.
@@ -170,7 +170,7 @@ object FileFormatWriter extends Logging {
             description = description,
             sparkStageId = taskContext.stageId(),
             sparkPartitionId = taskContext.partitionId(),
-            sparkAttemptNumber = taskContext.attemptNumber(),
+            sparkAttemptNumber = taskContext.taskAttemptId().toInt & Integer.MAX_VALUE,
             committer,
             iterator = iter)
         },
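
The added line narrows a `Long` task attempt ID to a non-negative `Int`. The sketch below applies the same expression to plain `Long` values to show what it does; `AttemptIdMask` and `toHadoopAttemptId` are hypothetical names used only for illustration.

```scala
object AttemptIdMask {
  // Same expression as the added line, applied to a plain Long.
  // toInt keeps the low 32 bits; the mask then clears the sign bit,
  // so the result is never negative.
  def toHadoopAttemptId(taskAttemptId: Long): Int =
    taskAttemptId.toInt & Integer.MAX_VALUE

  def main(args: Array[String]): Unit = {
    println(toHadoopAttemptId(42L))         // 42: small IDs pass through unchanged
    println(toHadoopAttemptId(2147483648L)) // 0: 2^31 sets only the sign bit, which is masked off
    println(toHadoopAttemptId(4294967297L)) // 1: 2^32 + 1 keeps only its low 32 bits
  }
}
```

One consequence of the narrowing is that IDs at or above 2^31 wrap around to values that smaller IDs also map to, so the result is only distinct per task attempt while the task attempt ID stays below that bound.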