[SPARK-25342][CORE][SQL]Support rolling back a result stage and rerunning all result tasks when writing files #37359

Closed
wants to merge 3 commits into from

caican00
Contributor

@caican00 caican00 commented Aug 1, 2022

What changes were proposed in this pull request?

From PR #22112 we learn that Spark currently cannot roll back and rerun a result stage; the job simply fails.

This PR addresses some scenarios of that problem. When the output of a job's result stage is written to a storage system, the target is either a file system or a database system.

  1. If the result is written to a file system, it is stored in a temporary directory until the result stage completes successfully. If a result stage whose map stage is indeterminate fails after committing output for some partitions, we can delete these temporary files and roll back the result stage.
  2. If the result is written to a database system, it is written directly to the database. Therefore, if a result stage whose map stage is indeterminate fails after some result tasks have succeeded, the results already written cannot be rolled back.
  3. Therefore, the main purpose of this PR is to support result stage rollback when writing to any file system.
  4. I added a new flag, isResultStageRetryAllowed, to the RDD class to indicate whether its corresponding result stage supports retries.
    It is a Boolean that defaults to false, meaning that result stage rollback is not supported; this corresponds to writing to a database.
    When writing to a file system, the result stage supports retries, and isResultStageRetryAllowed is set to true.
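
The decision described in the list above can be sketched roughly as follows. This is a simplified, language-neutral model written in Python, not the code in this PR and not Spark's scheduler API. Only the flag name isResultStageRetryAllowed comes from the description; the names `ResultStage`, `Action`, `plan_recovery`, and `roll_back` are hypothetical, and the case where no partition has been committed yet is an assumption about how the scheduler would behave.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    RERUN_ALL = "rerun all result tasks (nothing was committed)"
    ROLLBACK_AND_RERUN_ALL = "delete temporary output, then rerun all result tasks"
    ABORT_JOB = "abort the job"


@dataclass
class ResultStage:
    """Hypothetical stand-in for a result stage with an indeterminate map stage."""
    num_partitions: int
    # Mirrors the flag described in this PR; False models the database case.
    is_result_stage_retry_allowed: bool = False
    # Partitions whose result tasks already committed output.
    committed_partitions: set = field(default_factory=set)


def plan_recovery(stage: ResultStage) -> Action:
    """Decide what to do when the indeterminate map stage has to be rerun."""
    if not stage.committed_partitions:
        # Assumption: with no committed output there is nothing to undo.
        return Action.RERUN_ALL
    if stage.is_result_stage_retry_allowed:
        # File system case: committed output still sits in a temporary directory.
        return Action.ROLLBACK_AND_RERUN_ALL
    # Database case: rows already written cannot be taken back.
    return Action.ABORT_JOB


def roll_back(stage: ResultStage) -> list:
    """Forget committed partitions and return every partition for rerun."""
    stage.committed_partitions.clear()
    return list(range(stage.num_partitions))


file_write = ResultStage(4, is_result_stage_retry_allowed=True, committed_partitions={0, 2})
if plan_recovery(file_write) is Action.ROLLBACK_AND_RERUN_ALL:
    to_rerun = roll_back(file_write)  # all four partitions are scheduled again
```

In this model a database write with partially committed output still aborts, which matches the default value of false described above.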

Does this PR introduce any user-facing change?

No

How was this patch tested?

New tests, plus manual tests of the following writes, each documented with a screenshot:

write to Hive

write to Iceberg

write to HDFS

write to MySQL

@caican00 caican00 changed the title Support rollback result stage [SPARK-25342][CORE][SQL]Support rolling back a result stage and rerunning all result tasks when writing files Aug 1, 2022
@caican00
Contributor Author

caican00 commented Aug 1, 2022

Gentle ping @cloud-fan: could you help review this PR?

@AmplabJenkins

Can one of the admins verify this patch?

@caican00
Contributor Author

caican00 commented Aug 9, 2022

Gentle ping @cloud-fan: could you help review this PR?

@cloud-fan Hi, could you help review this PR? Thanks.

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Nov 18, 2022
@github-actions github-actions bot closed this Nov 19, 2022