-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(core): Handle GCS Response retriable errors #2588
Conversation
Response errors that GCS considers retriable are treated as fatal in our codebase. This causes the code to crash, when it could safely retry and continue it's execution. This PR treats GCS's response errors that are retriable as retriable. Alongside, `is_transient` has been renamed to `is_retriable` to better reflect it's usage. The change will reduce the amount of job crashes and peak execution times seen across prover subsystems (especially in BWGs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change LGTM, but I would avoid changing the established terminology.
If we want to go this way, I'd expect it to be done consistently across the codebase (though not sure if it's worth it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After merge please announce in the chat that you changed transient
to retriable
across the codebase to be consistent with GCP terminology and ask everyone to use the new term from now on.
🤖 I have created a release *beep* *boop* --- ## [24.15.0](core-v24.14.0...core-v24.15.0) (2024-08-07) ### Features * optimize LWG and NWG ([#2512](#2512)) ([0d00650](0d00650)) * Poll the main node API for attestation status - relaxed (BFT-496) ([#2583](#2583)) ([b45aa91](b45aa91)) * **zk_toolbox:** allow to run `zk_inception chain create` non-interactively ([#2579](#2579)) ([555fcf7](555fcf7)) ### Bug Fixes * **core:** Handle GCS Response retriable errors ([#2588](#2588)) ([4b74092](4b74092)) * **node:** respect namespaces configuration ([#2578](#2578)) ([e2d9060](e2d9060)) * **vm-runner:** Fix data race in storage loader ([1810b78](1810b78)) ### Reverts * "feat: Poll the main node for the next batch to sign (BFT-496)" ([#2574](#2574)) ([72d3be8](72d3be8)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). --------- Co-authored-by: zksync-era-bot <[email protected]> Co-authored-by: Roman Brodetski <[email protected]>
Response errors that GCS considers retriable are treated as fatal in our codebase. This causes the code to crash, when it could safely retry and continue it's execution.
This PR treats GCS's response errors that are retriable as retriable. Alongside,
is_transient
has been renamed tois_retriable
to better reflect it's usage.The change will reduce the amount of job crashes and peak execution times seen across prover subsystems (especially in BWGs).