DataprocCreateBatchOperator in deferrable mode doesn't reattach with deferment. #32215
Closed
2 tasks done
Labels
area:providers
kind:bug
This is clearly a bug
provider:google
Google (including GCP) related issues
Apache Airflow version
main (development)
What happened
When a batch_id already exists in the Dataproc API, the DataprocCreateBatchOperator (Google provider) handles it by 'reattaching' to the existing, potentially still running, job.
The current reattachment logic uses the non-deferrable (blocking) method even when the operator is in deferrable mode.
What you think should happen instead
When the operator is in deferrable mode, it should also reattach in deferrable mode, deferring the wait instead of blocking.
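To make the difference concrete, here is a minimal, self-contained sketch of the two control flows. It does not use Airflow or the Google client; `FakeDataprocClient`, `AlreadyExists`, `TaskDeferred`, `execute_as_reported` and `execute_as_requested` are all hypothetical stand-ins invented for illustration, and the real operator code may be structured quite differently.

```python
from dataclasses import dataclass, field


class AlreadyExists(Exception):
    """Stand-in for the API error raised when the batch_id is already taken."""


class TaskDeferred(Exception):
    """Stand-in for the signal that hands the wait over to a trigger."""

    def __init__(self, batch_id):
        super().__init__(batch_id)
        self.batch_id = batch_id


@dataclass
class FakeDataprocClient:
    """Hypothetical in-memory client; counts blocking waits for inspection."""

    existing: set = field(default_factory=set)
    blocking_waits: int = 0

    def create_batch(self, batch_id):
        if batch_id in self.existing:
            raise AlreadyExists(batch_id)
        self.existing.add(batch_id)

    def wait_for_batch(self, batch_id):
        # Simulates a synchronous wait that would occupy a worker slot.
        self.blocking_waits += 1
        return f"{batch_id}: SUCCEEDED"


def execute_as_reported(client, batch_id, deferrable):
    """Behaviour described in this issue: reattach ignores `deferrable`."""
    try:
        client.create_batch(batch_id)
    except AlreadyExists:
        # Reattach path always blocks, even when deferrable=True.
        return client.wait_for_batch(batch_id)
    if deferrable:
        raise TaskDeferred(batch_id)
    return client.wait_for_batch(batch_id)


def execute_as_requested(client, batch_id, deferrable):
    """Requested behaviour: reattach honours `deferrable` as well."""
    try:
        client.create_batch(batch_id)
    except AlreadyExists:
        pass  # Batch already exists, so reattach instead of creating it.
    if deferrable:
        raise TaskDeferred(batch_id)
    return client.wait_for_batch(batch_id)
```

With a client whose `existing` set already contains the batch ID, `execute_as_reported(..., deferrable=True)` returns only after a blocking wait, while `execute_as_requested(..., deferrable=True)` raises `TaskDeferred` without blocking.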
How to reproduce
Create a DAG with a long-running DataprocCreateBatchOperator task, and make the operator deferrable in its constructor.
Restart local Airflow while the batch is running to simulate having to 'reattach' to a running job in Google Cloud Dataproc.
The operator resumes using the running job, but through the code path for the non-deferrable logic.
Operating System
macOS 13.4.1 (22F82)
Versions of Apache Airflow Providers
Current main.
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct