Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigQueryToPostgresOperator with replace=True is not functional #40189

Closed
2 tasks done
olivierdolle opened this issue Jun 12, 2024 · 3 comments
Closed
2 tasks done

BigQueryToPostgresOperator with replace=True is not functional #40189

olivierdolle opened this issue Jun 12, 2024 · 3 comments
Assignees
Labels
area:providers good first issue kind:bug This is a clearly a bug provider:google Google (including GCP) related issues

Comments

@olivierdolle
Copy link

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

I am using 10.14.0
but this is true on the latest at the time of writing this issue : 10.19.0

Apache Airflow version

2.6.3

Operating System

macos (local)

Deployment

Google Cloud Composer

Deployment details

No response

What happened

The Postgres SQLHook method for insertion expects the argument replace_index if the argument replace is True
It is currently not possible to forward it via kwargs and therefore the replace=True cannot be used.

What you think should happen instead

The base class BigQueryToSqlBaseOperator does implement accept the replace argument and it is indeed taken into account until faced with an error from the PostgresHook :

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/google/cloud/transfers/bigquery_to_sql.py", line 130, in execute
    sql_hook.insert_rows(
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/common/sql/hooks/sql.py", line 539, in insert_rows
    sql = self._generate_insert_sql(table, values, target_fields, replace, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/postgres/hooks/postgres.py", line 310, in _generate_insert_sql
    raise ValueError("PostgreSQL ON CONFLICT upsert syntax requires an unique index")
ValueError: PostgreSQL ON CONFLICT upsert syntax requires an unique index

Any attempt to pass the replace_index argument is met with a dag parsing error:
airflow.exceptions.AirflowException: Invalid arguments were passed to BigQueryToPostgresOperator (task_id: transfer_table_vehicle). Invalid arguments were: **kwargs: {'replace_index': 'vh_license_plate'}

How to reproduce

Any task with

BigQueryToSqlBaseOperator(
  replace=True,
  ...
)

Anything else

This issue is 100% reproducible.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@olivierdolle olivierdolle added area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jun 12, 2024
@vatsrahul1001 vatsrahul1001 added provider:google Google (including GCP) related issues and removed needs-triage label for new issues that we didn't triage yet labels Jun 12, 2024
@potiuk
Copy link
Member

potiuk commented Jun 12, 2024

As message says - you need unique index on your table that you want to execute upsert to. This is problem with the tabley you are using as target of your operation, not with the provider. Make sure your table has appropriate index.

@potiuk potiuk closed this as completed Jun 12, 2024
@potiuk potiuk added the invalid label Jun 12, 2024
@potiuk potiuk reopened this Jun 12, 2024
@potiuk
Copy link
Member

potiuk commented Jun 12, 2024

Ah sorry. My mistake - have not read it in detail. Yeah indeed passing the parameter is not done - likely passing them via hook_kwargs or similar might be a solution

@olivierdolle
Copy link
Author

fixed in #40278

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers good first issue kind:bug This is a clearly a bug provider:google Google (including GCP) related issues
Projects
None yet
Development

No branches or pull requests

4 participants