
Destination Bigquery: Scaffolding for destinations v2 #27268

Merged — 187 commits from destination-v2-irl into master on Jun 29, 2023

Conversation

@edgao (Contributor) commented Jun 12, 2023

What

Adds the scaffolding needed to run Destinations V2. None of this affects the normal code path.

See branch for the new spec option that triggers the new behavior.

How

...

Recommended reading order

  1. x.java
  2. y.python

🚨 User Impact 🚨

No impact.
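The description points to the branch for the new spec option that gates the V2 behavior without naming it here. Purely as an illustration of how such a gate typically appears in a connector `spec.json`, a boolean property could look like the following — the property name `use_destinations_v2` is hypothetical, not the actual option from this PR:

```json
{
  "connectionSpecification": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "BigQuery Destination Spec",
    "type": "object",
    "properties": {
      "use_destinations_v2": {
        "type": "boolean",
        "title": "Use Destinations V2 (hypothetical name)",
        "description": "Opt into the new behavior; defaults to false so the normal code path is untouched.",
        "default": false
      }
    }
  }
}
```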

@edgao changed the title from "Destination v2 irl" to "[dnm] Destination v2 irl" on Jun 12, 2023
@github-actions bot commented Jun 12, 2023

Before Merging a Connector Pull Request

Wow! What a great pull request you have here! 🎉

To merge this PR, ensure the following has been done/considered for each connector added or updated:

  • PR name follows PR naming conventions
  • Breaking changes are considered. If a Breaking Change is being introduced, ensure an Airbyte engineer has created a Breaking Change Plan and you've followed all steps in the Breaking Changes Checklist
  • Connector version has been incremented in the Dockerfile and metadata.yaml according to our Semantic Versioning for Connectors guidelines
  • Secrets in the connector's spec are annotated with airbyte_secret
  • All documentation files are up to date. (README.md, bootstrap.md, docs.md, etc...)
  • Changelog updated in docs/integrations/<source or destination>/<name>.md with an entry for the new version. See changelog example
  • You, or an Airbyter, have run /test successfully on this PR - or on a non-forked branch
  • You've updated the connector's metadata.yaml file (new!)

If the checklist is complete but the CI check is failing:

  1. Check for hidden checklists in your PR description

  2. Toggle the github label checklist-action-run on/off to re-run the checklist CI.
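For the version-bump items in the checklist above, the edit is typically a pair of small, matching changes. The snippet below is illustrative only — the path, field layout, and version value follow the conventions the checklist names, and are not taken from this PR:

```yaml
# airbyte-integrations/connectors/destination-bigquery/metadata.yaml (illustrative)
data:
  dockerRepository: airbyte/destination-bigquery
  dockerImageTag: 1.4.4  # bump per Semantic Versioning for Connectors;
                         # the Dockerfile's `LABEL io.airbyte.version` must match,
                         # and the docs changelog gets a row for this version.
```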

@github-actions bot commented Jun 12, 2023

Affected Connector Report

NOTE ⚠️ Changes in this PR affect the following connectors. Make sure to do the following as needed:

  • Run integration tests
  • Bump connector or module version
  • Add changelog
  • Publish the new version

❌ Sources (33)

| Connector | Version | Changelog | Publish |
|---|---|---|---|
| source-alloydb | 2.0.28 | | |
| source-alloydb-strict-encrypt | 2.0.28 | 🔵 (ignored) | 🔵 (ignored) |
| source-azure-blob-storage | 0.1.0 | | |
| source-bigquery | 0.2.3 | | |
| source-clickhouse | 0.1.17 | | |
| source-clickhouse-strict-encrypt | 0.1.17 | 🔵 (ignored) | 🔵 (ignored) |
| source-cockroachdb | 0.1.22 | | |
| source-cockroachdb-strict-encrypt | 0.1.22 | 🔵 (ignored) | 🔵 (ignored) |
| source-db2 | 0.1.20 | | |
| source-db2-strict-encrypt | 0.1.19 (mismatch: 0.1.20) | 🔵 (ignored) | 🔵 (ignored) |
| source-dynamodb | 0.1.2 | | |
| source-e2e-test | 2.1.4 | | |
| source-e2e-test-cloud | 2.1.4 | 🔵 (ignored) | 🔵 (ignored) |
| source-elasticsearch | 0.1.1 | | |
| source-jdbc | 0.3.5 | 🔵 (ignored) | 🔵 (ignored) |
| source-kafka | 0.2.3 | | |
| source-mongodb-strict-encrypt | 0.1.19 | 🔵 (ignored) | 🔵 (ignored) |
| source-mongodb-v2 | 0.1.19 | | |
| source-mssql | 1.0.19 | | |
| source-mssql-strict-encrypt | 1.0.19 | 🔵 (ignored) | 🔵 (ignored) |
| source-mysql | 2.0.25 | | |
| source-mysql-strict-encrypt | 2.0.25 | 🔵 (ignored) | 🔵 (ignored) |
| source-oracle | 0.3.25 | | |
| source-oracle-strict-encrypt | 0.3.24 (mismatch: 0.3.25) | 🔵 (ignored) | 🔵 (ignored) |
| source-postgres | 2.0.34 | | |
| source-postgres-strict-encrypt | 2.0.34 | 🔵 (ignored) | 🔵 (ignored) |
| source-redshift | 0.3.17 | | |
| source-relational-db | 0.3.1 | 🔵 (ignored) | 🔵 (ignored) |
| source-scaffold-java-jdbc | 0.1.0 | 🔵 (ignored) | 🔵 (ignored) |
| source-sftp | 0.1.2 | | |
| source-snowflake | 0.1.36 | | |
| source-teradata | 0.1.0 | | |
| source-tidb | 0.2.5 | | |
  • See "Actionable Items" below for how to resolve warnings and errors.

❌ Destinations (50)

| Connector | Version | Changelog | Publish |
|---|---|---|---|
| destination-azure-blob-storage | 0.2.0 | | |
| destination-bigquery | 1.4.4 | | |
| destination-bigquery-denormalized | 1.4.1 | | |
| destination-cassandra | 0.1.4 | | |
| destination-clickhouse | 0.2.4 (diff seed version) | | |
| destination-clickhouse-strict-encrypt | 0.2.4 | 🔵 (ignored) | 🔵 (ignored) |
| destination-csv | 1.0.0 | | |
| destination-databricks | 1.1.0 | | |
| destination-dev-null | 0.3.0 | 🔵 (ignored) | 🔵 (ignored) |
| destination-doris | 0.1.0 | | |
| destination-dynamodb | 0.1.7 | | |
| destination-e2e-test | 0.3.0 | | |
| destination-elasticsearch | 0.1.6 | | |
| destination-elasticsearch-strict-encrypt | 0.1.6 | 🔵 (ignored) | 🔵 (ignored) |
| destination-exasol | 0.1.1 | | |
| destination-gcs | 0.3.0 | | |
| destination-iceberg | 0.1.0 | | |
| destination-kafka | 0.1.10 | | |
| destination-keen | 0.2.4 | | |
| destination-kinesis | 0.1.5 | | |
| destination-local-json | 0.2.11 | | |
| destination-mariadb-columnstore | 0.1.7 | | |
| destination-mongodb | 0.1.9 | | |
| destination-mongodb-strict-encrypt | 0.1.9 | 🔵 (ignored) | 🔵 (ignored) |
| destination-mqtt | 0.1.3 | | |
| destination-mssql | 0.1.24 | | |
| destination-mssql-strict-encrypt | 0.1.24 | 🔵 (ignored) | 🔵 (ignored) |
| destination-mysql | 0.1.20 | | |
| destination-mysql-strict-encrypt | 0.1.21 (mismatch: 0.1.20) | 🔵 (ignored) | 🔵 (ignored) |
| destination-oracle | 0.1.19 | | |
| destination-oracle-strict-encrypt | 0.1.19 | 🔵 (ignored) | 🔵 (ignored) |
| destination-postgres | 0.3.27 | | |
| destination-postgres-strict-encrypt | 0.3.27 | 🔵 (ignored) | 🔵 (ignored) |
| destination-pubsub | 0.2.0 | | |
| destination-pulsar | 0.1.3 | | |
| destination-r2 | 0.1.0 | | |
| destination-redis | 0.1.4 | | |
| destination-redpanda | 0.1.0 | | |
| destination-redshift | 0.4.8 | | |
| destination-rockset | 0.1.4 | | |
| destination-s3 | 0.4.1 | | |
| destination-s3-glue | 0.1.7 | | |
| destination-scylla | 0.1.3 | | |
| destination-selectdb | 0.1.0 | | |
| destination-snowflake | 1.0.5 | | |
| destination-starburst-galaxy | 0.0.1 | | |
| destination-teradata | 0.1.1 | | |
| destination-tidb | 0.1.3 | | |
| destination-vertica | 0.1.0 | | |
| destination-yugabytedb | 0.1.1 | | |
  • See "Actionable Items" below for how to resolve warnings and errors.

👀 Other Modules (1)

  • base-normalization

Actionable Items


| Category | Status | Actionable Item |
|---|---|---|
| Version | mismatch | The version of the connector is different from its normal variant. Please bump the version of the connector. |
| Version | doc not found | The connector does not seem to have a documentation file. This can be normal (e.g. a basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug. |
| Changelog | doc not found | The connector does not seem to have a documentation file. This can be normal (e.g. a basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug. |
| Changelog | changelog missing | There is no changelog for the current version of the connector. If you are the author of the current version, please add a changelog. |
| Publish | not in seed | The connector is not in the cloud or OSS registry, so its publication status cannot be checked. This can be normal (e.g. some connectors are cloud-specific and only listed in the cloud seed file). Please double-check to make sure that you have added a metadata.yaml file and that the expected registries are enabled. |

@octavia-squidington-iii

This comment was marked as outdated.


@edgao (Contributor, Author) commented Jun 28, 2023

can't repro the test failure locally. rerunning.

for posterity:

#13 736.4 BigQueryDestinationTest > testWriteSuccess(String) > io.airbyte.integrations.destination.bigquery.BigQueryDestinationTest.testWriteSuccess(String)[3] FAILED
#13 736.4     java.lang.RuntimeException: Failed to upload buffer to stage and commit to destination
#13 736.4         at io.airbyte.integrations.destination.bigquery.BigQueryStagingConsumerFactory.lambda$flushBufferFunction$5(BigQueryStagingConsumerFactory.java:281)
#13 736.4         at io.airbyte.integrations.destination.record_buffer.SerializedBufferingStrategy.flushAllBuffers(SerializedBufferingStrategy.java:138)
#13 736.4         at io.airbyte.integrations.destination.buffered_stream_consumer.BufferedStreamConsumer.close(BufferedStreamConsumer.java:292)
#13 736.4         at io.airbyte.integrations.base.FailureTrackingAirbyteMessageConsumer.close(FailureTrackingAirbyteMessageConsumer.java:82)
#13 736.4         at io.airbyte.integrations.destination.bigquery.BigQueryDestinationTest.testWriteSuccess(BigQueryDestinationTest.java:303)
#13 736.4 
#13 736.4         Caused by:
#13 736.4         java.lang.RuntimeException: [JobId{project=dataline-integration-testing, job=29226c1b-f977-4470-94a1-7cdc6b2e5d96, location=US}] Failed to upload staging files to destination table GenericData{classInfo=[datasetId, projectId, tableId], {datasetId=bq_dest_integration_test_eqnllkon, tableId=_airbyte_raw_users}} (bq_dest_integration_test_eqnllkon)
#13 736.4             at io.airbyte.integrations.destination.bigquery.BigQueryGcsOperations.lambda$copyIntoTableFromStage$0(BigQueryGcsOperations.java:156)
#13 736.4             at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
#13 736.4             at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:720)
#13 736.4             at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
#13 736.4             at java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
#13 736.4             at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)
#13 736.4             at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
#13 736.4             at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:686)
#13 736.4             at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
#13 736.4             at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)
#13 736.4             at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
#13 736.4             at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
#13 736.4             at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:765)
#13 736.4             at io.airbyte.integrations.destination.bigquery.BigQueryGcsOperations.copyIntoTableFromStage(BigQueryGcsOperations.java:136)
#13 736.4             at io.airbyte.integrations.destination.bigquery.BigQueryStagingConsumerFactory.lambda$flushBufferFunction$5(BigQueryStagingConsumerFactory.java:276)
#13 736.4             ... 4 more
#13 736.4 
#13 736.4             Caused by:
#13 736.4             com.google.cloud.bigquery.BigQueryException: Error is happened during execution for job: Job{job=JobId{project=dataline-integration-testing, job=29226c1b-f977-4470-94a1-7cdc6b2e5d96, location=US}, status=JobStatus{state=RUNNING, error=null, executionErrors=null}, statistics=LoadStatistics{creationTime=1687973136561, endTime=null, startTime=1687973136725, numChildJobs=null, parentJobId=null, scriptStatistics=null, reservationUsage=null, transactionInfo=null, sessionInfo=null, inputBytes=null, inputFiles=null, outputBytes=null, outputRows=null, badRecords=null}, userEmail=test-bigquery-user@dataline-integration-testing.iam.gserviceaccount.com, etag=1Lm65tObrUIDqz+AWHPGfg==, generatedId=dataline-integration-testing:US.29226c1b-f977-4470-94a1-7cdc6b2e5d96, selfLink=https://www.googleapis.com/bigquery/v2/projects/dataline-integration-testing/jobs/29226c1b-f977-4470-94a1-7cdc6b2e5d96?location=US, configuration=LoadJobConfiguration{type=LOAD, destinationTable=GenericData{classInfo=[datasetId, projectId, tableId], {datasetId=bq_dest_integration_test_eqnllkon, projectId=dataline-integration-testing, tableId=_airbyte_raw_users}}, decimalTargetTypes=null, destinationEncryptionConfiguration=null, createDisposition=null, writeDisposition=WRITE_APPEND, formatOptions=AvroOptions{type=AVRO, useAvroLogicalTypes=null}, nullMarker=null, maxBadRecords=null, schema=Schema{fields=[Field{name=_airbyte_ab_id, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null}, Field{name=_airbyte_emitted_at, type=TIMESTAMP, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null}, Field{name=_airbyte_data, type=STRING, mode=null, description=null, policyTags=null, maxLength=null, scale=null, precision=null, defaultValueExpression=null, collation=null}]}, ignoreUnknownValue=null, 
sourceUris=[gs://airbyte-integration-test-destination-gcs/test_path/bq_dest_integration_test_eqnllkon_users/2023/06/28/17/80405056-c473-42cf-adbf-77b526163f75/0.avro], schemaUpdateOptions=null, autodetect=null, timePartitioning=null, clustering=null, useAvroLogicalTypes=true, labels=null, jobTimeoutMs=null, rangePartitioning=null, hivePartitioningOptions=null, referenceFileSchemaUri=null, connectionProperties=null, createSession=null}}, 
#13 736.4              For more details see Big Query Error collection: BigQueryError{reason=invalid, location=gs://airbyte-integration-test-destination-gcs/test_path/bq_dest_integration_test_eqnllkon_users/2023/06/28/17/80405056-c473-42cf-adbf-77b526163f75/0.avro, message=Error while reading data, error message: Failed to expand table _airbyte_raw_users_bb3b02ea_6452_4aeb_bca3_4c3f98d2eb4c_source with file pattern gs://airbyte-integration-test-destination-gcs/test_path/bq_dest_integration_test_eqnllkon_users/2023/06/28/17/80405056-c473-42cf-adbf-77b526163f75/0.avro: matched no files. File: gs://airbyte-integration-test-destination-gcs/test_path/bq_dest_integration_test_eqnllkon_users/2023/06/28/17/80405056-c473-42cf-adbf-77b526163f75/0.avro}:
#13 736.4                 at app//io.airbyte.integrations.destination.bigquery.BigQueryUtils.waitForJobFinish(BigQueryUtils.java:443)
#13 736.4                 at app//io.airbyte.integrations.destination.bigquery.BigQueryGcsOperations.lambda$copyIntoTableFromStage$0(BigQueryGcsOperations.java:151)
#13 736.4                 ... 18 more
#13 736.4 
#13 736.4                 Caused by:
#13 736.4                 com.google.cloud.bigquery.BigQueryException: Error while reading data, error message: Failed to expand table _airbyte_raw_users_bb3b02ea_6452_4aeb_bca3_4c3f98d2eb4c_source with file pattern gs://airbyte-integration-test-destination-gcs/test_path/bq_dest_integration_test_eqnllkon_users/2023/06/28/17/80405056-c473-42cf-adbf-77b526163f75/0.avro: matched no files. File: gs://airbyte-integration-test-destination-gcs/test_path/bq_dest_integration_test_eqnllkon_users/2023/06/28/17/80405056-c473-42cf-adbf-77b526163f75/0.avro
#13 736.4                     at app//com.google.cloud.bigquery.Job.reload(Job.java:419)
#13 736.4                     at app//com.google.cloud.bigquery.Job.waitFor(Job.java:252)
#13 736.4                     at app//io.airbyte.integrations.destination.bigquery.BigQueryUtils.waitForJobFinish(BigQueryUtils.java:438)
#13 736.4                     ... 19 more
<snip...>

#13 1865.9 87 tests completed, 1 failed, 2 skipped
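The failure above ("matched no files" when BigQuery tried to load the just-uploaded GCS object) did not reproduce locally, which is characteristic of a transient storage/visibility hiccup. One generic way such a staging-flush step can be hardened is a retry-with-backoff wrapper; the sketch below is a minimal stdlib illustration with hypothetical function names, not the connector's actual code:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying on RuntimeError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except RuntimeError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original failure
            time.sleep(base_delay * (2 ** attempt))

# Simulate a flush that fails once before the staged object becomes visible.
calls = {"n": 0}
def flush_to_destination():
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("matched no files")  # transient: object not yet visible
    return "loaded"

print(with_retries(flush_to_destination, base_delay=0.01))  # → loaded
```

Whether retrying is appropriate depends on the error class; a permanent schema error, for example, should fail fast rather than be retried.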

@octavia-squidington-iii (Collaborator)

destination-gcs test report (commit 5d3ae4f111) - ✅

⏲️ Total pipeline duration: 730 seconds

Step Result
Validate airbyte-integrations/connectors/destination-gcs/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-gcs docker image for platform linux/x86_64
Unit tests
Integration tests

🔗 View the logs here

Please note that tests are only run on PRs that are ready for review; set your PR to draft mode on subsequent commits to avoid flooding the CI engine and upstream services.
You can run the same pipeline locally on this branch with the airbyte-ci tool using the following command:

airbyte-ci connectors --name=destination-gcs test

@octavia-squidington-iii (Collaborator)

destination-bigquery test report (commit 5d3ae4f111) - ✅

⏲️ Total pipeline duration: 1902 seconds

Step Result
Validate airbyte-integrations/connectors/destination-bigquery/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-bigquery docker image for platform linux/x86_64
Unit tests
Build airbyte/normalization:dev
Integration tests

🔗 View the logs here

You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=destination-bigquery test

@edgao enabled auto-merge (squash) on June 28, 2023 at 22:03
@octavia-squidington-iii (Collaborator)

destination-bigquery test report (commit 1b2756beb0) - ✅

⏲️ Total pipeline duration: 1939 seconds

Step Result
Validate airbyte-integrations/connectors/destination-bigquery/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-bigquery docker image for platform linux/x86_64
Unit tests
Build airbyte/normalization:dev
Integration tests

🔗 View the logs here

You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=destination-bigquery test

@octavia-squidington-iii (Collaborator)

destination-gcs test report (commit 1b2756beb0) - ✅

⏲️ Total pipeline duration: 772 seconds

Step Result
Validate airbyte-integrations/connectors/destination-gcs/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-gcs docker image for platform linux/x86_64
Unit tests
Integration tests

🔗 View the logs here

You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=destination-gcs test

@octavia-squidington-iii (Collaborator)

destination-bigquery test report (commit 7288461b70) - ✅

⏲️ Total pipeline duration: 1820 seconds

Step Result
Validate airbyte-integrations/connectors/destination-bigquery/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-bigquery docker image for platform linux/x86_64
Unit tests
Build airbyte/normalization:dev
Integration tests

🔗 View the logs here

You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=destination-bigquery test

@octavia-squidington-iii (Collaborator)

destination-gcs test report (commit 7288461b70) - ✅

⏲️ Total pipeline duration: 657 seconds

Step Result
Validate airbyte-integrations/connectors/destination-gcs/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-gcs docker image for platform linux/x86_64
Unit tests
Integration tests

🔗 View the logs here

You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=destination-gcs test

@octavia-squidington-iii (Collaborator)

destination-gcs test report (commit bef3f895a5) - ✅

⏲️ Total pipeline duration: 701 seconds

Step Result
Validate airbyte-integrations/connectors/destination-gcs/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-gcs docker image for platform linux/x86_64
Unit tests
Integration tests

🔗 View the logs here

You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=destination-gcs test

@octavia-squidington-iii (Collaborator)

destination-bigquery test report (commit bef3f895a5) - ✅

⏲️ Total pipeline duration: 1771 seconds

Step Result
Validate airbyte-integrations/connectors/destination-bigquery/metadata.yaml
Connector version semver check
Connector version increment check
QA checks
Build connector tar
Build destination-bigquery docker image for platform linux/x86_64
Unit tests
Build airbyte/normalization:dev
Integration tests

🔗 View the logs here

You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=destination-bigquery test

@edgao disabled auto-merge on June 28, 2023 at 23:54
@jbfbell merged commit ba3e39b into master on Jun 29, 2023
@jbfbell deleted the destination-v2-irl branch on June 29, 2023 at 15:44
edgao added a commit that referenced this pull request Jun 30, 2023
edgao added a commit that referenced this pull request Jul 1, 2023
…" (#27891)

* Revert "Destination Bigquery: Scaffolding for destinations v2 (#27268)"

This reverts commit ba3e39b.

* bump versions to 1.5.1 everywhere
edgao added a commit that referenced this pull request Jul 3, 2023
octavia-approvington added a commit that referenced this pull request Jul 14, 2023
* Revert "Revert "Destination Bigquery: Scaffolding for destinations v2 (#27268)""

This reverts commit 348c577.

* version bumps+changelog

* Speed up BQ by having 2 queries, and not an OR (#27981)

* 🐛 Destination Bigquery: fix bug in standard inserts for syncs >10K records (#27856)

* only run t+d code if it's enabled

* dockerfile+changelog

* remove changelog entry

* Destinations V2: handle optional fields for `object` and `array` types (#27898)

* catch null schema

* fix null properties

* clean up

* consolidate + add more tests

* try catch

* empty json test

* Automated Commit - Formatting Changes

* remove todo

* destination bigquery: misc updates to 1s1t code (#28057)

* switch to checkedconsumer

* add unit test for buildColumnId

* use flag

* restructure prefix check

* fix build

* more type-parsing fixes (#28100)

* more type-parsing fixes

* handle duplicates

* Automated Commit - Format and Process Resources Changes

* add tests for asColumns

* Automated Commit - Format and Process Resources Changes

* log warnings instead of throwing exception

* better log message

* error level

---------

Co-authored-by: edgao <[email protected]>

* Automated Commit - Formatting Changes

* Improve protocol type parsing (#28126)

* Automated Commit - Formatting Changes

* Change from T&D every 10k records to an increasing time based interval (#28130)

* fifteen minute t&d

* add typing and deduping operation valve for increased intervals of typing and deduping

* Automated Commit - Format and Process Resources Changes

* resolve bizarre merge conflict

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: jbfbell <[email protected]>

* Simplify and speed up CDC delete support [DestinationsV2] (#28029)

* Simplify and speed up CDC delete support [DestinationsV2]

* better QUOTE

* spotbugs?

* recompile dbt image for local arch and use that when building images

* things compile, but tests fail

* tests working-ish

* comment

* fix logic to re-insert deleted records for cursor comparison.

tests pass!

* remove comment

* Skip CDC re-include logic if there are no CDC columns

* stop hardcoding pk (#28092)

* wip

* remove TODOs

---------

Co-authored-by: Edward Gao <[email protected]>

* update method name

* Automated Commit - Formatting Changes

* depend on pinned normalization version

* implement 1s1t DATs for destination-bigquery (#27852)

* intiial implementation

* Automated Commit - Formatting Changes

* add second sync to test

* do concurrent things

* Automated Commit - Formatting Changes

* clarify comment

* minor tweaks

* more stuff

* Automated Commit - Formatting Changes

* minor cleanup

* lots of fixes

* handle sql vs json null better
* verify extra columns
* only check deleted_at if in DEDUP mode and the column exists
* add full refresh append test case

* Automated Commit - Formatting Changes

* add tests for the remaining sync modes

* Automated Commit - Formatting Changes

* readability stuff

* Automated Commit - Formatting Changes

* add test for gcs mode

* remove static fields

* Automated Commit - Formatting Changes

* add more test cases, tweak test scaffold

* cleanup

* Automated Commit - Formatting Changes

* extract recorddiffer

* and use it in the sql generator test

* fix

* comment

* naming+comment

* one more comment

* better assert

* remove unnecessary thing

* one last thing

* Automated Commit - Formatting Changes

* enable concurrent execution on all java integration tests

* add test for default namespace

* Automated Commit - Formatting Changes

* implement a 2-stream test

* Automated Commit - Formatting Changes

* extract methods

* invert jsonNodesNotEquivalent

* Automated Commit - Formatting Changes

* fix conditional

* pull out diffSingleRecord

* Automated Commit - Formatting Changes

* handle nulls correctly

* remove raw-specific handling; break up methods

* Automated Commit - Formatting Changes

---------

Co-authored-by: edgao <[email protected]>
Co-authored-by: octavia-approvington <[email protected]>

* Destinations V2: move create raw tables earlier (#28255)

* move create raw tables

* better log message

* stop building normalization (#28256)

* fix ability to run tests

* disable incremental t+d for now

* Automated Commit - Formatting Changes

---------

Co-authored-by: Evan Tahler <[email protected]>
Co-authored-by: Cynthia Yin <[email protected]>
Co-authored-by: cynthiaxyin <[email protected]>
Co-authored-by: edgao <[email protected]>
Co-authored-by: Joe Bell <[email protected]>
Co-authored-by: jbfbell <[email protected]>
Co-authored-by: octavia-approvington <[email protected]>
efimmatytsin pushed a commit to scentbird/airbyte that referenced this pull request Jul 27, 2023
(Same commit message as the Jul 14, 2023 commit above, with issue references prefixed `airbytehq#`.)
4 participants