Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(warehouse): unrecognized schema in warehouse #2638

Merged
merged 10 commits into from
Nov 3, 2022

Conversation

achettyiitr
Copy link
Member

@achettyiitr achettyiitr commented Nov 2, 2022

Description

The changes revolve around uploadSchema and schemaInWarehouse terminology.

  • uploadScheme It is being prepared by consolidating local schema (copy of warehouse schema) and staging files schema.
  • schemaInWarehouse It is being prepared by querying the warehouse destinations and mapping them with datatypes supported by Rudderstack. During the fetching of the schema from the warehouse destinations, if there are some datatypes that are not supported by RudderStack we skip those.

During the loading of the data, we calculate schema diff between the uploadScheme and schemaInWarehouse to create the necessary tables and columns. There are two things:

  1. Since schemaInWarehouse is skipping the unsupported datatypes, it is resulting in creating columns every time for unsupported datatypes. Errors are being suppressed.
  2. Since we started batching ALTER queries, we stopped suppressing the errors because one or more columns already present will result in other columns not getting altered.

So, we introduced one more parameter called unrecognizedSchema which will be populated during the fetching of the schema from unsupported datatypes. We are using it to ignore columns that are already present in the warehouse.

Notion Ticket

https://www.notion.so/rudderstacks/Warehouse-unrecognized-schema-4029829a49564d5087a2186464b63e63

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

@codecov
Copy link

codecov bot commented Nov 2, 2022

Codecov Report

Base: 43.90% // Head: 43.83% // Decreases project coverage by -0.06% ⚠️

Coverage data is based on head (095d351) compared to base (3659e12).
Patch coverage: 0.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2638      +/-   ##
==========================================
- Coverage   43.90%   43.83%   -0.07%     
==========================================
  Files         187      187              
  Lines       40038    40088      +50     
==========================================
- Hits        17579    17574       -5     
- Misses      21363    21416      +53     
- Partials     1096     1098       +2     
Impacted Files Coverage Δ
warehouse/azure-synapse/azure-synapse.go 0.00% <0.00%> (ø)
warehouse/bigquery/bigquery.go 0.00% <0.00%> (ø)
warehouse/clickhouse/clickhouse.go 0.00% <0.00%> (ø)
warehouse/deltalake/deltalake.go 0.00% <0.00%> (ø)
warehouse/identities.go 1.04% <0.00%> (ø)
warehouse/mssql/mssql.go 0.00% <0.00%> (ø)
warehouse/postgres/postgres.go 0.00% <0.00%> (ø)
warehouse/redshift/redshift.go 0.00% <0.00%> (ø)
warehouse/schema.go 47.82% <0.00%> (ø)
warehouse/snowflake/snowflake.go 0.00% <0.00%> (ø)
... and 13 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@achettyiitr achettyiitr changed the title feat(warehouse): handle unrecognized schema in warehouse feat(warehouse): unrecognized schema in warehouse Nov 2, 2022
Copy link
Contributor

@abhimanyubabbar abhimanyubabbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Simple change in naming. Nothing special. If you feel it looks good handle it otherwise it's okay.

Good Job 🥳 🎉

warehouse/schema.go Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants