Cache the columns that are found in the clickhouse table #228
The Singer SDK `SQLConnector` assumes that `get_table_columns` is a fast operation, because it loops over every column in the schema and calls it twice: once to get the type in `_get_column_type` and once for `column_exists`.
Round-trip requests to ClickHouse to fetch this information are not fast, so tables with a large number of columns can take a very long time to initialize.
This change caches the results of the column lookup so that each table only has to be inspected once. The cache may not be safe if the sink is reused, because of meltano/sdk#2352, but given the simple way target clickhouse uses this connector, the change is safe.
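
For reviewers, here is a minimal, self-contained sketch of the caching idea. It is not the code in this PR and does not import the Singer SDK: `SlowConnector`, `CachingConnector`, `get_column_type`, `round_trips`, and the sample schema are hypothetical stand-ins, and the real `get_table_columns` returns SQLAlchemy column objects rather than type strings.

```python
from __future__ import annotations


class SlowConnector:
    """Hypothetical stand-in: every column lookup costs one round trip."""

    def __init__(self, schema: dict[str, dict[str, str]]) -> None:
        self._schema = schema
        self.round_trips = 0  # counts simulated requests to the database

    def get_table_columns(self, full_table_name: str) -> dict[str, str]:
        # In a real connector this would inspect the table over the network.
        self.round_trips += 1
        return dict(self._schema.get(full_table_name, {}))

    def column_exists(self, full_table_name: str, column_name: str) -> bool:
        return column_name in self.get_table_columns(full_table_name)

    def get_column_type(self, full_table_name: str, column_name: str) -> str:
        return self.get_table_columns(full_table_name)[column_name]


class CachingConnector(SlowConnector):
    """Same interface, but each table is inspected at most once."""

    def __init__(self, schema: dict[str, dict[str, str]]) -> None:
        super().__init__(schema)
        self._table_cols_cache: dict[str, dict[str, str]] = {}

    def get_table_columns(self, full_table_name: str) -> dict[str, str]:
        if full_table_name not in self._table_cols_cache:
            self._table_cols_cache[full_table_name] = super().get_table_columns(
                full_table_name
            )
        # Note: the cache is never invalidated, so it goes stale if the
        # table is altered after the first lookup.
        return self._table_cols_cache[full_table_name]


def initialize(connector: SlowConnector, table: str, columns: list[str]) -> int:
    """Mimic the per-column loop: one existence check and one type lookup."""
    for name in columns:
        if connector.column_exists(table, name):
            connector.get_column_type(table, name)
    return connector.round_trips


schema = {"events": {"id": "UInt64", "name": "String", "ts": "DateTime"}}
columns = list(schema["events"])

uncached_trips = initialize(SlowConnector(schema), "events", columns)
cached_trips = initialize(CachingConnector(schema), "events", columns)
print(uncached_trips, cached_trips)
```

In this sketch the uncached connector makes two simulated round trips per column, while the cached one makes a single round trip per table. Because the cache is keyed by table name and never cleared, it relies on the same assumption as the description above: the table's columns are not changed behind the connector's back during a run.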