feat: clickhouse.JSON Serializer interface
#1491
Merged
Summary
After running benchmarks (#1490), it's clear that the clickhouse.JSON type is the fastest way to append JSON data (excluding strings). Right now the only way to convert a struct to this JSON data is to use reflection magic and walk the struct recursively. A user could also write their own functions to convert their struct to clickhouse.JSON. To make this optimization more apparent, I have added the interfaces clickhouse.JSONSerializer and clickhouse.JSONDeserializer. If you have a custom struct, you can implement these, and the JSON column will use them when reading and writing data.
A helper function, clickhouse.ExtractJSONPathAs[T](jsonObj, path), has also been included for easily reading these paths with a specific type in mind. The user can also do this manually if they choose to.
Other changes:
json.Marshal to confirm it is using the NestedMap func
Example
Example:
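A minimal sketch of what implementing the two interfaces on a custom struct might look like. Everything here is a simplified stand-in defined locally so the snippet runs on its own: the JSON type, its SetValueAtPath/ValueAtPath methods, the interface method names (SerializeClickHouseJSON, DeserializeClickHouseJSON), the return shape of ExtractJSONPathAs, and the Event struct are all assumptions for illustration, not the library's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// JSON is a hypothetical stand-in for clickhouse.JSON:
// a flat map from path to value.
type JSON struct {
	values map[string]any
}

func NewJSON() *JSON { return &JSON{values: make(map[string]any)} }

func (j *JSON) SetValueAtPath(path string, v any) { j.values[path] = v }

func (j *JSON) ValueAtPath(path string) (any, bool) {
	v, ok := j.values[path]
	return v, ok
}

// ExtractJSONPathAs is a stand-in for the helper named in the PR.
// It reports false if the path is missing or holds a different type.
func ExtractJSONPathAs[T any](j *JSON, path string) (T, bool) {
	var zero T
	v, ok := j.ValueAtPath(path)
	if !ok {
		return zero, false
	}
	typed, ok := v.(T)
	if !ok {
		return zero, false
	}
	return typed, true
}

// Assumed shapes of the two interfaces described above.
type JSONSerializer interface {
	SerializeClickHouseJSON() (*JSON, error)
}

type JSONDeserializer interface {
	DeserializeClickHouseJSON(*JSON) error
}

// Event is a hypothetical user struct.
type Event struct {
	Name  string
	Count int64
}

// SerializeClickHouseJSON sets each path explicitly, with no reflection.
func (e *Event) SerializeClickHouseJSON() (*JSON, error) {
	obj := NewJSON()
	obj.SetValueAtPath("name", e.Name)
	obj.SetValueAtPath("count", e.Count)
	return obj, nil
}

// DeserializeClickHouseJSON reads each path back with an expected type.
func (e *Event) DeserializeClickHouseJSON(obj *JSON) error {
	name, ok := ExtractJSONPathAs[string](obj, "name")
	if !ok {
		return errors.New("missing or mistyped path: name")
	}
	count, ok := ExtractJSONPathAs[int64](obj, "count")
	if !ok {
		return errors.New("missing or mistyped path: count")
	}
	e.Name, e.Count = name, count
	return nil
}

// roundTrip serializes an Event and loads it into a fresh one.
func roundTrip(in Event) (Event, error) {
	obj, err := in.SerializeClickHouseJSON()
	if err != nil {
		return Event{}, err
	}
	var out Event
	err = out.DeserializeClickHouseJSON(obj)
	return out, err
}

func main() {
	out, err := roundTrip(Event{Name: "login", Count: 3})
	fmt.Println(out, err)
}
```

In this sketch the struct decides which paths exist and what types they hold, so nothing has to be discovered at runtime.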
Then inside a batch append:
The underlying column implementation will then choose to use the user's clickhouse.JSON instead of building its own from reflection. The same applies to Scan: the test struct will be populated using the implemented interface.
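The choice between the user's implementation and reflection can be pictured as a type assertion with a fallback. The sketch below is an assumption about the general technique, not the column's actual code: the JSON stand-in, the interface method name, the toJSON and reflectToJSON helpers, and the Fast/Slow structs are all hypothetical, and the reflection path only handles flat structs.

```go
package main

import (
	"fmt"
	"reflect"
)

// JSON is a hypothetical stand-in for clickhouse.JSON.
type JSON struct {
	values map[string]any
}

// Assumed shape of the serializer interface.
type JSONSerializer interface {
	SerializeClickHouseJSON() (*JSON, error)
}

// Fast implements the interface; Slow does not.
type Fast struct{ ID int64 }

func (f Fast) SerializeClickHouseJSON() (*JSON, error) {
	return &JSON{values: map[string]any{"id": f.ID}}, nil
}

type Slow struct{ ID int64 }

// reflectToJSON is a simplified fallback: it copies the exported
// top-level fields of a struct using reflection.
func reflectToJSON(v any) (*JSON, error) {
	rv := reflect.ValueOf(v)
	for rv.Kind() == reflect.Ptr {
		if rv.IsNil() {
			return nil, fmt.Errorf("nil pointer of type %T", v)
		}
		rv = rv.Elem()
	}
	if rv.Kind() != reflect.Struct {
		return nil, fmt.Errorf("unsupported type %T", v)
	}
	obj := &JSON{values: make(map[string]any)}
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		field := rt.Field(i)
		if !field.IsExported() {
			continue
		}
		obj.values[field.Name] = rv.Field(i).Interface()
	}
	return obj, nil
}

// toJSON prefers the user's serializer and falls back to reflection.
// The second return value names the path taken, for illustration only.
func toJSON(v any) (*JSON, string, error) {
	if s, ok := v.(JSONSerializer); ok {
		obj, err := s.SerializeClickHouseJSON()
		return obj, "serializer", err
	}
	obj, err := reflectToJSON(v)
	return obj, "reflection", err
}

func main() {
	_, viaFast, _ := toJSON(Fast{ID: 1})
	_, viaSlow, _ := toJSON(Slow{ID: 2})
	fmt.Println(viaFast, viaSlow)
}
```

The same pattern would apply in the read direction, with a check for the deserializer interface before falling back to reflection.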
Performance
By having the user choose exactly how the object is serialized/deserialized, we can save on CPU and memory allocations:
Serialization:
Deserialization: