Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VL] Make ColumnarBatchSerializer supports relocation #2051

Closed
WangGuangxin opened this issue Jun 23, 2023 · 0 comments · Fixed by #2052
Closed

[VL] Make ColumnarBatchSerializer supports relocation #2051

WangGuangxin opened this issue Jun 23, 2023 · 0 comments · Fixed by #2052

Comments

@WangGuangxin
Copy link
Contributor

WangGuangxin commented Jun 23, 2023

Spark supports fetch the contiguous shuffle blocks in batch, which is enabled by default (by conf spark.sql.adaptive.fetchShuffleBlocksInBatch). This feature has a big performance improvement in our production.

However, currently, since ColumnarBatchSerializer's supportsRelocationOfSerializedObjects return false, so that this feature cann't take effect.
In fact, the arrow serialization does support reloation if we don't write schema (which is default to true) and don't write EOS (which is an optional in arrow rpc serialization format)

https://wesm.github.io/arrow-site-test/format/IPC.html#streaming-format
image

https://github.com/apache/arrow/blob/maint-12.0.1/cpp/src/arrow/ipc/message.cc#L496

@zhouyuan zhouyuan changed the title Make ColumnarBatchSerializer supports relocation [VL] Make ColumnarBatchSerializer supports relocation Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant