-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add the proposal about push down plan to storage to execute. #2530
Add the proposal about push down plan to storage to execute. #2530
Conversation
|
||
In storage side: | ||
|
||
The `Edge Computation` will append to the origin plan in storage, e.g. Append push-down plan into `GetNeighbors` origin plan. So storage scan data firstly, then complete the push-down computation and response data finally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the idea comes from coprocessor, as far as I know, both hbase and tidb is the query layer to generate the whole plan for tikv/region server, not only part of plan. The append is hard to control, for now there is only one output node for go
, what if there are more than one (e.g. lookup may have more than one when we lookup where e1.c1 == 1 or e2.c1 == 1)?
IMO, either just use flag like now, or the whole plan should be generate in query engine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I known the tikv coprocessor only support a subset of executor, so maybe it can't push down the whole plan to storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I means a part of the whole plan generated by query layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your idea. This is quite interesting, imagine we generate the whole plan for storage in query layer not only part of them, many optimization will be easier (e.g. filter). Query layer only need to merge/reorder the executor in logic plan.
If we only want to push down a part of plan to storage, there is not a big difference with just do optimize in query layer as long as we support agg/sort/dedup and so on in each storage RPC interface. We could talk offline if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, most TP scenarios do not need to push down the graph side calculation(sql-like) to the storage side, if the index is implemented well (IO will not become the most critical bottleneck). So the point is to enhance the ability of the index rather than just push down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the capacity of current implementation is not complete. Transforming subplan to flags lose information.
No description provided.