Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner, executor: enable inline projection for merge join #15463

Merged
merged 15 commits into from
Apr 9, 2020

Conversation

lance6716
Copy link
Contributor

@lance6716 lance6716 commented Mar 18, 2020

What problem does this PR solve?

Problem Summary:

part of #14428

What is changed and how it works?

What's Changed:

enable inline projection for merge join

How it Works:

markChildrenUsedColsis applied to joiner.

Related changes

Check List

Tests

  • Unit test

Side effects

Release note


benchmark based on BenchmarkMergeJoinExec

two data sources with columns of type {longlong, double, varstring(5*2014)}

case 1

no modify:

BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4         	       1	2407966792 ns/op	58474936 B/op	    4070 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:3000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4           	       2	 514012846 ns/op	58471576 B/op	    4053 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:30,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4             	       2	 709402071 ns/op	62614336 B/op	    4007 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:330000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4         	       1	2381562052 ns/op	58466440 B/op	    4064 allocs/op

case 2

childrenUsedCols (and output schema) changed to:

{true, false, false},
{false, false, false},

significant improvement due to less column copy

BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4         	      10	 104559070 ns/op	  717604 B/op	    3940 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:3000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4           	      67	  18877980 ns/op	  727133 B/op	    3928 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:30,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4             	      54	  23297968 ns/op	 4873998 B/op	    3884 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:330000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4         	      10	 104550756 ns/op	  722180 B/op	    3940 allocs/op

case 3

childrenUsedCols (and output schema) changed to:

{true, false, false},
{false, true, false},

significant improvement, slightly slower than case 2, which is reasonable because output schema has an extra double column

BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4         	      10	 107155546 ns/op	  759140 B/op	    3957 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:3000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4           	      54	  22822921 ns/op	  761917 B/op	    3943 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:30,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4             	      37	  28515056 ns/op	 4910528 B/op	    3899 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:330000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)-4         	      10	 109264842 ns/op	  757220 B/op	    3954 allocs/op

@lance6716 lance6716 requested review from a team as code owners March 18, 2020 09:13
@sre-bot sre-bot added the contribution This PR is from a community contributor. label Mar 18, 2020
@ghost ghost requested review from francis0407, lzmhhh123, qw4990, XuHuaiyu and a team and removed request for a team March 18, 2020 09:13
@github-actions github-actions bot added the sig/execution SIG execution label Mar 18, 2020
@lance6716
Copy link
Contributor Author

/run-all-tests

@lance6716 lance6716 changed the title [WIP]planner, executor: enable inline projection for merge join planner, executor: enable inline projection for merge join Mar 18, 2020
Copy link
Contributor

@SunRunAway SunRunAway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

executor/benchmark_test.go Outdated Show resolved Hide resolved
executor/benchmark_test.go Outdated Show resolved Hide resolved
executor/executor_required_rows_test.go Show resolved Hide resolved
Copy link
Contributor

@SunRunAway SunRunAway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Benchmark data is appreciated.

@SunRunAway SunRunAway added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 29, 2020
@lance6716
Copy link
Contributor Author

/bench +tpch

@SunRunAway
Copy link
Contributor

Hi, @lance6716
You can take a look at BenchmarkMergeJoinExec and try to adjust to fit inline projection situation. This link may help you.

@sre-bot
Copy link
Contributor

sre-bot commented Mar 29, 2020

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: c8fa259b2e26eebc274a6637f1ee5dcc13102177
+++ tidb: 256a54df5a0609a561f438114233fbbe308222fd
tikv: ea0fd4d7688d1d462a94cfe4f7ab4d7cb0b30ab5
pd: a63cf42fe9687c0e580777a01facc4004da69b63
================================================================================
oltp_update_index:
    * QPS: 4346.24 ± 0.40% (std=12.51) delta: 0.29% (p=0.294)
    * Latency p50: 29.45 ± 0.41% (std=0.09) delta: -0.29%
    * Latency p99: 51.94 ± 0.00% (std=0.00) delta: -3.60%
            
oltp_insert:
    * QPS: 6917.50 ± 0.49% (std=23.50) delta: 0.12% (p=0.737)
    * Latency p50: 18.50 ± 0.47% (std=0.06) delta: -0.12%
    * Latency p99: 32.68 ± 2.26% (std=0.49) delta: 1.36%
            
oltp_read_write:
    * QPS: 15004.44 ± 0.30% (std=26.48) delta: 0.55% (p=0.573)
    * Latency p50: 170.97 ± 0.29% (std=0.30) delta: -0.55%
    * Latency p99: 342.80 ± 5.95% (std=13.45) delta: 2.04%
            
oltp_point_select:
    * QPS: 41951.37 ± 0.83% (std=208.27) delta: 1.65% (p=0.005)
    * Latency p50: 3.05 ± 0.90% (std=0.02) delta: -1.69%
    * Latency p99: 10.84 ± 0.00% (std=0.00) delta: 0.00%
            

@lance6716
Copy link
Contributor Author

Hi, @lance6716
You can take a look at BenchmarkMergeJoinExec and try to adjust to fit inline projection situation. This link may help you.

Done

@sre-bot
Copy link
Contributor

sre-bot commented Mar 30, 2020

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 514b1809286a9d5c71f6cf257a40c12750d1f3f7
+++ tidb: 256a54df5a0609a561f438114233fbbe308222fd
tikv: be05858e9b68a4ceb5312cf9816074e199b2e293
pd: a63cf42fe9687c0e580777a01facc4004da69b63
================================================================================
oltp_update_index:
    * QPS: 5991.08 ± 0.58% (std=22.41) delta: 0.72% (p=0.207)
    * Latency p50: 21.37 ± 0.59% (std=0.08) delta: -0.19%
    * Latency p99: 40.73 ± 0.90% (std=0.37) delta: 0.90%
            
oltp_insert:
    * QPS: 4851.73 ± 0.10% (std=3.25) delta: -0.15% (p=0.324)
    * Latency p50: 26.38 ± 0.09% (std=0.02) delta: 0.13%
    * Latency p99: 47.06 ± 2.73% (std=0.95) delta: 5.25%
            
oltp_read_write:
    * QPS: 14958.91 ± 0.89% (std=86.66) delta: 0.03% (p=0.937)
    * Latency p50: 171.02 ± 0.41% (std=0.44) delta: -0.29%
    * Latency p99: 309.15 ± 7.36% (std=13.88) delta: -2.65%
            
oltp_point_select:
    * QPS: 42126.29 ± 0.61% (std=165.79) delta: -0.33% (p=0.314)
    * Latency p50: 3.04 ± 0.74% (std=0.01) delta: 0.14%
    * Latency p99: 10.46 ± 0.00% (std=0.00) delta: -0.90%
            
oltp_update_non_index:
    * QPS: 4746.37 ± 0.20% (std=5.66) delta: 0.04% (p=0.749)
    * Latency p50: 26.96 ± 0.20% (std=0.03) delta: 0.04%
    * Latency p99: 42.24 ± 2.73% (std=0.85) delta: -1.37%
            

@SunRunAway
Copy link
Contributor

Hi, @lance6716
You can take a look at BenchmarkMergeJoinExec and try to adjust to fit inline projection situation. This link may help you.

Done

Hi, @lance6716
Well done.
Case 3 could persist in BenchmarkMergeJoinExec as a sub-bench.

@sre-bot
Copy link
Contributor

sre-bot commented Apr 2, 2020

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 959eca8f3a395169bb5d5938f54fddab4d0750f5
+++ tidb: 256a54df5a0609a561f438114233fbbe308222fd
tikv: 3b11b678830a06e67bca0e68bf2fbe1c336a060e
pd: 1a801edaddbbe73ab17256ac354301c1ead02cd8
================================================================================
oltp_update_index:
    * QPS: 4314.46 ± 0.32% (std=9.25) delta: -0.44% (p=0.329)
    * Latency p50: 29.62 ± 0.22% (std=0.05) delta: 0.30%
    * Latency p99: 54.37 ± 4.55% (std=1.77) delta: 1.56%
            
oltp_insert:
    * QPS: 7036.44 ± 0.93% (std=47.70) delta: 1.50% (p=0.032)
    * Latency p50: 18.19 ± 0.92% (std=0.12) delta: -1.49%
    * Latency p99: 31.66 ± 0.90% (std=0.29) delta: -1.81%
            
oltp_read_write:
    * QPS: 14317.86 ± 0.77% (std=93.34) delta: 0.06% (p=0.909)
    * Latency p50: 179.19 ± 0.81% (std=1.15) delta: -0.04%
    * Latency p99: 335.39 ± 8.29% (std=18.07) delta: 4.11%
            
oltp_point_select:
    * QPS: 42667.50 ± 0.76% (std=194.90) delta: 2.12% (p=0.005)
    * Latency p50: 3.00 ± 0.67% (std=0.01) delta: -2.04%
    * Latency p99: 10.46 ± 0.00% (std=0.00) delta: -2.94%
            
oltp_update_non_index:
    * QPS: 4736.98 ± 0.35% (std=12.64) delta: -0.20% (p=0.315)
    * Latency p50: 27.02 ± 0.37% (std=0.07) delta: 0.19%
    * Latency p99: 42.24 ± 2.73% (std=0.85) delta: 1.84%
            

@lance6716
Copy link
Contributor Author

Hi, @lance6716
You can take a look at BenchmarkMergeJoinExec and try to adjust to fit inline projection situation. This link may help you.

Done

Hi, @lance6716
Well done.
Case 3 could persist in BenchmarkMergeJoinExec as a sub-bench.

goos: darwin
goarch: amd64
pkg: github.com/pingcap/tidb/executor
BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)_InlineProj:false-4         	       1	2449115485 ns/op	58474936 B/op	    4070 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)_InlineProj:true-4          	      10	 108524066 ns/op	  759140 B/op	    3957 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:3000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)_InlineProj:false-4           	       2	 680593549 ns/op	58471576 B/op	    4053 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:3000,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)_InlineProj:true-4            	      51	  22117074 ns/op	  762900 B/op	    3943 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:30,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)_InlineProj:false-4             	       3	 509314305 ns/op	62615528 B/op	    4005 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:30,_innerRows:300000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)_InlineProj:true-4              	      43	  28141019 ns/op	 4911220 B/op	    3900 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:330000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)_InlineProj:false-4         	       1	1472482612 ns/op	58485000 B/op	    4072 allocs/op
BenchmarkMergeJoinExec/merge_join_(outerRows:300000,_innerRows:330000,_concurency:4,_outerJoinKeyIdx:_[0_1],_innerJoinKeyIdx:_[0_1],_NeedOuterSort:false)_InlineProj:true-4          	      10	 108598788 ns/op	  758148 B/op	    3955 allocs/op

Reminiscent
Reminiscent previously approved these changes Apr 7, 2020
Copy link
Contributor

@Reminiscent Reminiscent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! LGTM

@Reminiscent Reminiscent added status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Apr 7, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Apr 7, 2020

@lance6716 merge failed.

@Reminiscent
Copy link
Contributor

@lance6716 Thanks for your contribution. Please resolve the conflicts. Thanks!

@lance6716
Copy link
Contributor Author

lance6716 commented Apr 7, 2020

@Reminiscent conflict resolved, waiting for an approval 😄

@lance6716
Copy link
Contributor Author

ping @SunRunAway @Reminiscent

@SunRunAway SunRunAway added status/can-merge Indicates a PR has been approved by a committer. and removed status/can-merge Indicates a PR has been approved by a committer. labels Apr 9, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Apr 9, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Apr 9, 2020

@lance6716 merge failed.

@SunRunAway SunRunAway merged commit 49ee75a into pingcap:master Apr 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution This PR is from a community contributor. sig/execution SIG execution status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants