Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: slow-SQL-Q16-when subquery use not like and not in sql on 10GB data spend too much time #331

Closed
1 task done
adofsauron opened this issue Jul 28, 2022 · 3 comments · Fixed by #680
Closed
1 task done
Assignees
Labels
A-bug Something isn't working

Comments

@adofsauron
Copy link
Collaborator

Describe the problem

  1. use tpch data
  2. execute sql

`
select
p_brand,
p_type,
p_size,
count(distinct ps_suppkey) as supplier_cnt
from
partsupp,
part
where
p_partkey = ps_partkey
and p_brand <> 'Brand#45'
and p_type not like 'MEDIUM POLISHED%'
and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
and ps_suppkey not in (
select
s_suppkey
from
supplier
where
s_comment like '%Customer%Complaints%'
)
group by
p_brand,
p_type,
p_size
order by
supplier_cnt desc,
p_brand,
p_type,
p_size;

`

Expected behavior

No response

How To Reproduce

No response

Environment

No response

Are you interested in submitting a PR to solve the problem?

  • Yes, I will!
@adofsauron adofsauron added the A-bug Something isn't working label Jul 28, 2022
@adofsauron adofsauron self-assigned this Jul 28, 2022
@adofsauron adofsauron changed the title bug: slow-SQL-Q16-when subquery use not like and not in sql on 10GB data spend 40 mininus and happen OOM on 100GB data bug: slow-SQL-Q16-when subquery use not like and not in sql on 10GB data spend too much time Jul 28, 2022
@adofsauron
Copy link
Collaborator Author

adofsauron commented Aug 9, 2022

摘要:
由于100GB数据量时执行Q16发生了OOM,以10GB数据量作为测试

运行统计:

耗时:

explain执行计划分析:

mysql> explain select
-> p_brand,
-> p_type,
-> p_size,
-> count(distinct ps_suppkey) as supplier_cnt
-> from
-> partsupp,
-> part
-> where
-> p_partkey = ps_partkey
-> and p_brand <> 'Brand#45'
-> and p_type not like 'MEDIUM POLISHED%'
-> and p_size in (49,
-> 14,
-> 23,
-> 45,
-> 19,
-> 3,
-> 36,
-> 9)
-> and ps_suppkey not in (
-> select
-> s_suppkey
-> from
-> supplier
-> where
-> s_comment like '%Customer%Complaints%' )
-> group by
-> p_brand,
-> p_type,
-> p_size
-> order by
-> supplier_cnt desc,
-> p_brand,
-> p_type,
-> p_size\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: partsupp
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 8000000
filtered: 100.00
Extra: Using where with pushed condition (not(<in_optimizer>(tpch.partsupp.ps_suppkey,tpch.partsupp.ps_suppkey in ( (/* select#2 */ select tpch.supplier.s_suppkey from tpch.supplier where (tpch.supplier.s_comment like '%Customer%Complaints%') ), <primary_index_lookup>(tpch.partsupp.ps_suppkey in on <auto_key> where ((tpch.partsupp.ps_suppkey = materialized-subquery.s_suppkey)))))))(t0) Pckrows: 2, susp. 2 (0 empty 0 full). Conditions: 1; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: part
partitions: NULL
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tpch.partsupp.ps_partkey
rows: 1
filtered: 40.00
Extra: Using where with pushed condition ((tpch.part.p_brand <> 'Brand#45') and (not((tpch.part.p_type like 'MEDIUM POLISHED%'))) and (tpch.part.p_size in (49,14,23,45,19,3,36,9)))(t0) Pckrows: 31, susp. 31 (0 empty 0 full). Conditions: 3
*************************** 3. row ***************************
id: 2
select_type: SUBQUERY
table: supplier
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 100000
filtered: 11.11
Extra: Using where with pushed condition (tpch.supplier.s_comment like '%Customer%Complaints%')(t0) Pckrows: 2, susp. 2 (0 empty 0 full). Conditions: 1
3 rows in set, 1 warning (0.02 sec)

火焰图:

slow-16-hash-paral-20220807Aug081659879007 mysql

@adofsauron
Copy link
Collaborator Author

adofsauron commented Aug 9, 2022

对比mysql/innodb

摘要:
分析mysql/innodb如何处理Q17

硬件环境:

  1. 内存: 8GB
  2. cpu: 8核心, 3.2GHz
  3. mysql: 8.0.28

explain分析:

mysql> explain select
-> p_brand,
-> p_type,
-> p_size,
-> count(distinct ps_suppkey) as supplier_cnt
-> from
-> partsupp,
-> part
-> where
-> p_partkey = ps_partkey
-> and p_brand <> 'Brand#45'
-> and p_type not like 'MEDIUM POLISHED%'
-> and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
-> and ps_suppkey not in (
-> select
-> s_suppkey
-> from
-> supplier
-> where
-> s_comment like '%Customer%Complaints%'
-> )
-> group by
-> p_brand,
-> p_type,
-> p_size
-> order by
-> supplier_cnt desc,
-> p_brand,
-> p_type,
-> p_size\G
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id: 8
Current database: tpch

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: part
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 1887933
filtered: 40.00
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: partsupp
partitions: NULL
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: tpch.part.p_partkey
rows: 3
filtered: 100.00
Extra: Using index
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table:
partitions: NULL
type: eq_ref
possible_keys: <auto_distinct_key>
key: <auto_distinct_key>
key_len: 5
ref: tpch.partsupp.ps_suppkey
rows: 1
filtered: 100.00
Extra: Using where; Not exists
*************************** 4. row ***************************
id: 2
select_type: MATERIALIZED
table: supplier
partitions: NULL
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 99026
filtered: 100.00
Extra: Using where
4 rows in set, 1 warning (0.04 sec)

耗时:

supplier_cnt: 4
27840 rows in set (1 min 1.96 sec)

热力图:

0000

@adofsauron
Copy link
Collaborator Author

摘要:
分析Q16执行耗时究竟在哪里

分析:

火焰图:

分析:

  1. 占用时间的大头在物化中的aggregate

日志埋点:

[2022-08-09 21:58:35.661551] [61425] [INFO] [temp_table.cpp:2043] MSG: Timer 0.006423 : TempTable::Materialize over
[2022-08-09 21:58:40.601132] [61425] [INFO] [aggregation_algorithm.cpp:75] MSG: AggregationAlgorithm [AggregationAlgorithm::Aggregate start] memLeft: 250803432
[2022-08-09 21:59:02.272033] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.776923 : MultiDimensionalDistinctScan index: 7422784
[2022-08-09 21:59:06.900646] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.618125 : MultiDimensionalDistinctScan index: 7118920
[2022-08-09 21:59:11.049991] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.138833 : MultiDimensionalDistinctScan index: 6388664
[2022-08-09 21:59:14.729436] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.668982 : MultiDimensionalDistinctScan index: 5536700
[2022-08-09 21:59:18.003550] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.263696 : MultiDimensionalDistinctScan index: 4867184
[2022-08-09 21:59:20.815750] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.801977 : MultiDimensionalDistinctScan index: 4077864
[2022-08-09 21:59:23.193645] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.367468 : MultiDimensionalDistinctScan index: 3287136
[2022-08-09 21:59:25.129056] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.924884 : MultiDimensionalDistinctScan index: 2558128
[2022-08-09 21:59:26.635697] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.496256 : MultiDimensionalDistinctScan index: 1828284
[2022-08-09 21:59:27.625411] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.979455 : MultiDimensionalDistinctScan index: 916828
[2022-08-09 21:59:28.247990] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.611987 : MultiDimensionalDistinctScan index: 243348
[2022-08-09 21:59:46.427592] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.695173 : MultiDimensionalDistinctScan index: 7301112
[2022-08-09 21:59:50.880305] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.442508 : MultiDimensionalDistinctScan index: 6936244
[2022-08-09 21:59:54.887755] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.997080 : MultiDimensionalDistinctScan index: 6145216
[2022-08-09 21:59:58.459334] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.561216 : MultiDimensionalDistinctScan index: 5354064
[2022-08-09 22:00:01.555317] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.085863 : MultiDimensionalDistinctScan index: 4563312
[2022-08-09 22:00:04.354050] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.788534 : MultiDimensionalDistinctScan index: 3895536
[2022-08-09 22:00:06.712467] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.347400 : MultiDimensionalDistinctScan index: 3165648
[2022-08-09 22:00:08.510846] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.787973 : MultiDimensionalDistinctScan index: 2315076
[2022-08-09 22:00:09.853298] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.331898 : MultiDimensionalDistinctScan index: 1524584
[2022-08-09 22:00:10.709280] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.845440 : MultiDimensionalDistinctScan index: 673088
[2022-08-09 22:00:23.493576] [61196] [INFO] [engine.cpp:1352] MSG: Command: select 4/4, update 0/0, insert 0/0, load 0/0, queries 0/11
[2022-08-09 22:00:23.493703] [61196] [INFO] [engine.cpp:1364] MSG: Select: 0/1, Loaded: 0/0(0/0), dup: 0/0, insert: 0/0, failed insert: 0/0, update: 0/0
[2022-08-09 22:00:26.931971] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.573716 : MultiDimensionalDistinctScan index: 7118920
[2022-08-09 22:00:31.343835] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.401538 : MultiDimensionalDistinctScan index: 6875492
[2022-08-09 22:00:35.261567] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.907601 : MultiDimensionalDistinctScan index: 6023624
[2022-08-09 22:00:38.722789] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.450872 : MultiDimensionalDistinctScan index: 5232348
[2022-08-09 22:00:41.677207] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.942256 : MultiDimensionalDistinctScan index: 4320688
[2022-08-09 22:00:44.200292] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.512840 : MultiDimensionalDistinctScan index: 3591280
[2022-08-09 22:00:46.202083] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.991442 : MultiDimensionalDistinctScan index: 2679604
[2022-08-09 22:00:47.783475] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.571247 : MultiDimensionalDistinctScan index: 1949916
[2022-08-09 22:00:48.879044] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.085403 : MultiDimensionalDistinctScan index: 1099112
[2022-08-09 22:00:49.574514] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.685280 : MultiDimensionalDistinctScan index: 369464
[2022-08-09 22:01:03.876123] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.413452 : MultiDimensionalDistinctScan index: 6875492
[2022-08-09 22:01:08.099446] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.213053 : MultiDimensionalDistinctScan index: 6571080
[2022-08-09 22:01:11.822763] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.712914 : MultiDimensionalDistinctScan index: 5658540
[2022-08-09 22:01:15.064234] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.231342 : MultiDimensionalDistinctScan index: 4867320
[2022-08-09 22:01:17.856746] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.782235 : MultiDimensionalDistinctScan index: 4077708
[2022-08-09 22:01:20.225255] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.358306 : MultiDimensionalDistinctScan index: 3348056
[2022-08-09 22:01:22.078699] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.842210 : MultiDimensionalDistinctScan index: 2436356
[2022-08-09 22:01:23.378127] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.289039 : MultiDimensionalDistinctScan index: 1463748
[2022-08-09 22:01:23.494066] [61196] [INFO] [engine.cpp:1352] MSG: Command: select 5/9, update 0/0, insert 0/0, load 0/0, queries 12/23
[2022-08-09 22:01:23.494132] [61196] [INFO] [engine.cpp:1364] MSG: Select: 0/1, Loaded: 0/0(0/0), dup: 0/0, insert: 0/0, failed insert: 0/0, update: 0/0
[2022-08-09 22:01:24.266861] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.877328 : MultiDimensionalDistinctScan index: 733792
[2022-08-09 22:01:36.696847] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 4.241217 : MultiDimensionalDistinctScan index: 6632096
[2022-08-09 22:01:40.624744] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.917591 : MultiDimensionalDistinctScan index: 6084336
[2022-08-09 22:01:44.073692] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.438733 : MultiDimensionalDistinctScan index: 5232348
[2022-08-09 22:01:47.027979] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.943680 : MultiDimensionalDistinctScan index: 4381292
[2022-08-09 22:01:49.461790] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.423435 : MultiDimensionalDistinctScan index: 3469804
[2022-08-09 22:01:51.416958] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.944651 : MultiDimensionalDistinctScan index: 2619084
[2022-08-09 22:01:52.851393] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.423935 : MultiDimensionalDistinctScan index: 1706776
[2022-08-09 22:01:53.770386] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.908534 : MultiDimensionalDistinctScan index: 794732
[2022-08-09 22:02:04.250603] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.988361 : MultiDimensionalDistinctScan index: 6266932
[2022-08-09 22:02:07.795016] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.534161 : MultiDimensionalDistinctScan index: 5475808
[2022-08-09 22:02:10.759909] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.954689 : MultiDimensionalDistinctScan index: 4441772
[2022-08-09 22:02:13.189040] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.418797 : MultiDimensionalDistinctScan index: 3469804
[2022-08-09 22:02:15.051540] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.852128 : MultiDimensionalDistinctScan index: 2497300
[2022-08-09 22:02:16.336874] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.274991 : MultiDimensionalDistinctScan index: 1463748
[2022-08-09 22:02:17.061614] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.714381 : MultiDimensionalDistinctScan index: 430320
[2022-08-09 22:02:23.494450] [61196] [INFO] [engine.cpp:1352] MSG: Command: select 0/9, update 0/0, insert 0/0, load 0/0, queries 0/23
[2022-08-09 22:02:23.494525] [61196] [INFO] [engine.cpp:1364] MSG: Select: 0/1, Loaded: 0/0(0/0), dup: 0/0, insert: 0/0, failed insert: 0/0, update: 0/0
[2022-08-09 22:02:25.310746] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 3.435298 : MultiDimensionalDistinctScan index: 5414948
[2022-08-09 22:02:28.054595] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 2.733469 : MultiDimensionalDistinctScan index: 4138488
[2022-08-09 22:02:29.924268] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 1.859229 : MultiDimensionalDistinctScan index: 2558256
[2022-08-09 22:02:30.934135] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.999507 : MultiDimensionalDistinctScan index: 977548
[2022-08-09 22:02:35.003509] [61425] [INFO] [aggregation_algorithm.cpp:539] MSG: Timer 0.957001 : MultiDimensionalDistinctScan index: 912740
[2022-08-09 22:02:35.009082] [61425] [INFO] [aggregation_algorithm.cpp:253] MSG: Timer 234.252762 : AggregationAlgorithm::Aggregate MultiDimensionalGroupByScan
[2022-08-09 22:02:35.009615] [61425] [INFO] [aggregation_algorithm.cpp:267] MSG: Timer 234.408798 : AggregationAlgorithm::Aggregate over aggregate_time: 234.408798
[2022-08-09 22:02:35.009875] [61425] [INFO] [aggregation_algorithm.cpp:75] MSG: AggregationAlgorithm [AggregationAlgorithm::Aggregate over] memLeft: 250378324
[2022-08-09 22:02:36.327052] [61425] [INFO] [temp_table.cpp:2043] MSG: Timer 235.726242 : TempTable::Materialize over

核心函数:

AggregationAlgorithm::MultiDimensionalDistinctScan

void AggregationAlgorithm::MultiDimensionalDistinctScan(GroupByWrapper &gbw, MIIterator &mit) {
// NOTE: to maintain distinct cache compatibility, rows must be visited in the
// same order!
MEASURE_FET("TempTable::MultiDimensionalDistinctScan(GroupByWrapper& gbw)");
while (gbw.AnyOmittedByDistinct()) { // any distincts omitted? =>
// another pass needed
// Some displays
int64_t max_size_for_display = 0;
for (int i = gbw.NumOfGroupingAttrs(); i < gbw.NumOfAttrs(); i++)
if (gbw.distinct_watch.OmittedFilter(i) && gbw.distinct_watch.OmittedFilter(i)->NumOfOnes() > max_size_for_display)
max_size_for_display = gbw.distinct_watch.OmittedFilter(i)->NumOfOnes();
rccontrol.lock(m_conn->GetThreadID())
<< "Next distinct pass: " << max_size_for_display << " rows left" << system::unlock;

gbw.RewindDistinctBuffers();  // reset buffers for a new contents, rewind
                              // cache

std::chrono::high_resolution_clock::time_point outer_start = std::chrono::high_resolution_clock::now();

for (int distinct_attr = gbw.NumOfGroupingAttrs(); distinct_attr < gbw.NumOfAttrs(); distinct_attr++) {
  Filter *omit_filter = gbw.distinct_watch.OmittedFilter(distinct_attr);
  if (omit_filter && !omit_filter->IsEmpty()) {
    mit.Rewind();
    int64_t cur_tuple = 0;
    int64_t uniform_pos = common::NULL_VALUE_64;
    bool require_locking = true;

    std::chrono::high_resolution_clock::time_point start = std::chrono::high_resolution_clock::now();

    int index = 0;
    while (mit.IsValid()) {
      if (mit.PackrowStarted()) {
        if (m_conn->Killed()) throw common::KilledException();
        // All packrow-level operations
        omit_filter->Commit();
        gbw.ResetPackrow();
        bool skip_packrow = false;
        bool packrow_done = false;
        bool part_omitted = false;
        bool stop_all = false;
        int64_t packrow_length = mit.GetPackSizeLeft();
        // Check whether the packrow contain any not aggregated rows
        if (omit_filter->IsEmptyBetween(cur_tuple, cur_tuple + packrow_length - 1))
          skip_packrow = true;
        else {
          int64_t rows_in_pack = omit_filter->NumOfOnesBetween(cur_tuple, cur_tuple + packrow_length - 1);
          bool agg_not_changeable = false;
          AggregateRough(gbw, mit, packrow_done, part_omitted, agg_not_changeable, stop_all, uniform_pos,
                         rows_in_pack, 1, distinct_attr);
          if (packrow_done) {  // This packrow will not be needed any more
            omit_filter->ResetBetween(cur_tuple, cur_tuple + packrow_length - 1);
            gbw.OmitInCache(distinct_attr, packrow_length);
          }
        }
        if (skip_packrow) {
          mit.NextPackrow();
          cur_tuple += packrow_length;
          continue;
        }
        require_locking = true;  // a new packrow, so locking will be needed
      }

      // All row-level operations
      if (omit_filter->Get(cur_tuple)) {
        bool value_successfully_aggregated = false;
        if (gbw.CacheValid(distinct_attr)) {
          value_successfully_aggregated = gbw.PutCachedValue(distinct_attr);
        } else {
          // Locking etc.
          if (require_locking) {
            gbw.LockPack(distinct_attr, mit);
            if (uniform_pos != common::PLUS_INF_64)
              for (int gr_a = 0; gr_a < gbw.NumOfGroupingAttrs(); gr_a++) gbw.LockPack(gr_a, mit);
            require_locking = false;
          }

          int64_t pos = 0;
          bool existed = true;
          if (uniform_pos != common::PLUS_INF_64)
            pos = uniform_pos;  // existed == true, as above
          else {                // Construct the grouping vector
            for (int gr_a = 0; gr_a < gbw.NumOfGroupingAttrs(); gr_a++) {
              if (gbw.ColumnNotOmitted(gr_a)) gbw.PutGroupingValue(gr_a, mit);
            }
            existed = gbw.FindCurrentRow(pos);
          }
          ASSERT(existed && pos != common::NULL_VALUE_64, "row does not exist");
          value_successfully_aggregated = gbw.PutAggregatedValue(distinct_attr, pos, mit);
        }
        if (value_successfully_aggregated) omit_filter->ResetDelayed(cur_tuple);
        gbw.distinct_watch.NextRead(distinct_attr);
      }
      cur_tuple++;
      ++mit;

      ++index;
    }

    auto diff = std::chrono::duration_cast<std::chrono::duration<float>>(
        std::chrono::high_resolution_clock::now() - start);
    TIANMU_LOG(LogCtl_Level::INFO, "Timer %f : MultiDimensionalDistinctScan mit index: %d ",
               diff.count(), index);

    omit_filter->Commit();  // committing delayed resets
  }
}

auto outer_diff =
    std::chrono::duration_cast<std::chrono::duration<float>>(std::chrono::high_resolution_clock::now() - outer_diff);
TIANMU_LOG(LogCtl_Level::INFO, "Timer %f : MultiDimensionalDistinctScan outer", outer_diff.count());

gbw.UpdateDistinctCaches();  // take into account values already counted

}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant