Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unstable test "fullstack-test/mpp/rollup_tpcds.test" #9613

Closed
JaySon-Huang opened this issue Nov 14, 2024 · 2 comments · Fixed by #9628
Closed

unstable test "fullstack-test/mpp/rollup_tpcds.test" #9613

JaySon-Huang opened this issue Nov 14, 2024 · 2 comments · Fixed by #9628
Assignees
Labels
affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. component/storage severity/major type/bug The issue is confirmed as a bug.

Comments

@JaySon-Huang
Copy link
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

https://do.pingcap.net/jenkins/blue/organizations/jenkins/pingcap%2Ftiflash%2Frelease-8.5%2Fpull_integration_test/detail/pull_integration_test/14/pipeline/

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiFlash version? (Required)

master, release-8.5 branch

@JaySon-Huang JaySon-Huang added the type/bug The issue is confirmed as a bug. label Nov 14, 2024
@JaySon-Huang JaySon-Huang self-assigned this Nov 14, 2024
@JaySon-Huang
Copy link
Contributor Author

fullstack-test-logs.tar.gz


# 3 MPP requests for the same table began
[2024/11/14 16:01:22.923 +08:00] [DEBUG] [executeQuery.cpp:397] ["(from 172.18.0.5:54170, query_id: 7a777a3e-39a2-474f-bc57-cbe5c3baf8d4) ..."] [source="MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:6>"] [thread_id=875]
[2024/11/14 16:01:22.930 +08:00] [DEBUG] [executeQuery.cpp:397] ["(from 172.18.0.5:54168, query_id: 729469fb-b733-41b9-8530-dd9209e4af30) ..."] [source="MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:1>"] [thread_id=886]
[2024/11/14 16:01:22.936 +08:00] [DEBUG] [executeQuery.cpp:397] ["(from 172.18.0.5:54144, query_id: 03a9f7c5-a3cb-4a47-b79a-9735ff280fc1) ..."] [source="MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:11>"] [thread_id=896]

# all of the requests will try to create the table on TiFlash
[2024/11/14 16:01:22.941 +08:00] [INFO] [TiDBSchemaSyncer.cpp:261] ["Sync table schema begin, table_id=386"] [source="keyspace=4294967295"] [thread_id=875]
[2024/11/14 16:01:22.957 +08:00] [INFO] [TiDBSchemaSyncer.cpp:261] ["Sync table schema begin, table_id=386"] [source="keyspace=4294967295"] [thread_id=886]
[2024/11/14 16:01:22.957 +08:00] [INFO] [TiDBSchemaSyncer.cpp:261] ["Sync table schema begin, table_id=386"] [source="keyspace=4294967295"] [thread_id=896]
[2024/11/14 16:01:23.001 +08:00] [INFO] [SchemaBuilder.cpp:1156] ["Create table db_2.date_dim (database_id=2 table_id=386) with statement: CREATE TABLE `db_2`.`t_386`..."] [source="keyspace=4294967295"] [thread_id=875]
[2024/11/14 16:01:23.008 +08:00] [INFO] [SchemaBuilder.cpp:1156] ["Create table db_2.date_dim (database_id=2 table_id=386) with statement: CREATE TABLE `db_2`.`t_386`..."] [source="keyspace=4294967295"] [thread_id=886]
[2024/11/14 16:01:23.014 +08:00] [INFO] [SchemaBuilder.cpp:1156] ["Create table db_2.date_dim (database_id=2 table_id=386) with statement: CREATE TABLE `db_2`.`t_386`..."] [source="keyspace=4294967295"] [thread_id=896]

# thread 886, thread 896 start to wait for the creation complete
[2024/11/14 16:01:23.031 +08:00] [WARN] [InterpreterCreateQuery.cpp:585] ["createTable failed with error code is 337, error info is DB::Exception: Table db_2.t_386 is creating or attaching right now, ..."] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=886]
[2024/11/14 16:01:23.032 +08:00] [ERROR] [InterpreterCreateQuery.cpp:597] ["createTable failed but table not exist now, \nWe will sleep for 20 ms and try again."] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=886]

# thread 875 acquired the DDL guard and began to create table
[2024/11/14 16:01:23.040 +08:00] [INFO] [StorageDeltaMerge.cpp:127] ["updateTableColumnInfo, table_name=t_386 ordinary=\"columns format version: 1\n28 columns:\n`d_date_sk` Int32\n`d_date_id` String\n`d_date` Nullable(MyDate)\n`d_month_seq` Nullable(Int32)\n`d_week_seq` Nullable(Int32)\n`d_quarter_seq` Nullable(Int32)\n`d_year` Nullable(Int32)\n`d_dow` Nullable(Int32)\n`d_moy` Nullable(Int32)\n`d_dom` Nullable(Int32)\n`d_qoy` Nullable(Int32)\n`d_fy_year` Nullable(Int32)\n`d_fy_quarter_seq` Nullable(Int32)\n`d_fy_week_seq` Nullable(Int32)\n`d_day_name` Nullable(String)\n`d_quarter_name` Nullable(String)\n`d_holiday` Nullable(String)\n`d_weekend` Nullable(String)\n`d_following_holiday` Nullable(String)\n`d_first_dom` Nullable(Int32)\n`d_last_dom` Nullable(Int32)\n`d_same_day_ly` Nullable(Int32)\n`d_same_day_lq` Nullable(Int32)\n`d_current_day` Nullable(String)\n`d_current_week` Nullable(String)\n`d_current_month` Nullable(String)\n`d_current_quarter` Nullable(String)\n`d_current_year` Nullable(String)\n\" materialized=\"columns format version: 1\n0 columns:\n\""] [source=db_2.t_386] [thread_id=875]
[2024/11/14 16:01:23.041 +08:00] [INFO] [StorageDeltaMerge.cpp:334] ["updateTableColumnInfo finished, table_name=t_386 table_column_defines=d_date_sk Int32, d_date_id String, d_date Nullable, d_month_seq Nullable, d_week_seq Nullable, d_quarter_seq Nullable, d_year Nullable, d_dow Nullable, d_moy Nullable, d_dom Nullable, d_qoy Nullable, d_fy_year Nullable, d_fy_quarter_seq Nullable, d_fy_week_seq Nullable, d_day_name Nullable, d_quarter_name Nullable, d_holiday Nullable, d_weekend Nullable, d_following_holiday Nullable, d_first_dom Nullable, d_last_dom Nullable, d_same_day_ly Nullable, d_same_day_lq Nullable, d_current_day Nullable, d_current_week Nullable, d_current_month Nullable, d_current_quarter Nullable, d_current_year Nullable"] [source=db_2.t_386] [thread_id=875]

# other threads wait... until timeout for 1 seconds
[2024/11/14 16:01:23.041 +08:00] [WARN] [InterpreterCreateQuery.cpp:585] ["createTable failed with error code is 337, error info is DB::Exception: Table db_2.t_386 is creating or attaching right now, ..."] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=896]
[2024/11/14 16:01:23.042 +08:00] [ERROR] [InterpreterCreateQuery.cpp:597] ["createTable failed but table not exist now, \nWe will sleep for 20 ms and try again."] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=896]
[2024/11/14 16:01:23.052 +08:00] [ERROR] [InterpreterCreateQuery.cpp:597] ["createTable failed but table not exist now, \nWe will sleep for 20 ms and try again."] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=886]
...
[2024/11/14 16:01:24.023 +08:00] [ERROR] [InterpreterCreateQuery.cpp:597] ["createTable failed but table not exist now, \nWe will sleep for 20 ms and try again."] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=886]
[2024/11/14 16:01:24.032 +08:00] [ERROR] [InterpreterCreateQuery.cpp:597] ["createTable failed but table not exist now, \nWe will sleep for 20 ms and try again."] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=896]
[2024/11/14 16:01:24.043 +08:00] [ERROR] [InterpreterCreateQuery.cpp:603] ["still failed to createTable in InterpreterCreateQuery for retry 50 times"] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=886]
[2024/11/14 16:01:24.049 +08:00] [ERROR] [MPPTask.cpp:654] ["task running meets error: Code: 337, e.displayText() = DB::Exception: Table db_2.t_386 is creating or attaching right now, e.what() = DB::Exception, Stack trace:\n\n\n       0x43cba6e\tStackTrace::StackTrace() [tiflash+71088750]\n       0x43bb312\tDB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, int) [tiflash+71021330]\n       0xca1009c\tDB::DDLGuard::DDLGuard(std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>>&, std::__1::mutex&, std::__1::unique_lock<std::__1::mutex>&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) [tiflash+211878044]\n       0xca1d068\tstd::__1::__unique_if<DB::DDLGuard>::__unique_single std::__1::make_unique[abi:ue170006]<DB::DDLGuard, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>>&, std::__1::mutex&, std::__1::unique_lock<std::__1::mutex>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&>(std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>>&, std::__1::mutex&, std::__1::unique_lock<std::__1::mutex>&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) [tiflash+211931240]\n       0xca101e6\tDB::Context::getDDLGuard(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) const [tiflash+211878374]\n       0xca10384\tDB::Context::getDDLGuardIfTableDoesntExist(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) const [tiflash+211878788]\n       0xcca4be0\tDB::InterpreterCreateQuery::createTable(DB::ASTCreateQuery&) [tiflash+214584288]\n       0xcca6427\tDB::InterpreterCreateQuery::execute() [tiflash+214590503]\n       0xe5641c3\tDB::SchemaBuilder<DB::SchemaGetter, DB::SchemaNameMapper>::applyCreateStorageInstance(long, std::__1::shared_ptr<TiDB::TableInfo> const&, bool, std::__1::basic_string_view<char, std::__1::char_traits<char>>) [tiflash+240533955]\n       0xe562ac8\tDB::SchemaBuilder<DB::SchemaGetter, DB::SchemaNameMapper>::applyTable(long, long, long, bool) [tiflash+240528072]\n       0xe536832\tDB::TiDBSchemaSyncer<false, false>::trySyncTableSchema(DB::Context&, long, DB::SchemaGetter&, bool, char const*) [tiflash+240347186]\n       0xe535b1d\tDB::TiDBSchemaSyncer<false, false>::syncTableSchema(DB::Context&, long) [tiflash+240343837]\n       0xcb15648\tDB::TiDBSchemaSyncerManager::syncTableSchema(DB::Context&, unsigned int, long) [tiflash+212948552]\n       0xee7bd62\tDB::DAGStorageInterpreter::getAndLockStorages(long)::$_0::operator()(long) const [tiflash+250068322]\n       0xee75250\tDB::DAGStorageInterpreter::getAndLockStorages(long) [tiflash+250040912]\n       0xee6df47\tDB::DAGStorageInterpreter::prepare() [tiflash+250011463]\n       0xee6edc1\tDB::DAGStorageInterpreter::execute(DB::PipelineExecutorContext&, DB::PipelineExecGroupBuilder&) [tiflash+250015169]\n       0xee68295\tDB::PhysicalTableScan::buildPipeline(DB::PipelineBuilder&, DB::Context&, DB::PipelineExecutorContext&) [tiflash+249987733]\n       0xeb5b091\tDB::PhysicalPlanNode::buildPipeline(DB::PipelineBuilder&, DB::Context&, DB::PipelineExecutorContext&) [tiflash+246788241]\n       0xeb5b091\tDB::PhysicalPlanNode::buildPipeline(DB::PipelineBuilder&, DB::Context&, DB::PipelineExecutorContext&) [tiflash+246788241]\n       0xeb5b091\tDB::PhysicalPlanNode::buildPipeline(DB::PipelineBuilder&, DB::Context&, DB::PipelineExecutorContext&) [tiflash+246788241]\n       0xeb49452\tDB::PhysicalPlan::toPipeline(DB::PipelineExecutorContext&, DB::Context&) [tiflash+246715474]\n       0xeaa47e4\tDB::PipelineExecutor::PipelineExecutor(std::__1::shared_ptr<MemoryTracker> const&, DB::AutoSpillTrigger*, std::__1::function<void (std::__1::shared_ptr<DB::OperatorSpillContext> const&)> const&, DB::Context&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) [tiflash+246040548]\n       0xe8fcd13\tstd::__1::__unique_if<DB::PipelineExecutor>::__unique_single std::__1::make_unique[abi:ue170006]<DB::PipelineExecutor, std::__1::shared_ptr<MemoryTracker>&, std::nullptr_t, std::nullptr_t, DB::Context&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&>(std::__1::shared_ptr<MemoryTracker>&, std::nullptr_t&&, std::nullptr_t&&, DB::Context&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) [tiflash+244305171]\n       0xe8f8336\tDB::(anonymous namespace)::executeAsPipeline(DB::Context&, bool) [tiflash+244286262]\n       0xe8f7e3f\tDB::queryExecute(DB::Context&, bool) [tiflash+244284991]\n       0xe9e1dc8\tDB::MPPTask::preprocess() [tiflash+245243336]\n       0xe9e2882\tDB::MPPTask::runImpl() [tiflash+245246082]\n       0xe9e854d\tDB::MPPTask::run()::$_0::operator()() const [tiflash+245269837]\n       0xe9e8525\tdecltype(std::declval<DB::MPPTask::run()::$_0&>()()) std::__1::__invoke[abi:ue170006]<DB::MPPTask::run()::$_0&>(DB::MPPTask::run()::$_0&) [tiflash+245269797]\n       0xe9e84e5\tvoid std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:ue170006]<DB::MPPTask::run()::$_0&>(DB::MPPTask::run()::$_0&) [tiflash+245269733]\n       0xe9e84bd\tstd::__1::__function::__alloc_func<DB::MPPTask::run()::$_0, std::__1::allocator<DB::MPPTask::run()::$_0>, void ()>::operator()[abi:ue170006]() [tiflash+245269693]"] [source="MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:1>"] [thread_id=886]
[2024/11/14 16:01:24.050 +08:00] [WARN] [MPPTask.cpp:752] ["Begin abort task: MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:1>, abort type: ONERROR"] [source="MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:1>"] [thread_id=886]
[2024/11/14 16:01:24.052 +08:00] [ERROR] [InterpreterCreateQuery.cpp:603] ["still failed to createTable in InterpreterCreateQuery for retry 50 times"] [source="InterpreterCreateQuery db_2 t_386"] [thread_id=896]
[2024/11/14 16:01:24.057 +08:00] [WARN] [MPPTask.cpp:752] ["Begin abort task: MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:11>, abort type: ONERROR"] [source="MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:11>"] [thread_id=896]
...

# thread 875 finish create table, took about 4.075 seconds
[2024/11/14 16:01:27.014 +08:00] [INFO] [SchemaBuilder.cpp:1182] ["Create table db_2.date_dim end, database_id=2 table_id=386 action=ApplyTable"] [source="keyspace=4294967295"] [thread_id=875]
[2024/11/14 16:01:27.015 +08:00] [INFO] [TiDBSchemaSyncer.cpp:270] ["Sync table schema end, table_id=386"] [source="keyspace=4294967295"] [thread_id=875]
[2024/11/14 16:01:27.015 +08:00] [DEBUG] [DAGStorageInterpreter.cpp:1442] ["Table schema sync done, keyspace=4294967295 table_id=386 cost=4075 ms"] [source="MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:6>"] [thread_id=875]
[2024/11/14 16:01:27.015 +08:00] [INFO] [DAGStorageInterpreter.cpp:1464] ["syncing schemas done, time cost = 4075 ms."] [source="MPP<gather_id:1, query_ts:1731571282891795535, local_query_id:331, server_id:271, start_ts:453921022363566085,task_id:6>"] [thread_id=875]


@JaySon-Huang
Copy link
Contributor Author

According to the log, it took about 4 seconds to create an empty table when the CI environment is busy.

ti-chi-bot bot pushed a commit that referenced this issue Nov 18, 2024
ti-chi-bot bot pushed a commit that referenced this issue Nov 27, 2024
ti-chi-bot bot pushed a commit that referenced this issue Nov 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. component/storage severity/major type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant