KUDU-2388. Fix TSAN race in MaintenanceManager::UnregisterOp

This fixes the following TSAN race:

WARNING: ThreadSanitizer: data race (pid=17822)
  Read of size 1 at 0x7b4c000054e8 by thread T59 (mutexes: write M1750):
    ...
    #3 strings::internal::SubstituteArg::SubstituteArg(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/gutil/strings/substitute.h:76 (libtserver.so+0x9edb0)
    #4 kudu::MaintenanceManager::LogPrefix() const /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/maintenance_manager.cc:545:31 (libkudu_util.so+0x167791)
    #5 kudu::MaintenanceManager::UnregisterOp(kudu::MaintenanceOp*) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/maintenance_manager.cc:235:3 (libkudu_util.so+0x165963)
    #6 kudu::MaintenanceOp::Unregister() /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/maintenance_manager.cc:123:13 (libkudu_util.so+0x1654fe)
    #7 kudu::tablet::Tablet::UnregisterMaintenanceOps() /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tablet/tablet.cc:1405:9 (libtablet.so+0xfb5af)
    #8 kudu::tablet::TabletReplica::Stop() /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tablet/tablet_replica.cc:271:25 (libtablet.so+0x146e66)
    #9 kudu::tserver::TSTabletManager::DeleteTablet(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > c

  Previous write of size 8 at 0x7b4c000054e8 by main thread:
    #0 memcpy /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/src/llvm-4.0.0.src/projects/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:655 (kudu-tserver+0x449e4c)
    #1 std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__move_assign(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::integral_constant<bool, true>) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/string:2044:18 (libkudu_util.so+0x16664d)
    #2 std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::operator=(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&) /data/somelongdirectorytoavoidrpathissues/src/kudu/thirdparty/installed/tsan/include/c++/v1/string:2055 (libkudu_util.so+0x16664d)
    #3 kudu::MaintenanceManager::Init(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >) /data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/util/maintenance_manager.cc:169 (libkudu_util.so+0x16664d)
    ...

The race is on the 'server_uuid_' field in the MaintenanceManager. This is set during startup, but was being set _after_ calls such as UnregisterOp could be made as seen above. That means the UnregisterOp call could either see an empty UUID or even crash due to the above race.

This simply rejiggers the MaintenanceManager startup to take the UUID in as a constructor parameter instead, and to instantiate the object slightly later during startup.

Change-Id: Id06731f56eb98146f7b88541b936c0026b781b16
Reviewed-on: http://gerrit.cloudera.org:8080/9866
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <[email protected]>
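
As a rough illustration of the constructor-parameter approach described in the message, here is a minimal sketch. It is not Kudu's code: the class name SketchManager and the exact prefix format are assumptions made for the example. The point it shows is that a field supplied at construction and declared const is never written after the object becomes visible to other threads, so readers such as LogPrefix() cannot race with a later Init()-style assignment.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a manager object; NOT the real
// kudu::MaintenanceManager, only an illustration of the pattern.
class SketchManager {
 public:
  // The UUID is taken as a constructor parameter, so the field is fully
  // initialized before any other thread can hold a pointer to this object.
  explicit SketchManager(std::string server_uuid)
      : server_uuid_(std::move(server_uuid)) {
    // With a separate Init() step, callers could observe an empty UUID;
    // here that state is rejected up front (checked in debug builds only).
    assert(!server_uuid_.empty());
  }

  // Reads the UUID. Because server_uuid_ is const, no code path can write
  // to it after construction, so concurrent calls only ever read.
  // The "P <uuid>: " format is an assumption chosen for the example.
  std::string LogPrefix() const {
    return "P " + server_uuid_ + ": ";
  }

 private:
  const std::string server_uuid_;
};
```

Constructing the object with `SketchManager mgr("abc123");` and calling `mgr.LogPrefix()` yields the prefix for that UUID. The trade-off is that the object can only be created once the UUID is known, which matches the message's note about instantiating it slightly later during startup.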