2.23.0.0-b527

Summary:
In working on diff https://phorge.dev.yugabyte.com/D32682, I found a deadlock
that caused a test to be flaky. The deadlock involves three threads.

Thread A is executing `yb::RWCLock::UpgradeToCommitLock`:
```
void RWCLock::UpgradeToCommitLock() {
  lock_.lock();
  DCHECK(write_locked_);
#ifndef NDEBUG
  bool first_wait = true;
  while (reader_count_ > 0) {
    if (!no_readers_.TimedWait(first_wait ? kFirstWait : kSecondWait)) {
      std::ostringstream ss;
      ss << "Too long commit lock wait, num readers: " << reader_count_
         << ", current thread stack: " << GetStackTrace();
```

Thread A blocks because `reader_count_ > 0` while it is holding the write lock:

```
  DCHECK(write_locked_);
```

Thread A is executing the workflow of an ALTER TABLE statement, with the following call chain:

```
yb::RWCLock::UpgradeToCommitLock
yb::CowObject<yb::master::PersistentTableInfo>::CommitMutation
yb::CowWriteLock<yb::master::PersistentTableInfo>::Commit
yb::master::CatalogManager::ClearYsqlDdlTxnState
yb::master::CatalogManager::HandleSuccessfulYsqlDdlTxn
yb::master::CatalogManager::YsqlDdlTxnCompleteCallbackInternal
```

From the test log, I see that the thread (Thread B) that caused `reader_count_ > 0` is
calling `GetTableSchemaInternal`:

Thread B:

```
Status CatalogManager::GetTableSchemaInternal(const GetTableSchemaRequestPB* req,
                                              GetTableSchemaResponsePB* resp,
                                              bool get_fully_applied_indexes) {
  VLOG(1) << "Servicing GetTableSchema request for " << req->ShortDebugString();

  // Lookup the table and verify if it exists.
  TRACE("Looking up table");
  scoped_refptr<TableInfo> table = VERIFY_RESULT(FindTable(req->table()));

  TRACE("Locking table");
  auto l = table->LockForRead();

......
  auto nsinfo = FindNamespaceById(table->namespace_id());
```

Thread B is holding a read lock on the table (hence `reader_count_ > 0`) via
`table->LockForRead()`, and it is blocked inside `FindNamespaceById` waiting to
acquire a shared lock on the `CatalogManager` `mutex_`:

```
Result<scoped_refptr<NamespaceInfo>> CatalogManager::FindNamespaceById(
    const NamespaceId& id) const {
  SharedLock lock(mutex_);
  return FindNamespaceByIdUnlocked(id);
}
```
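
For context, the read lock taken by `table->LockForRead()` is what keeps `reader_count_`
above zero until the lock object is released. A simplified sketch of the `RWCLock`
read path (paraphrased for illustration, not the exact YugabyteDB source):

```
void RWCLock::ReadLock() {
  MutexLock l(lock_);
  reader_count_++;  // Thread B is here: the count stays non-zero until ReadUnlock().
}

void RWCLock::ReadUnlock() {
  MutexLock l(lock_);
  if (--reader_count_ == 0) {
    no_readers_.Signal();  // This wakes Thread A waiting in UpgradeToCommitLock().
  }
}
```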

The `mutex_` is held exclusively by Thread C from `CatalogManager::CreateTable`:
```
  {
    LockGuard lock(mutex_);
    auto ns_lock = ns->LockForRead();
......
        if (IsIndex(req)) {
          TRACE("Locking indexed table");
          indexed_table_write_lock = indexed_table->LockForWrite();
        }
......
  }
```

Note that Thread C holds `mutex_` exclusively via `LockGuard` and is blocked on
`LockForWrite`, which calls `RWCLock::WriteLock` and cannot proceed because
`write_locked_` is set (the write lock is held by Thread A):

```
void RWCLock::WriteLock() {
  ThreadRestrictions::AssertWaitAllowed();

  MutexLock l(lock_);
  // Wait for any other mutations to finish.
#ifndef NDEBUG
  bool first_wait = true;
  while (write_locked_) {
    if (!no_mutators_.TimedWait(first_wait ? kFirstWait : kSecondWait)) {
```

In summary, we have a deadlock cycle:

- Thread A (holds `write_locked_`) waits for Thread B to release its read lock so that `reader_count_` drops to zero.
- Thread B (holds a read lock counted in `reader_count_`) waits for Thread C to release `mutex_`.
- Thread C (holds `mutex_`) waits for Thread A to release `write_locked_`.
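
This is a textbook circular wait. A minimal self-contained analogue (using plain
`std::mutex` stand-ins instead of YugabyteDB's `RWCLock` and `LockGuard`, purely
for illustration) hangs the same way:

```
#include <chrono>
#include <mutex>
#include <thread>

// Stand-ins for the three contended resources: write_locked_, reader_count_, mutex_.
std::mutex res_write, res_readers, res_catalog;

// Each thread grabs one resource, then blocks forever on the next one in the
// cycle, so the program never exits -- the same A -> B -> C -> A wait as above.
int main() {
  auto hold_then_wait = [](std::mutex& held, std::mutex& wanted) {
    std::lock_guard<std::mutex> hold(held);
    std::this_thread::sleep_for(std::chrono::milliseconds(100));  // let every thread take its first lock
    std::lock_guard<std::mutex> wait(wanted);                     // never acquired
  };
  std::thread a(hold_then_wait, std::ref(res_write), std::ref(res_readers));
  std::thread b(hold_then_wait, std::ref(res_readers), std::ref(res_catalog));
  std::thread c(hold_then_wait, std::ref(res_catalog), std::ref(res_write));
  a.join();
  b.join();
  c.join();
}
```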

To break the deadlock cycle, I changed Thread C so that it releases the
exclusive lock on `mutex_` before calling `LockForWrite`. It also does not need
to re-acquire `mutex_` after `LockForWrite`, because the remaining code in the
relevant scope does not require exclusive access to `mutex_`.
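
As a minimal sketch of the shape of the change (the locking pattern only, not the
exact diff in https://phorge.dev.yugabyte.com/D35843), the idea is to let the
`LockGuard` scope end before blocking on the table's write lock:

```
  {
    LockGuard lock(mutex_);
    // ... only the work that genuinely requires exclusive access to mutex_ ...
  }  // mutex_ is released here.

  // Now it is safe to block: a reader holding the table's read lock can still
  // take a shared lock on mutex_ and make progress, so the cycle cannot form.
  if (IsIndex(req)) {
    TRACE("Locking indexed table");
    indexed_table_write_lock = indexed_table->LockForWrite();
  }
```
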
Jira: DB-11784

Test Plan: Tested in the context of https://phorge.dev.yugabyte.com/D32682.

Reviewers: hsunder, asrivastava, zdrudi

Reviewed By: asrivastava, zdrudi

Subscribers: zdrudi, ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D35843