fix(server): Fix bugs #797

Merged
merged 2 commits into from
Feb 14, 2023

Conversation

Contributor

@dranikpg dranikpg commented Feb 14, 2023

  1. Fix replica offset
  2. Remove old tx offset

util::fibers_ext::Fiber write_fb_{};
JournalWriter writer_{this};

std::atomic_uint64_t record_cnt_{0};
Contributor Author

This counter was not initialized, so it started out with an indeterminate value.

Comment on lines +90 to +92

print(" offset", syncid.decode(), r_offset, m_offset)
return r_offset == m_offset
Contributor Author

@dranikpg dranikpg Feb 14, 2023

I now print the offsets too, so we can see whether a replica is deadlocked, just slow, or out of sync altogether.
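As a rough illustration of how such printed offsets might be read, the sketch below classifies a replica from successive offset samples. The function name `classify_replica`, the labels, and the assumption that the master offset stays fixed while sampling are all hypothetical and not part of this PR's test code.

```python
def classify_replica(samples, master_offset):
    """Classify a replica from successive offset samples (hypothetical helper).

    Assumes the master is idle, so master_offset does not change between samples.
    """
    last = samples[-1]
    if last == master_offset:
        return "in sync"
    if last > master_offset:
        # The replica claims to be ahead of the master: offsets disagree.
        return "out of sync"
    if len(samples) >= 2 and samples[-1] == samples[-2]:
        # No progress between two consecutive samples: possibly deadlocked.
        return "stalled"
    # Behind the master but still moving (or only one sample so far).
    return "slow"
```

With a single sample below the master offset the helper cannot tell a stalled replica from a slow one, so it reports "slow" until a second sample is available.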

Comment on lines +102 to +106
tasks = (asyncio.create_task(check_replica_finished_exec(c, c_master)) for c in waiting_for)
finished_list = await asyncio.gather(*tasks)

# Remove clients that finished from waiting list
waiting_for = [c for (c, finished) in zip(waiting_for, finished_list) if not finished]
Contributor Author

To avoid printing too much, each iteration now checks only the replicas that have not finished yet, rather than all of them.
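The shrinking waiting list can be sketched in a self-contained form as follows. This is only an illustration under stated assumptions: `check_finished`, the dictionary-based fake replicas, and `max_rounds` are hypothetical stand-ins, while the real test calls `check_replica_finished_exec` against live clients.

```python
import asyncio


async def check_finished(replica, master_offset):
    # Hypothetical stand-in for check_replica_finished_exec: each call lets the
    # fake replica advance by its step, then compares offsets.
    replica["offset"] = min(replica["offset"] + replica["step"], master_offset)
    replica["checks"] += 1
    print(" offset", replica["name"], replica["offset"], master_offset)
    return replica["offset"] == master_offset


async def wait_for_replicas(replicas, master_offset, max_rounds=10):
    waiting_for = list(replicas)
    rounds = 0
    while waiting_for and rounds < max_rounds:
        rounds += 1
        # Only the replicas still on the waiting list are checked this round.
        tasks = [asyncio.create_task(check_finished(r, master_offset)) for r in waiting_for]
        finished_list = await asyncio.gather(*tasks)
        # Remove replicas that finished from the waiting list.
        waiting_for = [r for r, finished in zip(waiting_for, finished_list) if not finished]
    return rounds, [r["name"] for r in waiting_for]


replicas = [
    {"name": "fast", "offset": 0, "step": 10, "checks": 0},
    {"name": "slow", "offset": 0, "step": 4, "checks": 0},
]
rounds, unfinished = asyncio.run(wait_for_replicas(replicas, master_offset=10))
```

Because `asyncio.gather` returns results in the order the tasks were passed in, zipping `waiting_for` with `finished_list` pairs each replica with its own result. In this run the fast replica is checked once and the slow one three times, so the log stays short.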

@@ -382,7 +382,6 @@ void EngineShard::PollExecution(const char* context, Transaction* trans) {
// after trans in the queue, hence it's safe to run trans out of order.
if (trans && should_run) {
DCHECK(trans != head);
DCHECK(!trans->IsMulti()); // multi, global transactions can not be OOO.
Collaborator

Why did you remove this check now?

Collaborator

  1. It fails on the Python tests, which is good, because
  2. we added an option to run multi transactions as OOO if, at the moment of their scheduling, they are the exclusive owners of the lock.

Contributor Author

Because we now support OOO for multi transactions. This branch was never hit with a lot of shards because of low throughput, so I didn't notice the failing check.

Collaborator

@romange romange commented Feb 14, 2023

LGTM for the C++ fixes.

@dranikpg dranikpg marked this pull request as ready for review February 14, 2023 10:41
@dranikpg dranikpg merged commit 25db011 into dragonflydb:main Feb 14, 2023
@dranikpg dranikpg deleted the fix-bugs branch February 27, 2023 16:39
3 participants