
feat(replica): Support FlushDB command for replication #580 #591

Merged
merged 4 commits into from
Dec 25, 2022

Conversation

adiholden (Collaborator)

No description provided.

Signed-off-by: adi_holden <[email protected]>
Signed-off-by: adi_holden <[email protected]>
}

// Multi shard command flow:
// step 1: Fiber wait untill all the fibers that should execute this tranaction got
Collaborator:

untill -> until here and below

multi_shard_exe_->map_mu.lock();
auto [it, was_insert] = multi_shard_exe_->tx_sync_execution.emplace(entry.txid, entry.shard_cnt);

// Note: we must release the mutex befor calling wait on berrier
Collaborator:

berrier -> barrier

<< " was_insert: " << was_insert;

// step 1
it->second.berrier.wait();
Collaborator:

also here

// Note: erase from map can be done only after all fibers returned from wait.
// The last fiber which will decrease the counter to 0 will be the one to erase the data from map
auto val = --it->second.counter;
VLOG(2) << "txid: " << entry.txid << " unique_shard_cnt_: " << entry.shard_cnt
Collaborator:

  1. Since you moved the entry above, reading its fields afterwards can technically show wrong values.
  2. You do not use was_insert to check that only a single fiber called FLUSHDB. Why?
  3. Please use fetch_sub when decrementing atomics; you will then need to compare val with 1.

Contributor:

  1. We can relax the interface on Execution to accept a mut-ref. We need it mutable because of the mutable CmdArgList in Dispatch (which can be changed with ToUpper etc. inside the command).
  2. Because we can't currently tell what kind of command it is: a sharded part or a global command. We could, theoretically, check the number of arguments it has or pass additional info. For now, the other executions of FLUSHDB will be no-ops (but redundant transactions).

Collaborator (Author):

Regarding number 2: I will, but not in this PR. I added two TODO notes.
I will work on another PR to support executing the commands by a single fiber, which will require a different flow for global and non-global commands.
I will work on another PR to support the cancellation flow, once the PR I created for helio with the barrier is merged.

Comment on lines 96 to 97

std::unordered_map<TxId, TxExecutionSync> tx_sync_execution;
Contributor:

Let's not interleave data members and struct declarations: put all fields at the bottom and all declarations at the top. It's easier to read.

@@ -80,12 +84,28 @@ class Replica {
void DefaultErrorHandler(const GenericError& err);

private: /* Main dlfly flow mode functions */
struct MultiShardExecution {
Contributor:

Let's move this struct declaration to the top, into the other private section.

Comment on lines 107 to 108
std::shared_ptr<MultiShardExecution> multi_shard_exe_;

Contributor:

And this goes to the bottom.

Comment on lines 144 to 146

void ExecuteEntry(JournalExecutor* executor, journal::ParsedEntry&& entry);

Contributor:

This belongs to the flow section (below StableSyncDflyFb)

Having a correctly structured header makes the code a lot easier to understand

Collaborator (Author):

I made all the other header changes apart from this one: the flow section is public, and this function is private.

Comment on lines 727 to 729
// By step 1 we enforce that replica will execute multi shard commands that finished on master
// Step 3 ensures the correctness of flushall/flushdb commands
// TODO: this implementation does not support atomicity in replica
Contributor:

Let's divide those parts and indent the steps; it's easier to read.

Signed-off-by: adi_holden <[email protected]>
@adiholden adiholden merged commit 5d39521 into main Dec 25, 2022
@romange romange deleted the support_flush_db_for_replication branch December 25, 2022 12:08