feat(replication): Use a ring buffer with messages to serve replication. #1835
Conversation
Please see my comments; they are mostly questions due to my lack of context.
src/server/journal/journal_slice.cc (outdated)
#include "base/logging.h" | ||
#include "server/journal/serializer.h" | ||
|
||
ABSL_FLAG(int, shard_changes_log_size, 1 << 16, "The size of the circular changes log per shard"); |
Maybe add the word "replication" to the flag name, or at least to the description?
Happy to bike-shed here :)
For reference, the equivalent Redis configuration is called repl-backlog-size, and it's measured in bytes (not entries).
I don't think of it as bike-shedding; it's actually something users might look for, so having that word show up when they Ctrl-F through the output of --helpfull (or whatever) can be useful.
Changed to shard_repl_backlog_len
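For reference, a sketch of the renamed flag, keeping the original 1 << 16 default; the exact description string here is an assumption, not the committed wording:

#include "absl/flags/flag.h"

ABSL_FLAG(int, shard_repl_backlog_len, 1 << 16,
          "The length of the circular replication backlog per shard, in entries");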
Just a few cleanup comments.
A thought I had: instead of using RingBuffer<std::string> we could use two rotating byte buffers (like the streamer does) to control the offset not only in entries but also in bytes (for example, allowing a max lag of 5mb). Also, we wouldn't need separate allocations for all entries (which are likely fairly cheap either way). The downside is that memory usage is twice as high and you drop half of the entries when filling up (or you keep track with an external vector, which makes it difficult).
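A rough standalone sketch of that two-buffer idea, assuming a byte-bounded backlog; the class and member names are illustrative and do not come from this PR:

#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>

class RotatingBacklog {
 public:
  explicit RotatingBacklog(size_t max_bytes) : max_bytes_(max_bytes) {}

  // Appends one serialized journal entry. When the active buffer would exceed
  // max_bytes_, the buffers rotate: the older buffer is discarded wholesale,
  // so up to half of the retained backlog disappears at once (the downside
  // mentioned above), but memory stays bounded in bytes rather than entries.
  void Append(std::string_view serialized, uint64_t lsn) {
    if (active_.size() + serialized.size() > max_bytes_) {
      previous_ = std::move(active_);
      active_.clear();
      active_start_lsn_ = lsn;
    }
    active_.append(serialized);
  }

 private:
  size_t max_bytes_;
  std::string active_, previous_;     // two rotating byte buffers
  uint64_t active_start_lsn_ = 0;     // first LSN stored in the active buffer
};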
Why did we punt on the design review? I am curious to know if we still maintain backpressure when replicating via ring buffer.
  item->opcode = entry.opcode;
  item->data = "";
} else {
  item = ring_buffer_->GetTail(true);
- Please add a comment saying that it provides a pointer to the next tail, overriding the head if the buffer is full.
- I would like to understand how you envision the product behavior in this case. Are you saying there is no backpressure whatsoever, and if this item has not been consumed by a replica, then c'est la vie?
- How will the replica know that its stable sync channel was overridden and that it is missing some entries?
As I wrote below: currently there is no change. Backpressure is applied through the callbacks, so we can't override an entry that hasn't been sent.
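For illustration, a minimal generic ring buffer with the tail-override semantics being discussed; GetTail(bool) below is a sketch of the behavior, not Dragonfly's actual implementation:

#include <cstddef>
#include <vector>

template <typename T> class RingBuffer {
 public:
  explicit RingBuffer(size_t capacity) : buf_(capacity) {}

  // Returns a pointer to the next tail slot. If the buffer is full and
  // override_if_full is true, the oldest entry (the head) is dropped to make
  // room; otherwise nullptr is returned so the caller can apply backpressure.
  // Slot objects are reused in place, which is what lets std::string slots
  // keep their capacity across writes (relevant to the allocation thread below).
  T* GetTail(bool override_if_full) {
    if (size_ == buf_.size()) {
      if (!override_if_full)
        return nullptr;
      head_ = (head_ + 1) % buf_.size();  // drop the oldest entry
      --size_;
    }
    T* slot = &buf_[(head_ + size_) % buf_.size()];
    ++size_;
    return slot;
  }

 private:
  std::vector<T> buf_;
  size_t head_ = 0, size_ = 0;
};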
item.txid = entry.txid;
VLOG(1) << "Writing item [" << item.lsn << "]: " << entry.ToString();
ring_buffer_->EmplaceOrOverride(move(item));
io::BufSink buf_sink{&ring_serialize_buf_};
I think using StringSink is better here. You serialize into a string and then just move it into item->data, saving a copy at line 155.
Keeping a big backing buffer will prevent grow allocations
I can also just serialize directly into item->data without moving, no?
nvm, StringSink owns its string
So, if I always move out of a StringSink, I always have to allocate a new string, whereas currently I'm reusing the strings from the ring buffer. I think fewer allocations are better than fewer copies in this case.
Why? item is a pointer to an object in the ring buffer, and I'm calling std::string::operator=(std::string_view).
It's currently allocating a single string per entry, but exactly once. A moved-from string sink can cause multiple allocations because the string might need to grow over multiple write calls.
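To make the trade-off concrete, a minimal standalone sketch; the function names are invented for illustration and are not the PR's code:

#include <string>
#include <string_view>
#include <utility>

std::string slot;  // stands in for item->data, a reused ring-buffer slot

// Move-based storage: each call hands the slot a freshly built string, so the
// slot's previously grown capacity is discarded and the next serialization
// has to allocate again.
void StoreByMove(std::string serialized) {
  slot = std::move(serialized);
}

// Copy-based storage: assigning a string_view copies into the slot's existing
// buffer, so once the ring buffer has warmed up, steady-state writes allocate
// nothing (as long as an entry doesn't outgrow the slot's capacity).
void StoreByCopy(std::string_view serialized) {
  slot = serialized;  // std::string::operator=(std::string_view)
}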
I don't understand where any allocation happens here (except for the first 16k entries)
I did not remember we reuse items in the ring buffer. It's worth adding a comment because it can bite us later by hiding allocated memory.
So, I thought about this again. I agree that memory usage can be an issue. I like @dranikpg's suggestion about two buffers (one for entries, one for bytes), but it's a bit trickier to handle.
So I suggest for now lowering the default size (which is currently 64k, I didn't remember correctly) to 1k; later we can tune this better and add visibility into the memory consumption.
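As a sketch, that proposal would just lower the flag's default, mirroring the earlier flag sketch (description string assumed):

#include "absl/flags/flag.h"

ABSL_FLAG(int, shard_repl_backlog_len, 1 << 10,
          "The length of the circular replication backlog per shard, in entries");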
I left it as an open question. We still have backpressure and I don't plan to change that right now, but it would be pretty easy to do so.
I agree. I think it's a pretty simple change but I'd rather do it in another PR if we decide we want to.
LGTM. I don't have any further comments and agree that we can try out different ring buffer variations later
These are infrastructure changes needed for partial sync; no actual replication protocol changes are implemented here.
The journal ring buffer is enabled, some supporting functionality is introduced, and the existing replication infrastructure is migrated to use the ring buffer instead of serializing in place.
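A heavily simplified sketch of what that migration implies, based only on this description; every name below is an assumption rather than the PR's actual interface:

#include <cstdint>
#include <functional>
#include <string>
#include <string_view>

struct JournalItem {
  uint64_t lsn;      // log sequence number of the entry
  std::string data;  // entry serialized once, at write time, into the ring buffer
};

// Serving a replica becomes forwarding pre-serialized bytes from the ring
// buffer, rather than serializing the entry separately for each connection.
void SendToReplica(const JournalItem& item,
                   const std::function<void(std::string_view)>& write_to_socket) {
  write_to_socket(item.data);
}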