rewrite buffer implementation to eliminate evbuffer dependency #5441

brian-pane · 2018-12-31T17:55:00Z

Description:
Replace the use of libevent's evbuffer in Buffer::OwnedImpl with a new buffer implementation.

Risk Level: high
Testing: I ran the unit tests
Docs Changes: N/A
Release Notes: N/A
Related Issues: #4952

Signed-off-by: Brian Pane [email protected]

Signed-off-by: Brian Pane <[email protected]>

mattklein123 · 2019-01-03T02:55:31Z

Thanks @brian-pane. @envoyproxy/maintainers any takers on reviewing this? @jmarantz this seems like something you might enjoy. 😉

jmarantz · 2019-01-03T03:36:44Z

I need to dig into this but my first thought is that I'd like to see a perf test with this PR. My pattern lately has been to write the perf-test in a separate branch from master, and see how the current implementation performs. Then you can patch the perf test into this branch and measure any improvement.

I'm actually not sure if your motivation is to improve performance or simply to cut a troublesome dependency (and why is it troublesome?) but even if it's the latter, we should quantify the performance impact. In any case you should put the motivation for this PR in the description.

One such perf test is https://github.com/envoyproxy/envoy/blob/master/test/common/stats/thread_local_store_speed_test.cc

jmarantz · 2019-01-03T03:22:05Z

source/common/buffer/buffer_impl.cc

-              "RawSlice != evbuffer_iovec");
-static_assert(offsetof(RawSlice, len_) == offsetof(evbuffer_iovec, iov_len),
-              "RawSlice != evbuffer_iovec");
+static uint64_t SliceSize(uint64_t data_size) {


If you make this a static method of OwnedBufferSlice then you can unit-test it. Also it should be sliceSize() whether a static method or static function.

jmarantz · 2019-01-03T03:26:34Z

source/common/buffer/buffer_impl.cc

-      buffer_.get(), fragment.data(), fragment.size(),
-      [](const void*, size_t, void* arg) { static_cast<BufferFragment*>(arg)->done(); }, &fragment);
+  OwnedBufferSlice(const void* data, uint64_t size) : OwnedBufferSlice(size) {
+    memcpy(&(base_[0]), data, size);


base_.get() ?

source/common/buffer/buffer_impl.cc

brian-pane · 2019-01-03T04:24:52Z

Thanks for looking at this, @jmarantz! I'll put together a performance test tomorrow. My motivation for the PR is to remove one of the obstacles to using a different event multiplexer library in the future (e.g., #4952), but I do want to match or improve upon the current implementation's speed and memory footprint.

Signed-off-by: Brian Pane <[email protected]>

lizan · 2019-01-03T07:04:34Z

source/common/buffer/buffer_impl.h

+ */
+class BufferSlice {
+public:
+  using Reservation = std::pair<void*, uint64_t>;


Instead of std::pair, use absl::string_view? It is more readable than .first/.second and comes with some utility function.

I thought of that too...only difference is that Reservation.first is non-const, so you'd need to const-cast whenever you need to write.

In general I wish there was less const-casting in this impl as that always requires extra scrutiny during reviews and debugging. However maybe you can structure that with wrappers to isolate the non-const uses.

If you don't do that, I do think you should use uint8_t* rather than void* to reduce casting all over the place. You can always return a void* without casting if you have a uint8_t* but not vice versa.

Ah I think you're right string_view is const pointer. Then perhaps a simple struct with non-const pointer and size will work, I think in general we prefer named struct than std::pair.

+1 on struct vs pair. Thus I suggest

struct Reservation { uint8_t* data_; uint64_t size_; };

This may increase readability by reducing casts? See also RawSlice in include/envoy/buffer/buffer.h which is similar, but uses void* so will require casting for all uses.

Maybe RawSlice itself should be the data type that represents a reservation. I'm thinking through the pros and cons now...

jmarantz

Some more comments...I have not looked at all the code but not deeply ingested it.

Would love to see perf tests.

jmarantz · 2019-01-03T20:29:45Z

source/common/buffer/buffer_impl.h

+ */
+class BufferSlice {
+public:
+  using Reservation = std::pair<void*, uint64_t>;


I thought of that too...only difference is that Reservation.first is non-const, so you'd need to const-cast whenever you need to write.

In general I wish there was less const-casting in this impl as that always requires extra scrutiny during reviews and debugging. However maybe you can structure that with wrappers to isolate the non-const uses.

If you don't do that, I do think you should use uint8_t* rather than void* to reduce casting all over the place. You can always return a void* without casting if you have a uint8_t* but not vice versa.

jmarantz · 2019-01-04T01:48:03Z

source/common/buffer/buffer_impl.cc

@@ -1,6 +1,7 @@
 #include "common/buffer/buffer_impl.h"

 #include <cstdint>
+#include <iostream>


Where are you using this?

That's left over from debugging. I'll post an update to remove it.

jmarantz · 2019-01-04T01:50:34Z

source/common/buffer/buffer_impl.cc

+  /**
+   * Compute a slice size big enough to hold a specified amount of data.
+   * @param data_size the minimum amount of data the slice must be able to store, in bytes.
+   * @ return a recommended slice size, in bytes.


@return uint64_t a recommended slice size, in bytes

(Remove the intervening space, specify the return type. That's the envoy style for doxygen-like return comments although it seems redundant to me)

Thanks for the catch on the the extraneous space; I'll fix that in my next update. With regard to listing the type after the @return, that style is apparently "outdated," based on the discussion here: #2612 (comment)

excluding the type makes sense to me; I hadn't noticed that thread.

jmarantz · 2019-01-04T01:51:37Z

source/common/buffer/buffer_impl.cc

+  }
+
+  void* data() override {
+    return static_cast<uint8_t*>(const_cast<void*>(fragment_.data())) + data_;


consider defining the const data() in terms of the non-const data() so you don't have to repeat the logic. Or vice versa.

jmarantz · 2019-01-04T01:52:24Z

source/common/buffer/buffer_impl.cc

+  }
+
+  uint64_t reservableSize() const override { return 0; }
+


Group the one-liners together and drop the intervening extra newlines.

jmarantz · 2019-01-04T02:00:49Z

source/common/buffer/buffer_impl.cc

+  while (!other.slices_.empty()) {
+    if (length == 0) {
+      break;
+    }


while (length != 0 && !other.slices_.empty()) { ... more concise and possibly faster due to making easier check first.

jmarantz · 2019-01-04T02:03:04Z

source/common/buffer/buffer_impl.cc

+    if (copy_size == 0) {
+      other.slices_.pop_front();
+    } else if (copy_size < other.slices_.front()->dataSize()) {
+      // TODO add reference-counting to allow slices to share their storage


TODO(brian-pane): add reference-counter.... here and below.

source/common/buffer/buffer_impl.cc

jmarantz · 2019-01-04T02:10:30Z

source/common/buffer/buffer_impl.cc

-  evbuffer_ptr start_ptr;
-  if (-1 == evbuffer_ptr_set(buffer_.get(), &start_ptr, start, EVBUFFER_PTR_SET)) {
-    return -1;
+  // This implementation uses the same search algorithm as evbuffer_search(), a naive


Does this method get called a lot in Envoy? It does seem like this could be linear without that much difficulty. If it gets called on the data path it seems worth optimizing.

if it doesn't get called on the data path you could just replace all this with buffer->toString().find(... ... even that would be linear although doing an allocation to implement a search seems wrong :)

I haven't seen this method show up in CPU profiles from production systems, so as far as I can tell it doesn't get used much in Envoy.

There are three calls to the method in Envoy currently, one of which appears to search untrusted inputs - and that one has a fixed pattern of length 12.

It's probably worth noting that, in the common case, both the current evbuffer_search implementation and this new code already benefit from an existing optimization: they use memchr to find the first character of the pattern, and for a large input size n, a typical vectorized memchr implementation only needs to do n/v conditional branches, where v is 16 or 32 depending on the size of the available SIMD registers.

I was in error thinking this is easy to linearize, and I see the value in being able to do the vectorized memchr, so this looks like a good framework to start with until we see a perf problem.

But since the algorithm is very long with lots of corner-cases I think you need to add unit tests to cover a bunch of them. The current buffer_test.cc doesn't try to cover the underlying evbuffer_search impl which is assumed to be tested in the libevent repo, but now that you own it, you need to test it.

test/test_common/utility.cc

jmarantz · 2019-01-04T03:51:01Z

test/common/ssl/ssl_socket_test.cc

-        iovecs[0].len_ = 0;
-        iovecs[1].len_ = 0;
-        data.commit(iovecs, 2);
+        Buffer::RawSlice iovec1[1];


why does this test need to change?

This test depended on an implementation detail of evbuffer: that particular call to reserve resulted in the creation of two separate slices inside the buffer. The new OwnedImpl implementation satisfies the reservation with a single slice in this case. One of the TODO items in buffer_impl.cc will probably result in the new implementation using two slices in this case in the future, to match how evbuffer works today, but I still wanted to make the test less dependent on implementation details.

is that still true? or have you fixed this now?

The original code here will now work once again, so I'll revert to that.

Signed-off-by: Brian Pane <[email protected]>

jmarantz

I feel @alyssawilk should probably also have a look at this RE performance, given her related watermark work.

jmarantz · 2019-01-04T13:03:02Z

source/common/buffer/buffer_impl.h

+ */
+class BufferSlice {
+public:
+  using Reservation = std::pair<void*, uint64_t>;


+1 on struct vs pair. Thus I suggest

struct Reservation { uint8_t* data_; uint64_t size_; };

This may increase readability by reducing casts? See also RawSlice in include/envoy/buffer/buffer.h which is similar, but uses void* so will require casting for all uses.

jmarantz · 2019-01-04T13:07:44Z

source/common/buffer/buffer_impl.cc

-                                        reinterpret_cast<evbuffer_iovec*>(iovecs), num_iovecs);
-  ASSERT(ret >= 1);
-  return ret;
+  // TODO(brian-pane) support the use of space in more than one slice.


It looks like this only works for num_iovecs==1? That seems like a problem if it's used in Envoy. If it's not, why not just remove the argument here and at the call-sites?

It's more memory-efficient to support multiple iovecs, given the way Envoy does its reservations for socket reads. The only reason this PR doesn't support the multiple-iovec case yet is that the semantics are ill-defined. For example, there are cases where Envoy gets a reservation from an evbuffer with num_iovecs==2 and then calls commit on just the first iovec. It's unclear whether that is supposed to release the reservation on the second slice; I'll probably have to end up digging into the evbuffer implementation to find out the intended semantics. (I'm planning to do that in the very near term.)

jmarantz · 2019-01-04T13:12:06Z

source/common/buffer/buffer_impl.cc

-  evbuffer_ptr start_ptr;
-  if (-1 == evbuffer_ptr_set(buffer_.get(), &start_ptr, start, EVBUFFER_PTR_SET)) {
-    return -1;
+  // This implementation uses the same search algorithm as evbuffer_search(), a naive


I was in error thinking this is easy to linearize, and I see the value in being able to do the vectorized memchr, so this looks like a good framework to start with until we see a perf problem.

But since the algorithm is very long with lots of corner-cases I think you need to add unit tests to cover a bunch of them. The current buffer_test.cc doesn't try to cover the underlying evbuffer_search impl which is assumed to be tested in the libevent repo, but now that you own it, you need to test it.

test/test_common/utility.cc

jmarantz · 2019-01-04T13:20:14Z

source/common/buffer/buffer_impl.cc

 }

-void OwnedImpl::add(absl::string_view data) {
-  evbuffer_add(buffer_.get(), data.data(), data.size());
+void OwnedImpl::add(const void* data, uint64_t size) { append(data, size); }


there's a comment in the base-class saying this method is deprecated; it's worth checking if there are remaining call-sites.

I just checked, and there are some remaining call-sites within Envoy itself that depend on this method. include/envoy/buffer/buffer.h itself won't even compile with the deprecated method removed.

jmarantz · 2019-01-04T13:49:16Z

source/common/buffer/buffer_impl.h

+   * Commit a Reservation that was previously obtained from a call to reserve().
+   * The Reservation's size is added to the Data section.
+   * @param reservation a reservation obtained from a previous call to reserve().
+   *        If the reservation is not from this BufferSlice, commit() will return false.


Under what circumstances would a user of this class commit a reservation from outside the BufferSlice? Just explain in the comment, especially in light of the comment above "the result of calling commit() out of order is undefined"

source/common/buffer/buffer_impl.h

Signed-off-by: Brian Pane <[email protected]>

brian-pane · 2019-01-05T18:05:34Z

I just added a couple of significant changes:

Reservations from BufferSlice now use RawSlice rather than std::pair.
In RawSlice, the pointer to the start of the data is now immutable (void* const), with some comments in buffer.h to explain why. The old implementation also depended on the caller not modifying mem_ between the reserve and commit calls, so this change helps enforce pre-existing semantics.

Over the next couple of days, I plan to:

Address the rest of the outstanding review comments.
Add support for reservations that span more than one iovec (currently listed as a TODO comment in the PR, and important because otherwise this new implementation will use more memory than evbuffer).

jmarantz · 2019-01-05T20:31:24Z

If you have something functional, but whose performance/memory usage has yet to be proven to be the same or better as libevent's impl, you might want to check it in as an alternate implementation that's tested but not in use. That way you can iterate on it, with more frequent and smaller PRs.

One approach toward this is to parameterize your test, as shown in PR #5414

envoy/test/common/stats/symbol_table_impl_test.cc

Line 31 in 2209115

class StatNameTest : public testing::TestWithParam<SymbolTableType> {

Note that checking in a not-quite-good-enough impl will not go down well with folks that are deploying Envoy from HEAD every week, as opposed to waiting for a numbered release.

Signed-off-by: Brian Pane <[email protected]>

brian-pane · 2019-01-06T19:26:36Z

I have the multi-iovec reservations working now (not yet posted to this PR). Based on the benchmark results, there probably are two more changes needed to achieve perf parity with evbuffer: reference-counting to eliminate the copy in the partial-move case, and the elimination of one of the mallocs during the slice creation. And I might have to give up on std::deque. Those will be substantial design changes, so I'll iterate on them offline and post an update here once the perf numbers are on par with libevent.

------------------------------------------------------------------------
Benchmark                             Original           New
------------------------------------------------------------------------
BufferCreateEmpty                        80 ns          7 ns
BufferCreate/1                          163 ns        382 ns
BufferCreate/4096                       246 ns        543 ns
BufferCreate/16384                      577 ns       1076 ns
BufferCreate/65536                     3782 ns       4488 ns
BufferAddString/1                        18 ns         29 ns
BufferAddString/4096                    140 ns        322 ns
BufferAddString/16384                   535 ns        910 ns
BufferAddString/65536                  3418 ns       4289 ns
BufferAddBuffer/1                        40 ns         50 ns
BufferAddBuffer/4096                    164 ns        340 ns
BufferAddBuffer/16384                   570 ns        948 ns
BufferAddBuffer/65536                  3467 ns       4324 ns
BufferDrain/1                            40 ns         50 ns
BufferDrain/4096                        183 ns        378 ns
BufferDrain/16384                       575 ns        987 ns
BufferDrain/65536                      3702 ns       4626 ns
BufferMove/1                             48 ns         21 ns
BufferMove/4096                          48 ns         21 ns
BufferMove/16384                         48 ns         21 ns
BufferMove/65536                         48 ns         21 ns
BufferMovePartial/1                      61 ns         33 ns
BufferMovePartial/4096               118961 ns   11968450 ns
BufferMovePartial/16384              473925 ns   95152762 ns
BufferMovePartial/65536             1895537 ns  302529200 ns
BufferReserveCommit/1                    97 ns        215 ns
BufferReserveCommit/4096                 98 ns        250 ns
BufferReserveCommit/16384                99 ns        457 ns
BufferReserveCommit/65536               986 ns       1772 ns
BufferReserveCommitPartial/1             95 ns        225 ns
BufferReserveCommitPartial/4096          97 ns        248 ns
BufferReserveCommitPartial/16384         98 ns        459 ns
BufferReserveCommitPartial/65536        993 ns       1750 ns
BufferSearch/1                           39 ns         28 ns
BufferSearch/4096                       124 ns        124 ns
BufferSearch/16384                      385 ns        382 ns
BufferSearch/65536                     1761 ns       1764 ns
BufferSearchPartialMatch/1              467 ns        282 ns
BufferSearchPartialMatch/4096         41133 ns      24597 ns
BufferSearchPartialMatch/16384       162766 ns      97562 ns
BufferSearchPartialMatch/65536       651514 ns     387966 ns

jmarantz · 2019-01-06T23:33:49Z

OK...you sure you don't want to try to check this in as an alternate impl, so you can iterate in small PRs? I think it'd be easier to review, rather than waiting till performance perfection?

What's your specific issue with deque? As it happens, a while back on a different project, I had a measured performance issue with having lots of small deques and wound up re-implementing a subset of the std::deque API using a vector and getting a lot of performance benefit for my use-case: https://github.com/apache/incubator-pagespeed-mod/blob/master/pagespeed/kernel/base/vector_deque.h , with a speed-test at https://github.com/apache/incubator-pagespeed-mod/blob/master/pagespeed/kernel/util/deque_speed_test.cc

Signed-off-by: Brian Pane <[email protected]>

jmarantz · 2019-03-04T03:43:04Z

source/common/buffer/buffer_impl.cc

 }

+void OwnedImpl::useOldImpl(bool use_old_impl) { use_old_impl_ = use_old_impl; }


To ensure that this only gets called during the quiescent state between tests, there's a simple way now.

Have a static bool committed_ alongside use_old_impl_. In this method do:

void OwnedImpl::useOldImpl(bool use_old_impl) { ASSERT(!committed_); use_old_impl_ = use_old_impl; }

then you can write:

void OwnedImpl::clearCommitted() { committed_ = false; }

which you can call from TestListener::OnTestEnd():

envoy/test/test_listener.cc

Line 15 in 741df7a

}

That will allow switching implementations between test methods.

Actually I'm surprised this didn't bite you before, aren't you using TEST_P to test with both mechanisms in the same test binary?

Up until cfd9d4a, the buffer unit tests used a separate, test-only method that enabled the implementation to be changed.

Starting with that change, it's valid to switch back and forth between the implementations in the same process. The one remaining restriction is that two buffers can only interact (e.g., with buffer1.move(buffer2)) if they use the same implementation.

That enables the coverage test to progress further, but eventually it still hangs with an exception somewhere. It will take some more digging to find out where that's happening, as the hang happens in some test code that has caught the exception and is trying to log it.

Right; what I'm trying to suggest is that it still be invalid to switch between implementations within a process, allowing only for switching after a test method.

That should cover the case of the coverage tests, and also fuzz tests, both of which will work with the test listener as described.

I think we probably want to RELEASE_ASSERT, actually, that we only call useOldImpl once per process/test-method.

My thought was that like catching things at this level would be a stronger guarantee of the invariant that only one impl be alive at a time.

Of course you still have to finish your debugging :) There's a chance this suggestion may catch it, but it's probably something else.

jmarantz · 2019-03-04T04:13:42Z

test/common/buffer/utility.h

+class BufferImplementationParamTest : public testing::TestWithParam<BufferImplementation> {
+protected:
+  BufferImplementationParamTest() {
+    OwnedImpl::useOldImpl(GetParam() == BufferImplementation::Old);


independent of coverage testing, how did this even work with just this one file? Wouldn't this have hit your assert?

Does this suggest you were not successfully testing both impls with this structure? It seems you should've been hitting your assert within that one test binary.

Prior to the changes I added yesterday, the test code called a different method, useOldImplForTest, which didn't do the assert.

The test code has a call to verifyImplementation(buffer); in every test case to ensure that it's using the expected implementation.

Now that the code allows switching buffer implementations in non-test code, useOldImplForTest is no more, and the unit tests just use the same useOldImpl that the production code uses.

OK, this is fine. My suggestion was based around a theory that we should have safeguards against switching modes except at the start of a binary and between test-methods, but it's really a belt and suspenders argument and as long as you don't actually call it after startup in production it should be fine.

Basic trace fuzzer for the Buffer interface operations. Verifies some basic properties and compares with a shadow buffer based on std::string. Intended to support validation of envoyproxy#5441. Risk level: Low Testing: Simple manual corpus that provides 95.1% coverage of buffer_impl.cc. Will expand in future PRs. Signed-off-by: Harvey Tuch <[email protected]> Signed-off-by: Fred Douglas <[email protected]>

jmarantz

ready for senior review.

jmarantz · 2019-03-06T03:16:12Z

test/common/buffer/utility.h

+class BufferImplementationParamTest : public testing::TestWithParam<BufferImplementation> {
+protected:
+  BufferImplementationParamTest() {
+    OwnedImpl::useOldImpl(GetParam() == BufferImplementation::Old);


OK, this is fine. My suggestion was based around a theory that we should have safeguards against switching modes except at the start of a binary and between test-methods, but it's really a belt and suspenders argument and as long as you don't actually call it after startup in production it should be fine.

alyssawilk

Apologies but I'm not going to be able to make it through this whole PR today. Two quick thoughts and I'll start fording through the rest tomorrow.

alyssawilk · 2019-03-06T15:52:07Z

docs/root/intro/version_history.rst

@@ -31,6 +31,7 @@ Version history
 * mysql: added a MySQL proxy filter that is capable of parsing SQL queries over MySQL wire protocol. Refer to ::ref:`MySQL proxy<config_network_filters_mysql_proxy>` for more details.
 * http: added :ref:`max request headers size <envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.max_request_headers_kb>`. The default behaviour is unchanged.
 * http: added modifyDecodingBuffer/modifyEncodingBuffer to allow modifying the buffered request/response data.
+* performance: new buffer implementation (disabled by default; to test it, add "--use-libevent-buffers 0" to the command-line arguments when starting Envoy).


This may have been addressed earlier in review but I'm going to be lazy here and ask you to catch me up. :-)

Do we have a plan for roll out? Are we submitting this, flipping it and seeing if it sticks? I wish we had a better plan for testing high risk changes than asking Lyft to smoke test, but that has often been what we do with and AFIK they prefer runtime over command line flags (which was part of the impetus behind #6134)

@rgs1 would you be interested in testing this with production traffic in your environment? It requires a command-line flag to turn it on.

I think the plan here should be:

Let's ship this PR

Let's set up fuzzing via @htuch. I agree with @htuch that if fuzzing doesn't uncover any issues I will feel pretty confident.

Once any fuzz issues are fixed, let's see if we can get a couple of production users to smoke test (envoy-dev/envoy-users email maybe).

Then remove the CLI flag, flip to default true, and config guard with the new stuff @alyssawilk has in runtime: codifying runtime guarded features #6134

WDYT?

Having finally at least skimmed through this whole PR, I'm not sure using a reloadable flag is appropriate. Many of the functions only work when comparing old buffer to old buffer and new to new, so I think we'd either have to do some plumbing to make sure the state was consistent throughout all buffers on a given connection (which might not even work then given muxed client connections can share a different muxed upstream connection) or we'd have to implement restart flags as well.

Rather one can latch the value of the runtime once where it's currently set, but then it's not actually doing new-style buffers until after a restart. internally we make clear the differences between reloadable and restart flags so you know the difference between when you push things and when they take effect. Alternately we could use the new config guard code it'd just be fairly misleading and we'd want to communicate it doesn't work as advertised in this case.

This PR used to latch the value just once per process, except for a test override method. But the problem with that was that it broke the coverage test, which expected to be able to concatenate together a lot of tests that originally were designed to be separate programs with their own control over the choice of buffer implementation.

source/common/http/http2/metadata_decoder.cc

alyssawilk · 2019-03-06T20:42:14Z

test/common/buffer/buffer_fuzz_test.cc

@@ -274,7 +274,7 @@ uint32_t bufferAction(Context& ctxt, char insert_value, uint32_t max_alloc, Buff
    Buffer::RawSlice slices[1];
    const uint64_t slices_used = target_buffer.getRawSlices(slices, 1);
    if (linearize_size > 0) {
-      FUZZ_ASSERT(slices_used == 1);
+      FUZZ_ASSERT(slices_used >= 1);


Just to sanity check, have we verfied that the in-envoy code sites either linearize(full buffer) or lineraize(n)/write(n) i.e. this change will end up being a no-op?

mattklein123

I skimmed through the general configuration, etc. and that part looks sane. I'm not going to have time to do an in-depth review for a while, but given the other comment I just made about rollout plan, I think we should ship the PR and then iterate. @envoyproxy/maintainers any objections? If not, then the only thing I would like is to move the metadata stuff out into another PR.

@brian-pane thank you for your awesome work here!!!

/wait

mattklein123 · 2019-03-06T23:00:09Z

@brian-pane also needs a master merge.

Signed-off-by: Brian Pane <[email protected]>

source/common/buffer/buffer_impl.cc

alyssawilk

Wow, that's a lot of code - thanks for tackling this huge feature. I'm with Matt on this one where I trust the fuzzers more than I trust the reviewers, but here's my pass in any case :-)

alyssawilk · 2019-03-07T14:44:36Z

source/common/buffer/buffer_impl.cc

+    // Next, scan forward and attempt to match the slices against iovecs.
+    uint64_t num_slices_committed = 0;
+    while (num_slices_committed < num_iovecs) {
+      if (slices_[slice_index]->commit(iovecs[num_slices_committed])) {


I belive with ev_buffer if commit failed, or if not all num_iovecs were comitted it would be considered an error and fail ASSERT.
If I'm reading that right we keep that ASSERT so we're equally likely to catch bugs in test, and sanity check other ev_buffer error handling code?

If there are license concerns with Brian looking at ev_buffer code I can go through the rest.

Right, the new implementation keeps the ASSERT because the failure to commit all of the supplied iovecs is probably a sign of a major bug. The two failure scenarios I can think of are:

The caller did the right thing and tried to properly commit iovecs that it got from buffer.reserve(), but the commit didn't work due to some internal bug in the buffer code.

Or the caller did the wrong thing and tried to buffer.commit() something that didn't come from that buffer.

FWIW, having as many ASSERTs as possible for consistency and debug checks has some history of paying off with the buffers + fuzzers. Please keep adding!

alyssawilk · 2019-03-07T14:58:23Z

source/common/buffer/buffer_impl.cc

+      if (i < out_size) {
+        out[i].mem_ = slice->data();
+        out[i].len_ = slice->dataSize();
+      }


Worth an else with an early return here? If we have a buffer with many slices and we ask for the first, we don't want to loop through all the slices

We have to loop through the all the slices because of the pre-existing semantics defined for the method in include/envoy/buffer/buffer.h

* @return the actual number of slices needed, which may be greater than out_size. Passing * nullptr for out and 0 for out_size will just return the size of the array needed * to capture all of the slice data.

I'll add a comment in the impl to explain why it does the counterintuitive thing.

alyssawilk · 2019-03-07T15:04:49Z

source/common/buffer/buffer_impl.cc

+  } else {
+    const char* src = static_cast<const char*>(data);
+    bool new_slice_needed = slices_.empty();
+    while (size != 0) {


Before evbuffer does any appending it sanity checks for total_len overflow. Should we do the same or do we think that's an artifact of their size_t vs our consistent use of uint64_t?

I don't think there's any case where there's going to be 2**64 bytes added but I could imagine cases where there was overflow in a size calculation that got plumbed through to add and it might be a useful place to fast-fail

There are several places in the buffer implementation where the length could overflow (e.g., adding buffer A to buffer B if they both contain 2**63 bytes), so I just pushed an update that wraps all the buffer length arithmetic operations in an overflow/underflow detector that clang and g++ provide.

Based on the CircleCI errors, the builtin overflow/underflow detector seems to have some portability problems. I'll have to write one from scratch.

alyssawilk · 2019-03-07T15:13:25Z

source/common/buffer/buffer_impl.cc

+      if (linearized_size >= size) {
+        break;
+      }
+    }


ev-buffer fast-fails and returns null if size > total_len, which would then result in the RELEASE_ASSERT above. Should we retain that?

Okay, I'll add a RELEASE_ASSERT that covers both the old and new impls.

alyssawilk · 2019-03-07T15:18:27Z

source/common/buffer/buffer_impl.cc

+    ASSERT(static_cast<uint64_t>(rc) == length);
+    static_cast<LibEventInstance&>(rhs).postProcess();
+  } else {
+    // See move() above for why we do the static cast.


static -> dynamic

Now that you mention it, the dynamic cast is redundant now, because the ASSERT(isSameBufferImpl(rhs)) at the start of the method already encapsulates the same check (and does additional, more thorough checks). I'll remove the dynamic_cast here.

alyssawilk · 2019-03-07T15:25:51Z

source/common/buffer/buffer_impl.cc

+    if (result.rc_ < 0) {
+      return {static_cast<int>(result.rc_), result.errno_};
+    }
+    uint64_t num_slices_to_commit = 0;


It looks bulk of this function is duplicated in both the if and else branch.
What do we think of moving the duplicated code before we check the buffer type?

Fixed in the latest push

alyssawilk · 2019-03-07T16:51:30Z

source/common/buffer/buffer_impl.cc

-      iov[num_slices_to_write].iov_base = slices[i].mem_;
-      iov[num_slices_to_write].iov_len = slices[i].len_;
-      num_slices_to_write++;
+  if (old_impl_) {


Ditto here - old impl and new impl look identical. Can we avoid the copy paste?

alyssawilk · 2019-03-07T17:00:36Z

docs/root/intro/version_history.rst

@@ -31,6 +31,7 @@ Version history
 * mysql: added a MySQL proxy filter that is capable of parsing SQL queries over MySQL wire protocol. Refer to ::ref:`MySQL proxy<config_network_filters_mysql_proxy>` for more details.
 * http: added :ref:`max request headers size <envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.max_request_headers_kb>`. The default behaviour is unchanged.
 * http: added modifyDecodingBuffer/modifyEncodingBuffer to allow modifying the buffered request/response data.
+* performance: new buffer implementation (disabled by default; to test it, add "--use-libevent-buffers 0" to the command-line arguments when starting Envoy).


Having finally at least skimmed through this whole PR, I'm not sure using a reloadable flag is appropriate. Many of the functions only work when comparing old buffer to old buffer and new to new, so I think we'd either have to do some plumbing to make sure the state was consistent throughout all buffers on a given connection (which might not even work then given muxed client connections can share a different muxed upstream connection) or we'd have to implement restart flags as well.

Rather one can latch the value of the runtime once where it's currently set, but then it's not actually doing new-style buffers until after a restart. internally we make clear the differences between reloadable and restart flags so you know the difference between when you push things and when they take effect. Alternately we could use the new config guard code it'd just be fairly misleading and we'd want to communicate it doesn't work as advertised in this case.

* Deduplicated the code paths where the old and new impl were the same * Added some clarifying comments * Removed some now-extraneous dynamic_casts Signed-off-by: Brian Pane <[email protected]>

Signed-off-by: Brian Pane <[email protected]>

…underflow Signed-off-by: Brian Pane <[email protected]>

Signed-off-by: Brian Pane <[email protected]>

brian-pane · 2019-03-10T18:28:59Z

I've addressed all the review comments, so this is ready for re-review.

mattklein123 · 2019-03-12T22:25:48Z

@jmarantz @alyssawilk @envoyproxy/maintainers any additional thoughts on this one? I think we should ship and iterate.

alyssawilk · 2019-03-13T13:01:57Z

+1.

I'm for checking this in, fuzzing the heck out of it, (ideally but optionally) getting a post-fuzzing review pass from someone security minded, canarying somewhere, then defaulting true.

jmarantz · 2019-03-13T13:45:25Z

+1 -- having a security-minded review would be good; do we have someone on the project that would be suitable?

mattklein123 · 2019-03-13T16:00:40Z

OK let's merge and iterate. We can discuss next steps once initial fuzzing is complete and any bugs are fixed. @brian-pane thank you for all your hard work here and being such a trooper with reviews.

* master: (59 commits) http fault: add response rate limit injection (envoyproxy#6267) xds: introduce initial_fetch_timeout option to limit initialization time (envoyproxy#6048) test: fix cpuset-threads tests (envoyproxy#6278) server: add an API for registering for notifications for server instance life… (envoyproxy#6254) remove remains of TestBase (envoyproxy#6286) dubbo_proxy: Implement the routing of Dubbo requests (envoyproxy#5973) Revert "stats: add new BoolIndicator stat type (envoyproxy#5813)" (envoyproxy#6280) runtime: codifying runtime guarded features (envoyproxy#6134) mysql_filter: fix integration test flakes (envoyproxy#6272) tls: update BoringSSL to debed9a4 (3683). (envoyproxy#6273) rewrite buffer implementation to eliminate evbuffer dependency (envoyproxy#5441) Remove the dependency from TimeSystem to libevent by using the Event::Scheduler abstraction as a delegate. (envoyproxy#6240) fuzz: fix use of literal in default initialization. (envoyproxy#6268) http: add HCM functionality required for rate limiting (envoyproxy#6242) Disable mysql_integration_test until it is deflaked. (envoyproxy#6250) test: use ipv6_only IPv6 addresses in custom cluster integration tests. (envoyproxy#6260) tracing: If parent span is propagated with empty string, it causes th… (envoyproxy#6263) upstream: fix oss-fuzz issue envoyproxy#11095. (envoyproxy#6220) Wire up panic mode subset to receive updates (envoyproxy#6221) docs: clarify xds docs with warming information (envoyproxy#6236) ...

brian-pane added 3 commits December 31, 2018 09:45

rewrite buffer implementation to eliminate evbuffer dependency

8d8090f

Signed-off-by: Brian Pane <[email protected]>

fix clang-tidy warnings

b038d9b

Signed-off-by: Brian Pane <[email protected]>

fix additional clang-tidy warnings

41b9397

Signed-off-by: Brian Pane <[email protected]>

jmarantz self-assigned this Jan 3, 2019

jmarantz reviewed Jan 3, 2019

View reviewed changes

cleanups based on code review feedback

7845cb6

Signed-off-by: Brian Pane <[email protected]>

lizan reviewed Jan 3, 2019

View reviewed changes

ggreenway mentioned this pull request Jan 3, 2019

buffer: fix vulnerabilities when allocation fails #5468

Merged

jmarantz reviewed Jan 4, 2019

View reviewed changes

brian-pane mentioned this pull request Jan 4, 2019

add a performance benchmark test for the Buffer implementation #5474

Merged

jmarantz reviewed Jan 4, 2019

View reviewed changes

cleanups based on code review feedback

14722be

Signed-off-by: Brian Pane <[email protected]>

jmarantz reviewed Jan 4, 2019

View reviewed changes

brian-pane added 3 commits January 4, 2019 19:54

clean up the BufferSlice interface

7840114

Signed-off-by: Brian Pane <[email protected]>

Merge remote-tracking branch 'upstream/master' into buffer

bfca4f9

Signed-off-by: Brian Pane <[email protected]>

formatting fixes and additional comments

7094216

Signed-off-by: Brian Pane <[email protected]>

enable reserve() to use multiple slices

fa94625

Signed-off-by: Brian Pane <[email protected]>

brian-pane added 6 commits January 9, 2019 19:05

redesign the slice implementation for better performance

b38b8da

Signed-off-by: Brian Pane <[email protected]>

add a faster deque implementation

33fdb6f

Signed-off-by: Brian Pane <[email protected]>

Merge remote-tracking branch 'upstream/master' into buffer

059c366

Signed-off-by: Brian Pane <[email protected]>

speed up prepend operations and small reserve/commit operations

0af83c1

Signed-off-by: Brian Pane <[email protected]>

add unit tests for Slice and SliceDeque

5b3f606

Signed-off-by: Brian Pane <[email protected]>

fix comment typos

4f95325

Signed-off-by: Brian Pane <[email protected]>

formatting fix

3bcd656

Signed-off-by: Brian Pane <[email protected]>

jmarantz reviewed Mar 4, 2019

View reviewed changes

jmarantz previously approved these changes Mar 6, 2019

View reviewed changes

jmarantz assigned alyssawilk Mar 6, 2019

alyssawilk reviewed Mar 6, 2019

View reviewed changes

mattklein123 requested changes Mar 6, 2019

View reviewed changes

repokitteh-read-only bot added the waiting label Mar 6, 2019

brian-pane added 2 commits March 6, 2019 19:25

revert changes to metadata_encoder

d405218

Signed-off-by: Brian Pane <[email protected]>

Merge remote-tracking branch 'upstream/master' into buffer

cf37183

Signed-off-by: Brian Pane <[email protected]>

brian-pane dismissed jmarantz’s stale review via cf37183 March 7, 2019 04:01

repokitteh-read-only bot removed the waiting label Mar 7, 2019

lizan reviewed Mar 7, 2019

View reviewed changes

source/common/buffer/buffer_impl.cc Outdated Show resolved Hide resolved

alyssawilk reviewed Mar 7, 2019

View reviewed changes

brian-pane added 5 commits March 9, 2019 13:56

changes based on code review feedback:

c40a58a

* Deduplicated the code paths where the old and new impl were the same * Added some clarifying comments * Removed some now-extraneous dynamic_casts Signed-off-by: Brian Pane <[email protected]>

Merge remote-tracking branch 'upstream/master' into buffer

0d3e1c4

Signed-off-by: Brian Pane <[email protected]>

add runtime overflow checking for buffer length

267d269

Signed-off-by: Brian Pane <[email protected]>

switch to a compiler-independent detection of buffer length overflow/…

a9ff465

…underflow Signed-off-by: Brian Pane <[email protected]>

workaround for the coverage build

4d552f2

Signed-off-by: Brian Pane <[email protected]>

mattklein123 approved these changes Mar 13, 2019

View reviewed changes

mattklein123 merged commit dfd291f into envoyproxy:master Mar 13, 2019

brian-pane deleted the buffer branch May 25, 2019 21:18

		}

		void OwnedImpl::useOldImpl(bool use_old_impl) { use_old_impl_ = use_old_impl; }

rewrite buffer implementation to eliminate evbuffer dependency #5441

rewrite buffer implementation to eliminate evbuffer dependency #5441

Conversation

brian-pane commented Dec 31, 2018

mattklein123 commented Jan 3, 2019

jmarantz commented Jan 3, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brian-pane commented Jan 3, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmarantz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmarantz Jan 4, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmarantz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brian-pane commented Jan 5, 2019

jmarantz commented Jan 5, 2019

brian-pane commented Jan 6, 2019

jmarantz commented Jan 6, 2019

jmarantz Mar 4, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmarantz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alyssawilk left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mattklein123 left a comment

Choose a reason for hiding this comment

mattklein123 commented Mar 6, 2019

alyssawilk left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jmarantz Jan 4, 2019 •

edited

Loading

jmarantz Mar 4, 2019 •

edited

Loading