Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rewrite buffer implementation to eliminate evbuffer dependency #5441

Merged
merged 48 commits into from
Mar 13, 2019
Merged

rewrite buffer implementation to eliminate evbuffer dependency #5441

merged 48 commits into from
Mar 13, 2019

Conversation

brian-pane
Copy link
Contributor

Description:
Replace the use of libevent's evbuffer in Buffer::OwnedImpl with a new buffer implementation.

Risk Level: high
Testing: I ran the unit tests
Docs Changes: N/A
Release Notes: N/A
Related Issues: #4952

Signed-off-by: Brian Pane [email protected]

@mattklein123
Copy link
Member

Thanks @brian-pane. @envoyproxy/maintainers any takers on reviewing this? @jmarantz this seems like something you might enjoy. 😉

@jmarantz jmarantz self-assigned this Jan 3, 2019
@jmarantz
Copy link
Contributor

jmarantz commented Jan 3, 2019

I need to dig into this but my first thought is that I'd like to see a perf test with this PR. My pattern lately has been to write the perf-test in a separate branch from master, and see how the current implementation performs. Then you can patch the perf test into this branch and measure any improvement.

I'm actually not sure if your motivation is to improve performance or simply to cut a troublesome dependency (and why is it troublesome?) but even if it's the latter, we should quantify the performance impact. In any case you should put the motivation for this PR in the description.

One such perf test is https://github.com/envoyproxy/envoy/blob/master/test/common/stats/thread_local_store_speed_test.cc

"RawSlice != evbuffer_iovec");
static_assert(offsetof(RawSlice, len_) == offsetof(evbuffer_iovec, iov_len),
"RawSlice != evbuffer_iovec");
static uint64_t SliceSize(uint64_t data_size) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you make this a static method of OwnedBufferSlice then you can unit-test it. Also it should be sliceSize() whether a static method or static function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

buffer_.get(), fragment.data(), fragment.size(),
[](const void*, size_t, void* arg) { static_cast<BufferFragment*>(arg)->done(); }, &fragment);
OwnedBufferSlice(const void* data, uint64_t size) : OwnedBufferSlice(size) {
memcpy(&(base_[0]), data, size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base_.get() ?

source/common/buffer/buffer_impl.cc Outdated Show resolved Hide resolved
@brian-pane
Copy link
Contributor Author

Thanks for looking at this, @jmarantz! I'll put together a performance test tomorrow. My motivation for the PR is to remove one of the obstacles to using a different event multiplexer library in the future (e.g., #4952), but I do want to match or improve upon the current implementation's speed and memory footprint.

*/
class BufferSlice {
public:
using Reservation = std::pair<void*, uint64_t>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of std::pair, use absl::string_view? It is more readable than .first/.second and comes with some utility function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of that too...only difference is that Reservation.first is non-const, so you'd need to const-cast whenever you need to write.

In general I wish there was less const-casting in this impl as that always requires extra scrutiny during reviews and debugging. However maybe you can structure that with wrappers to isolate the non-const uses.

If you don't do that, I do think you should use uint8_t* rather than void* to reduce casting all over the place. You can always return a void* without casting if you have a uint8_t* but not vice versa.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think you're right string_view is const pointer. Then perhaps a simple struct with non-const pointer and size will work, I think in general we prefer named struct than std::pair.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on struct vs pair. Thus I suggest

struct Reservation {
  uint8_t* data_;
  uint64_t size_;
};

This may increase readability by reducing casts? See also RawSlice in include/envoy/buffer/buffer.h which is similar, but uses void* so will require casting for all uses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe RawSlice itself should be the data type that represents a reservation. I'm thinking through the pros and cons now...

Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments...I have not looked at all the code but not deeply ingested it.

Would love to see perf tests.

*/
class BufferSlice {
public:
using Reservation = std::pair<void*, uint64_t>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of that too...only difference is that Reservation.first is non-const, so you'd need to const-cast whenever you need to write.

In general I wish there was less const-casting in this impl as that always requires extra scrutiny during reviews and debugging. However maybe you can structure that with wrappers to isolate the non-const uses.

If you don't do that, I do think you should use uint8_t* rather than void* to reduce casting all over the place. You can always return a void* without casting if you have a uint8_t* but not vice versa.

@@ -1,6 +1,7 @@
#include "common/buffer/buffer_impl.h"

#include <cstdint>
#include <iostream>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are you using this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's left over from debugging. I'll post an update to remove it.

/**
* Compute a slice size big enough to hold a specified amount of data.
* @param data_size the minimum amount of data the slice must be able to store, in bytes.
* @ return a recommended slice size, in bytes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@return uint64_t a recommended slice size, in bytes

(Remove the intervening space, specify the return type. That's the envoy style for doxygen-like return comments although it seems redundant to me)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch on the the extraneous space; I'll fix that in my next update. With regard to listing the type after the @return, that style is apparently "outdated," based on the discussion here: #2612 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excluding the type makes sense to me; I hadn't noticed that thread.

}

void* data() override {
return static_cast<uint8_t*>(const_cast<void*>(fragment_.data())) + data_;
Copy link
Contributor

@jmarantz jmarantz Jan 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider defining the const data() in terms of the non-const data() so you don't have to repeat the logic. Or vice versa.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

}

uint64_t reservableSize() const override { return 0; }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Group the one-liners together and drop the intervening extra newlines.

while (!other.slices_.empty()) {
if (length == 0) {
break;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while (length != 0 && !other.slices_.empty()) { ... more concise and possibly faster due to making easier check first.

if (copy_size == 0) {
other.slices_.pop_front();
} else if (copy_size < other.slices_.front()->dataSize()) {
// TODO add reference-counting to allow slices to share their storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO(brian-pane): add reference-counter.... here and below.

source/common/buffer/buffer_impl.cc Outdated Show resolved Hide resolved
evbuffer_ptr start_ptr;
if (-1 == evbuffer_ptr_set(buffer_.get(), &start_ptr, start, EVBUFFER_PTR_SET)) {
return -1;
// This implementation uses the same search algorithm as evbuffer_search(), a naive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this method get called a lot in Envoy? It does seem like this could be linear without that much difficulty. If it gets called on the data path it seems worth optimizing.

if it doesn't get called on the data path you could just replace all this with buffer->toString().find(... ... even that would be linear although doing an allocation to implement a search seems wrong :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't seen this method show up in CPU profiles from production systems, so as far as I can tell it doesn't get used much in Envoy.

There are three calls to the method in Envoy currently, one of which appears to search untrusted inputs - and that one has a fixed pattern of length 12.

It's probably worth noting that, in the common case, both the current evbuffer_search implementation and this new code already benefit from an existing optimization: they use memchr to find the first character of the pattern, and for a large input size n, a typical vectorized memchr implementation only needs to do n/v conditional branches, where v is 16 or 32 depending on the size of the available SIMD registers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was in error thinking this is easy to linearize, and I see the value in being able to do the vectorized memchr, so this looks like a good framework to start with until we see a perf problem.

But since the algorithm is very long with lots of corner-cases I think you need to add unit tests to cover a bunch of them. The current buffer_test.cc doesn't try to cover the underlying evbuffer_search impl which is assumed to be tested in the libevent repo, but now that you own it, you need to test it.

test/test_common/utility.cc Show resolved Hide resolved
iovecs[0].len_ = 0;
iovecs[1].len_ = 0;
data.commit(iovecs, 2);
Buffer::RawSlice iovec1[1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this test need to change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test depended on an implementation detail of evbuffer: that particular call to reserve resulted in the creation of two separate slices inside the buffer. The new OwnedImpl implementation satisfies the reservation with a single slice in this case. One of the TODO items in buffer_impl.cc will probably result in the new implementation using two slices in this case in the future, to match how evbuffer works today, but I still wanted to make the test less dependent on implementation details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that still true? or have you fixed this now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code here will now work once again, so I'll revert to that.

Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel @alyssawilk should probably also have a look at this RE performance, given her related watermark work.

*/
class BufferSlice {
public:
using Reservation = std::pair<void*, uint64_t>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on struct vs pair. Thus I suggest

struct Reservation {
  uint8_t* data_;
  uint64_t size_;
};

This may increase readability by reducing casts? See also RawSlice in include/envoy/buffer/buffer.h which is similar, but uses void* so will require casting for all uses.

reinterpret_cast<evbuffer_iovec*>(iovecs), num_iovecs);
ASSERT(ret >= 1);
return ret;
// TODO(brian-pane) support the use of space in more than one slice.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this only works for num_iovecs==1? That seems like a problem if it's used in Envoy. If it's not, why not just remove the argument here and at the call-sites?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more memory-efficient to support multiple iovecs, given the way Envoy does its reservations for socket reads. The only reason this PR doesn't support the multiple-iovec case yet is that the semantics are ill-defined. For example, there are cases where Envoy gets a reservation from an evbuffer with num_iovecs==2 and then calls commit on just the first iovec. It's unclear whether that is supposed to release the reservation on the second slice; I'll probably have to end up digging into the evbuffer implementation to find out the intended semantics. (I'm planning to do that in the very near term.)

evbuffer_ptr start_ptr;
if (-1 == evbuffer_ptr_set(buffer_.get(), &start_ptr, start, EVBUFFER_PTR_SET)) {
return -1;
// This implementation uses the same search algorithm as evbuffer_search(), a naive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was in error thinking this is easy to linearize, and I see the value in being able to do the vectorized memchr, so this looks like a good framework to start with until we see a perf problem.

But since the algorithm is very long with lots of corner-cases I think you need to add unit tests to cover a bunch of them. The current buffer_test.cc doesn't try to cover the underlying evbuffer_search impl which is assumed to be tested in the libevent repo, but now that you own it, you need to test it.

test/test_common/utility.cc Show resolved Hide resolved
}

void OwnedImpl::add(absl::string_view data) {
evbuffer_add(buffer_.get(), data.data(), data.size());
void OwnedImpl::add(const void* data, uint64_t size) { append(data, size); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a comment in the base-class saying this method is deprecated; it's worth checking if there are remaining call-sites.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked, and there are some remaining call-sites within Envoy itself that depend on this method. include/envoy/buffer/buffer.h itself won't even compile with the deprecated method removed.

* Commit a Reservation that was previously obtained from a call to reserve().
* The Reservation's size is added to the Data section.
* @param reservation a reservation obtained from a previous call to reserve().
* If the reservation is not from this BufferSlice, commit() will return false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under what circumstances would a user of this class commit a reservation from outside the BufferSlice? Just explain in the comment, especially in light of the comment above "the result of calling commit() out of order is undefined"

source/common/buffer/buffer_impl.h Show resolved Hide resolved
@brian-pane
Copy link
Contributor Author

I just added a couple of significant changes:

  • Reservations from BufferSlice now use RawSlice rather than std::pair.
  • In RawSlice, the pointer to the start of the data is now immutable (void* const), with some comments in buffer.h to explain why. The old implementation also depended on the caller not modifying mem_ between the reserve and commit calls, so this change helps enforce pre-existing semantics.

Over the next couple of days, I plan to:

  • Address the rest of the outstanding review comments.
  • Add support for reservations that span more than one iovec (currently listed as a TODO comment in the PR, and important because otherwise this new implementation will use more memory than evbuffer).

@jmarantz
Copy link
Contributor

jmarantz commented Jan 5, 2019

If you have something functional, but whose performance/memory usage has yet to be proven to be the same or better as libevent's impl, you might want to check it in as an alternate implementation that's tested but not in use. That way you can iterate on it, with more frequent and smaller PRs.

One approach toward this is to parameterize your test, as shown in PR #5414

class StatNameTest : public testing::TestWithParam<SymbolTableType> {

Note that checking in a not-quite-good-enough impl will not go down well with folks that are deploying Envoy from HEAD every week, as opposed to waiting for a numbered release.

@brian-pane
Copy link
Contributor Author

I have the multi-iovec reservations working now (not yet posted to this PR). Based on the benchmark results, there probably are two more changes needed to achieve perf parity with evbuffer: reference-counting to eliminate the copy in the partial-move case, and the elimination of one of the mallocs during the slice creation. And I might have to give up on std::deque. Those will be substantial design changes, so I'll iterate on them offline and post an update here once the perf numbers are on par with libevent.

------------------------------------------------------------------------
Benchmark                             Original           New
------------------------------------------------------------------------
BufferCreateEmpty                        80 ns          7 ns
BufferCreate/1                          163 ns        382 ns
BufferCreate/4096                       246 ns        543 ns
BufferCreate/16384                      577 ns       1076 ns
BufferCreate/65536                     3782 ns       4488 ns
BufferAddString/1                        18 ns         29 ns
BufferAddString/4096                    140 ns        322 ns
BufferAddString/16384                   535 ns        910 ns
BufferAddString/65536                  3418 ns       4289 ns
BufferAddBuffer/1                        40 ns         50 ns
BufferAddBuffer/4096                    164 ns        340 ns
BufferAddBuffer/16384                   570 ns        948 ns
BufferAddBuffer/65536                  3467 ns       4324 ns
BufferDrain/1                            40 ns         50 ns
BufferDrain/4096                        183 ns        378 ns
BufferDrain/16384                       575 ns        987 ns
BufferDrain/65536                      3702 ns       4626 ns
BufferMove/1                             48 ns         21 ns
BufferMove/4096                          48 ns         21 ns
BufferMove/16384                         48 ns         21 ns
BufferMove/65536                         48 ns         21 ns
BufferMovePartial/1                      61 ns         33 ns
BufferMovePartial/4096               118961 ns   11968450 ns
BufferMovePartial/16384              473925 ns   95152762 ns
BufferMovePartial/65536             1895537 ns  302529200 ns
BufferReserveCommit/1                    97 ns        215 ns
BufferReserveCommit/4096                 98 ns        250 ns
BufferReserveCommit/16384                99 ns        457 ns
BufferReserveCommit/65536               986 ns       1772 ns
BufferReserveCommitPartial/1             95 ns        225 ns
BufferReserveCommitPartial/4096          97 ns        248 ns
BufferReserveCommitPartial/16384         98 ns        459 ns
BufferReserveCommitPartial/65536        993 ns       1750 ns
BufferSearch/1                           39 ns         28 ns
BufferSearch/4096                       124 ns        124 ns
BufferSearch/16384                      385 ns        382 ns
BufferSearch/65536                     1761 ns       1764 ns
BufferSearchPartialMatch/1              467 ns        282 ns
BufferSearchPartialMatch/4096         41133 ns      24597 ns
BufferSearchPartialMatch/16384       162766 ns      97562 ns
BufferSearchPartialMatch/65536       651514 ns     387966 ns

@jmarantz
Copy link
Contributor

jmarantz commented Jan 6, 2019

OK...you sure you don't want to try to check this in as an alternate impl, so you can iterate in small PRs? I think it'd be easier to review, rather than waiting till performance perfection?

What's your specific issue with deque? As it happens, a while back on a different project, I had a measured performance issue with having lots of small deques and wound up re-implementing a subset of the std::deque API using a vector and getting a lot of performance benefit for my use-case: https://github.com/apache/incubator-pagespeed-mod/blob/master/pagespeed/kernel/base/vector_deque.h , with a speed-test at https://github.com/apache/incubator-pagespeed-mod/blob/master/pagespeed/kernel/util/deque_speed_test.cc

Signed-off-by: Brian Pane <[email protected]>
}

void OwnedImpl::useOldImpl(bool use_old_impl) { use_old_impl_ = use_old_impl; }
Copy link
Contributor

@jmarantz jmarantz Mar 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To ensure that this only gets called during the quiescent state between tests, there's a simple way now.

Have a static bool committed_ alongside use_old_impl_. In this method do:

void OwnedImpl::useOldImpl(bool use_old_impl) {
  ASSERT(!committed_);
  use_old_impl_ = use_old_impl;
}

then you can write:

void OwnedImpl::clearCommitted() { committed_ = false; }

which you can call from TestListener::OnTestEnd():

That will allow switching implementations between test methods.

Actually I'm surprised this didn't bite you before, aren't you using TEST_P to test with both mechanisms in the same test binary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up until cfd9d4a, the buffer unit tests used a separate, test-only method that enabled the implementation to be changed.

Starting with that change, it's valid to switch back and forth between the implementations in the same process. The one remaining restriction is that two buffers can only interact (e.g., with buffer1.move(buffer2)) if they use the same implementation.

That enables the coverage test to progress further, but eventually it still hangs with an exception somewhere. It will take some more digging to find out where that's happening, as the hang happens in some test code that has caught the exception and is trying to log it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right; what I'm trying to suggest is that it still be invalid to switch between implementations within a process, allowing only for switching after a test method.

That should cover the case of the coverage tests, and also fuzz tests, both of which will work with the test listener as described.

I think we probably want to RELEASE_ASSERT, actually, that we only call useOldImpl once per process/test-method.

My thought was that like catching things at this level would be a stronger guarantee of the invariant that only one impl be alive at a time.

Of course you still have to finish your debugging :) There's a chance this suggestion may catch it, but it's probably something else.

class BufferImplementationParamTest : public testing::TestWithParam<BufferImplementation> {
protected:
BufferImplementationParamTest() {
OwnedImpl::useOldImpl(GetParam() == BufferImplementation::Old);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

independent of coverage testing, how did this even work with just this one file? Wouldn't this have hit your assert?

Does this suggest you were not successfully testing both impls with this structure? It seems you should've been hitting your assert within that one test binary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior to the changes I added yesterday, the test code called a different method, useOldImplForTest, which didn't do the assert.

The test code has a call to verifyImplementation(buffer); in every test case to ensure that it's using the expected implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the code allows switching buffer implementations in non-test code, useOldImplForTest is no more, and the unit tests just use the same useOldImpl that the production code uses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this is fine. My suggestion was based around a theory that we should have safeguards against switching modes except at the start of a binary and between test-methods, but it's really a belt and suspenders argument and as long as you don't actually call it after startup in production it should be fine.

fredlas pushed a commit to fredlas/envoy that referenced this pull request Mar 5, 2019
Basic trace fuzzer for the Buffer interface operations. Verifies some
basic properties and compares with a shadow buffer based on std::string. Intended to support validation of
envoyproxy#5441.

Risk level: Low
Testing: Simple manual corpus that provides 95.1% coverage of
buffer_impl.cc. Will expand in future PRs.

Signed-off-by: Harvey Tuch <[email protected]>
Signed-off-by: Fred Douglas <[email protected]>
jmarantz
jmarantz previously approved these changes Mar 6, 2019
Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ready for senior review.

class BufferImplementationParamTest : public testing::TestWithParam<BufferImplementation> {
protected:
BufferImplementationParamTest() {
OwnedImpl::useOldImpl(GetParam() == BufferImplementation::Old);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this is fine. My suggestion was based around a theory that we should have safeguards against switching modes except at the start of a binary and between test-methods, but it's really a belt and suspenders argument and as long as you don't actually call it after startup in production it should be fine.

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies but I'm not going to be able to make it through this whole PR today. Two quick thoughts and I'll start fording through the rest tomorrow.

@@ -31,6 +31,7 @@ Version history
* mysql: added a MySQL proxy filter that is capable of parsing SQL queries over MySQL wire protocol. Refer to ::ref:`MySQL proxy<config_network_filters_mysql_proxy>` for more details.
* http: added :ref:`max request headers size <envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.max_request_headers_kb>`. The default behaviour is unchanged.
* http: added modifyDecodingBuffer/modifyEncodingBuffer to allow modifying the buffered request/response data.
* performance: new buffer implementation (disabled by default; to test it, add "--use-libevent-buffers 0" to the command-line arguments when starting Envoy).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may have been addressed earlier in review but I'm going to be lazy here and ask you to catch me up. :-)

Do we have a plan for roll out? Are we submitting this, flipping it and seeing if it sticks? I wish we had a better plan for testing high risk changes than asking Lyft to smoke test, but that has often been what we do with and AFIK they prefer runtime over command line flags (which was part of the impetus behind #6134)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rgs1 would you be interested in testing this with production traffic in your environment? It requires a command-line flag to turn it on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the plan here should be:

  1. Let's ship this PR
  2. Let's set up fuzzing via @htuch. I agree with @htuch that if fuzzing doesn't uncover any issues I will feel pretty confident.
  3. Once any fuzz issues are fixed, let's see if we can get a couple of production users to smoke test (envoy-dev/envoy-users email maybe).
  4. Then remove the CLI flag, flip to default true, and config guard with the new stuff @alyssawilk has in runtime: codifying runtime guarded features #6134

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having finally at least skimmed through this whole PR, I'm not sure using a reloadable flag is appropriate. Many of the functions only work when comparing old buffer to old buffer and new to new, so I think we'd either have to do some plumbing to make sure the state was consistent throughout all buffers on a given connection (which might not even work then given muxed client connections can share a different muxed upstream connection) or we'd have to implement restart flags as well.

Rather one can latch the value of the runtime once where it's currently set, but then it's not actually doing new-style buffers until after a restart. internally we make clear the differences between reloadable and restart flags so you know the difference between when you push things and when they take effect. Alternately we could use the new config guard code it'd just be fairly misleading and we'd want to communicate it doesn't work as advertised in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR used to latch the value just once per process, except for a test override method. But the problem with that was that it broke the coverage test, which expected to be able to concatenate together a lot of tests that originally were designed to be separate programs with their own control over the choice of buffer implementation.

source/common/http/http2/metadata_decoder.cc Outdated Show resolved Hide resolved
@@ -274,7 +274,7 @@ uint32_t bufferAction(Context& ctxt, char insert_value, uint32_t max_alloc, Buff
Buffer::RawSlice slices[1];
const uint64_t slices_used = target_buffer.getRawSlices(slices, 1);
if (linearize_size > 0) {
FUZZ_ASSERT(slices_used == 1);
FUZZ_ASSERT(slices_used >= 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to sanity check, have we verfied that the in-envoy code sites either linearize(full buffer) or lineraize(n)/write(n) i.e. this change will end up being a no-op?

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skimmed through the general configuration, etc. and that part looks sane. I'm not going to have time to do an in-depth review for a while, but given the other comment I just made about rollout plan, I think we should ship the PR and then iterate. @envoyproxy/maintainers any objections? If not, then the only thing I would like is to move the metadata stuff out into another PR.

@brian-pane thank you for your awesome work here!!!

/wait

@mattklein123
Copy link
Member

@brian-pane also needs a master merge.

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that's a lot of code - thanks for tackling this huge feature. I'm with Matt on this one where I trust the fuzzers more than I trust the reviewers, but here's my pass in any case :-)

// Next, scan forward and attempt to match the slices against iovecs.
uint64_t num_slices_committed = 0;
while (num_slices_committed < num_iovecs) {
if (slices_[slice_index]->commit(iovecs[num_slices_committed])) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I belive with ev_buffer if commit failed, or if not all num_iovecs were comitted it would be considered an error and fail ASSERT.
If I'm reading that right we keep that ASSERT so we're equally likely to catch bugs in test, and sanity check other ev_buffer error handling code?

If there are license concerns with Brian looking at ev_buffer code I can go through the rest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the new implementation keeps the ASSERT because the failure to commit all of the supplied iovecs is probably a sign of a major bug. The two failure scenarios I can think of are:

  • The caller did the right thing and tried to properly commit iovecs that it got from buffer.reserve(), but the commit didn't work due to some internal bug in the buffer code.
  • Or the caller did the wrong thing and tried to buffer.commit() something that didn't come from that buffer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, having as many ASSERTs as possible for consistency and debug checks has some history of paying off with the buffers + fuzzers. Please keep adding!

if (i < out_size) {
out[i].mem_ = slice->data();
out[i].len_ = slice->dataSize();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth an else with an early return here? If we have a buffer with many slices and we ask for the first, we don't want to loop through all the slices

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to loop through the all the slices because of the pre-existing semantics defined for the method in include/envoy/buffer/buffer.h

   * @return the actual number of slices needed, which may be greater than out_size. Passing
   *         nullptr for out and 0 for out_size will just return the size of the array needed
   *         to capture all of the slice data.

I'll add a comment in the impl to explain why it does the counterintuitive thing.

} else {
const char* src = static_cast<const char*>(data);
bool new_slice_needed = slices_.empty();
while (size != 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before evbuffer does any appending it sanity checks for total_len overflow. Should we do the same or do we think that's an artifact of their size_t vs our consistent use of uint64_t?

I don't think there's any case where there's going to be 2**64 bytes added but I could imagine cases where there was overflow in a size calculation that got plumbed through to add and it might be a useful place to fast-fail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several places in the buffer implementation where the length could overflow (e.g., adding buffer A to buffer B if they both contain 2**63 bytes), so I just pushed an update that wraps all the buffer length arithmetic operations in an overflow/underflow detector that clang and g++ provide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the CircleCI errors, the builtin overflow/underflow detector seems to have some portability problems. I'll have to write one from scratch.

if (linearized_size >= size) {
break;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ev-buffer fast-fails and returns null if size > total_len, which would then result in the RELEASE_ASSERT above. Should we retain that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll add a RELEASE_ASSERT that covers both the old and new impls.

ASSERT(static_cast<uint64_t>(rc) == length);
static_cast<LibEventInstance&>(rhs).postProcess();
} else {
// See move() above for why we do the static cast.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static -> dynamic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that you mention it, the dynamic cast is redundant now, because the ASSERT(isSameBufferImpl(rhs)) at the start of the method already encapsulates the same check (and does additional, more thorough checks). I'll remove the dynamic_cast here.

if (result.rc_ < 0) {
return {static_cast<int>(result.rc_), result.errno_};
}
uint64_t num_slices_to_commit = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks bulk of this function is duplicated in both the if and else branch.
What do we think of moving the duplicated code before we check the buffer type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest push

iov[num_slices_to_write].iov_base = slices[i].mem_;
iov[num_slices_to_write].iov_len = slices[i].len_;
num_slices_to_write++;
if (old_impl_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here - old impl and new impl look identical. Can we avoid the copy paste?

@@ -31,6 +31,7 @@ Version history
* mysql: added a MySQL proxy filter that is capable of parsing SQL queries over MySQL wire protocol. Refer to ::ref:`MySQL proxy<config_network_filters_mysql_proxy>` for more details.
* http: added :ref:`max request headers size <envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.max_request_headers_kb>`. The default behaviour is unchanged.
* http: added modifyDecodingBuffer/modifyEncodingBuffer to allow modifying the buffered request/response data.
* performance: new buffer implementation (disabled by default; to test it, add "--use-libevent-buffers 0" to the command-line arguments when starting Envoy).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having finally at least skimmed through this whole PR, I'm not sure using a reloadable flag is appropriate. Many of the functions only work when comparing old buffer to old buffer and new to new, so I think we'd either have to do some plumbing to make sure the state was consistent throughout all buffers on a given connection (which might not even work then given muxed client connections can share a different muxed upstream connection) or we'd have to implement restart flags as well.

Rather one can latch the value of the runtime once where it's currently set, but then it's not actually doing new-style buffers until after a restart. internally we make clear the differences between reloadable and restart flags so you know the difference between when you push things and when they take effect. Alternately we could use the new config guard code it'd just be fairly misleading and we'd want to communicate it doesn't work as advertised in this case.

* Deduplicated the code paths where the old and new impl were the same
* Added some clarifying comments
* Removed some now-extraneous dynamic_casts

Signed-off-by: Brian Pane <[email protected]>
@brian-pane
Copy link
Contributor Author

I've addressed all the review comments, so this is ready for re-review.

@mattklein123
Copy link
Member

@jmarantz @alyssawilk @envoyproxy/maintainers any additional thoughts on this one? I think we should ship and iterate.

@alyssawilk
Copy link
Contributor

+1.

I'm for checking this in, fuzzing the heck out of it, (ideally but optionally) getting a post-fuzzing review pass from someone security minded, canarying somewhere, then defaulting true.

@jmarantz
Copy link
Contributor

+1 -- having a security-minded review would be good; do we have someone on the project that would be suitable?

@mattklein123
Copy link
Member

OK let's merge and iterate. We can discuss next steps once initial fuzzing is complete and any bugs are fixed. @brian-pane thank you for all your hard work here and being such a trooper with reviews.

@mattklein123 mattklein123 merged commit dfd291f into envoyproxy:master Mar 13, 2019
spenceral added a commit to spenceral/envoy that referenced this pull request Mar 20, 2019
* master: (59 commits)
  http fault: add response rate limit injection (envoyproxy#6267)
  xds: introduce initial_fetch_timeout option to limit initialization time (envoyproxy#6048)
  test: fix cpuset-threads tests (envoyproxy#6278)
  server: add an API for registering for notifications for server instance life… (envoyproxy#6254)
  remove remains of TestBase (envoyproxy#6286)
  dubbo_proxy: Implement the routing of Dubbo requests (envoyproxy#5973)
  Revert "stats: add new BoolIndicator stat type (envoyproxy#5813)" (envoyproxy#6280)
  runtime: codifying runtime guarded features (envoyproxy#6134)
  mysql_filter: fix integration test flakes (envoyproxy#6272)
  tls: update BoringSSL to debed9a4 (3683). (envoyproxy#6273)
  rewrite buffer implementation to eliminate evbuffer dependency (envoyproxy#5441)
  Remove the dependency from TimeSystem to libevent by using the Event::Scheduler abstraction as a delegate. (envoyproxy#6240)
  fuzz: fix use of literal in default initialization. (envoyproxy#6268)
  http: add HCM functionality required for rate limiting (envoyproxy#6242)
  Disable mysql_integration_test until it is deflaked. (envoyproxy#6250)
  test: use ipv6_only IPv6 addresses in custom cluster integration tests. (envoyproxy#6260)
  tracing: If parent span is propagated with empty string, it causes th… (envoyproxy#6263)
  upstream: fix oss-fuzz issue envoyproxy#11095. (envoyproxy#6220)
  Wire up panic mode subset to receive updates (envoyproxy#6221)
  docs: clarify xds docs with warming information (envoyproxy#6236)
  ...
@brian-pane brian-pane deleted the buffer branch May 25, 2019 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants