
Batching buffer with minChunkSize #32858

Merged
merged 1 commit into quarkusio:main on May 29, 2023

Conversation

franz1981
Contributor

Fixes #32546

This is still in draft due to a few open questions:

  1. What happens to the batch buffer content in case of partial failures within flushBuffer? Should it clean up its whole content, losing it? What is the caller supposed to do?
  2. What happens to ResteasyReactiveOutputStream in case of a failure while using the buffer returned by flushBuffer? Should it release the buffer if it is not already released? (In the previous implementation this wasn't well defined in every state.)
  3. Why 128? Do we want to expose that configuration parameter to users? It's great for benchmarking tests, but maybe too complex for the average user @geoand?
  4. How does performance look? Is it any better? [WIP - after or during the Quarkus F2F]
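For context, the batching idea behind the PR can be sketched roughly like this. This is a simplified, allocator-free illustration of the concept only — the real AppendBuffer works on pooled Netty ByteBufs, and the class and method names here are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of a batching buffer: small writes are coalesced into
// chunks of at least minChunkSize bytes instead of each write getting its
// own allocation; a new chunk is only started when the current one fills up.
// (Hypothetical names; the real AppendBuffer manages pooled Netty ByteBufs.)
final class BatchingBuffer {
    private final int minChunkSize;
    private final int capacity;
    private final List<byte[]> chunks = new ArrayList<>();
    private byte[] current;
    private int position;

    BatchingBuffer(int minChunkSize, int capacity) {
        this.minChunkSize = minChunkSize;
        this.capacity = capacity;
    }

    void append(byte[] data) {
        int offset = 0;
        while (offset < data.length) {
            if (current == null || position == current.length) {
                // allocate at least minChunkSize, at most the configured capacity
                int size = Math.min(capacity, Math.max(minChunkSize, data.length - offset));
                current = new byte[size];
                position = 0;
                chunks.add(current);
            }
            int toCopy = Math.min(data.length - offset, current.length - position);
            System.arraycopy(data, offset, current, position, toCopy);
            position += toCopy;
            offset += toCopy;
        }
    }

    int chunkCount() {
        return chunks.size();
    }
}
```

With minChunkSize = 128, five 10-byte appends land in a single 128-byte chunk rather than five separate allocations, which is the saving the PR is after.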

@franz1981
Contributor Author

@Sanne @geoand @stuartwdouglas the implementation can actually already be reviewed - apart from the questions at #32858 (comment)

@franz1981
Contributor Author

If this works out @Sanne, the change at https://github.com/TechEmpower/FrameworkBenchmarks/pull/7861/files#diff-ea8995e85f3d4bb82b2c8aa7feab84fb8d193519e3f919c7d43c3c2c25310344R31-R32 won't be impactful anymore, and real users would benefit as well from not allocating an 8K Netty (pooled) direct buffer for each response, as happens now

@geoand
Contributor

geoand commented Apr 24, 2023

Thanks @franz1981!

Let's move this ahead in a couple weeks time :)

@geoand
Contributor

geoand commented May 11, 2023

I'd like to see how this performs with small, medium and large payloads compared to what we have currently

@franz1981
Contributor Author

Large payloads should look similar (if they approach the configured buffer size, i.e. 8K); things get better and better the smaller the single payloads written are, but you will do worse if users perform many small writes to the output stream.
This last case, IIUC, should not be the common use case unless users manually manipulate the stream themselves.
The append buffer can be configured to behave like before (or in other ways too), hence we can decide whether to expose such configuration somehow, in case users require that level of control.
@geoand let me know if you are aware of other cases where we keep appending small byte arrays without flushing to the Netty connection each time.
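The "many small writes" pattern being discussed is what you'd get from user code like the following (a hypothetical example of manually manipulating the response OutputStream; the class and method names are invented for illustration):

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical user code performing many tiny writes: each write(int) or
// small write(byte[]) used to be handled individually; with the batching
// buffer these appends are coalesced until minChunkSize/capacity is
// reached or flush() is called.
final class ManySmallWrites {
    static void writeCsvRow(OutputStream out, String[] cells) throws IOException {
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) {
                out.write(',');             // 1-byte write
            }
            out.write(cells[i].getBytes()); // small array write
        }
        out.write('\n');
    }
}
```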

@geoand
Contributor

geoand commented May 11, 2023

Many small ones would only happen in SSE or other streaming scenarios, definitely not a major concern

@franz1981
Contributor Author

franz1981 commented May 26, 2023

I've spent a few weeks testing with our CI using TechEmpower, looking for evident regressions with both stacks (reactive/blocking), with and without Hibernate, and I didn't see any performance regression nor improvement, throughput-wise.

What this patch can deliver instead is a very tangible improvement in off-heap/direct memory usage in the case of blocking endpoints:

  • JSON encoding happens within the EQE threads (aka the blocking thread pool), causing the direct ByteBuf allocations to happen there
  • EQE threads, while writing to the wire, delegate the write to an async task on the Vert.x/Netty event loop (which surprised me a bit, given that it could use write/writeAndFlush instead of a generic Runnable on ConnectionBase::queueForWrite), see below:

[screenshot omitted]

  • that means a buffer allocated on an EQE thread can stay alive for some time until it is flushed to the I/O, while the EQE thread is freed to pick up other in-flight requests

This last behavior makes it very important NOT to over-size such allocation buffers, because their size determines how many overall in-flight bytes the Netty allocator may have to allocate, and the allocator will grow to accommodate the incoming load.
The same doesn't happen for buffers allocated on the I/O threads, which interact directly with the I/O and can be reused right after for the same or another connection.
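The delegation described above can be illustrated schematically. This is not actual Quarkus/Vert.x code — a single-threaded executor stands in for the Netty event loop, and the names are invented for the sketch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Schematic stand-in for the EQE -> event-loop hand-off: the blocking
// (worker) thread submits the encoded bytes as a task to the event-loop
// thread. The byte array is captured by the task and stays alive until
// the event loop runs it, which is why over-sized per-response buffers
// inflate the allocator's footprint under load.
final class WriteDelegation {
    static Future<Integer> queueForWrite(ExecutorService eventLoop, byte[] encoded) {
        // 'encoded' was allocated on the worker thread; it cannot be
        // released/reused until this task has been executed and flushed
        return eventLoop.submit(() -> encoded.length);
    }
}
```

The worker thread is free to serve other requests as soon as the task is queued, so under load many such captured buffers can be in flight at once.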

In a JSON hello world load test using a blocking endpoint, RSS usage went from ~70MB down to ~2MB.

@geoand
Contributor

geoand commented May 26, 2023

In a JSON hello world load test using a blocking endpoint, RSS usage went from ~70MB down to ~2MB.

This is a very significant improvement! Definitely makes the PR worthy of inclusion :).

Thanks a lot for this!

@franz1981
Contributor Author

franz1981 commented May 26, 2023

I just have to decide what the default minChunkSize should be: Netty's allocation capacities are

16, 32, 48, 64, 80, 96, 112, 128, ...

The logic is explained in https://github.com/netty/netty/blob/eb3feb479826949090a9a55c782722ece9b42e50/buffer/src/main/java/io/netty/buffer/SizeClasses.java, and https://github.com/netty/netty/blob/eb3feb479826949090a9a55c782722ece9b42e50/buffer/src/main/java/io/netty/buffer/SizeClasses.java#L83 is the initial log2 step, i.e. 4 (2^4 = 16 - that's the step between the different size classes).
Meaning that, in theory, it should be 16 to be friendly with Netty.
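Assuming the 16-byte step described above, rounding a requested capacity to the smallest size classes could look like this (illustrative only; Netty's actual SizeClasses switches to coarser, power-of-two based steps beyond the small sizes):

```java
// Illustrative only: for small allocations Netty's size classes advance
// in 16-byte steps (16, 32, 48, ...). This rounds a requested capacity
// up to the next step; real Netty (SizeClasses) uses coarser steps for
// larger sizes.
final class SmallSizeClass {
    static final int LOG2_QUANTUM = 4; // 2^4 = 16, the initial step

    static int roundUp(int requested) {
        int quantum = 1 << LOG2_QUANTUM; // 16
        // round up to the next multiple of 16, with a 16-byte floor
        return Math.max(quantum, (requested + quantum - 1) & -quantum);
    }
}
```

Under this rounding, a minChunkSize of 16 wastes nothing on the smallest class, which is the "friendly with Netty" argument above.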

@Sanne
Member

Sanne commented May 26, 2023

What is holding this up? Any concerns?

@franz1981
Contributor Author

What is holding this up? Any concerns?

I've run all possible tests to make sure there were no evident regressions, and now I'm addressing the leaks in case of exceptions being thrown

@geoand
Contributor

geoand commented May 26, 2023

Yup, I've already talked with @franz1981, and once he is ready we can rebase onto main, take it out of draft and merge when CI goes green

@franz1981 franz1981 marked this pull request as ready for review May 28, 2023 13:34
@franz1981
Contributor Author

franz1981 commented May 28, 2023

The original code, given that it was using a single buffer, was correctly handling the release of resources, so it has taken me a while to ensure similar behavior now that we can have a few chunks instead...
I didn't add any unit tests for AppendBuffer; we can add them if we plan to use it elsewhere.

@geoand
Some thoughts that I would like to resolve before merging this:

  • as said, any append happening outside the I/O threads can lead to unbounded growth of direct memory (impacting RSS): right now I'm capping the minimum allocation at 128 bytes (by default), but other strategies could be to always use 16 bytes or, more adaptively, 128 bytes if the append happens on an I/O thread (because on I/O threads the data is unlikely to accumulate, unless there is network congestion) and 16 bytes if it is not on an I/O thread (to avoid uncontrolled memory growth).
  • regardless of the strategy chosen, do we want such values to be configurable, or an option to go back to the original strategy? (the AppendBuffer can behave like the original code too; there's an eager method to make it do so)
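The adaptive strategy in the first bullet could be sketched as follows. This is a hypothetical policy sketch, not code from the PR; the real check for "am I on an I/O thread" would be something like Netty's EventExecutor#inEventLoop(), replaced here by a plain boolean to keep the sketch self-contained:

```java
// Hypothetical sketch of the adaptive minimum-chunk strategy described
// above: a larger minimum on I/O (event-loop) threads, where buffers are
// flushed promptly so batching pays off, and a small minimum on worker
// threads, where buffers may linger and direct memory must stay bounded.
// A boolean stands in for a real inEventLoop() check.
final class MinChunkPolicy {
    static final int IO_THREAD_MIN = 128; // flushed quickly, batching wins
    static final int WORKER_MIN = 16;     // may stay in flight, keep small

    static int minChunkSize(boolean onIoThread) {
        return onIoThread ? IO_THREAD_MIN : WORKER_MIN;
    }
}
```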

@geoand
Contributor

geoand commented May 29, 2023

I don't think we need this to be configurable - at least not for the time being.
It's a very low-level mechanism, and I don't really see us needing to tune it.

If in the future we do come up with specific scenarios (I mean code that users actually run) where configuration would be beneficial, we can add it (along with some good documentation).

@geoand geoand added the triage/waiting-for-ci Ready to merge when CI successfully finishes label May 29, 2023

@geoand
Contributor

geoand commented May 29, 2023

Force pushed an update that makes the minChunkSize configurable via: quarkus.resteasy-reactive.min-chunk-size
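For reference, that property can be set like any other Quarkus config value, e.g. in application.properties (the 128-byte value shown is just an example, not a recommendation):

```properties
# Example: set the minimum chunk size used by ResteasyReactiveOutputStream's
# batching buffer (the value shown is illustrative)
quarkus.resteasy-reactive.min-chunk-size=128
```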

@quarkus-bot

quarkus-bot bot commented May 29, 2023

✔️ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

@geoand
Contributor

geoand commented May 29, 2023

Merging, thanks a ton for this @franz1981!

@geoand geoand merged commit 29d71ec into quarkusio:main May 29, 2023
@quarkus-bot quarkus-bot bot added kind/enhancement New feature or request and removed triage/waiting-for-ci Ready to merge when CI successfully finishes labels May 29, 2023
@quarkus-bot quarkus-bot bot added this to the 3.2 - main milestone May 29, 2023
Labels
area/rest kind/enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

ResteasyReactiveOutputStream could make better use of the Netty (direct) pooled allocator
4 participants