Provides Prioritised Delivery of Zero Stream Frames #718
Conversation
@@ -110,7 +110,7 @@
         new ClientKeepAliveSupport(allocator, keepAliveTickPeriod, keepAliveAckTimeout);
     this.keepAliveFramesAcceptor =
         keepAliveHandler.start(
-            keepAliveSupport, sendProcessor::onNext, this::tryTerminateOnKeepAlive);
+            keepAliveSupport, sendProcessor::onNextPrioritized, this::tryTerminateOnKeepAlive);
We should tighten this up in the spec - whether it's all of stream 0 that is prioritised, or KEEPALIVE and LEASE only as you have implemented. I'll generally defer to @robertroeser for this, but I've put my thoughts on the issue.
Definitely. I have already created an issue for that!
   }

   @Override
   public int getBufferSize() {
-    return Queues.capacity(this.queue);
+    return Integer.MAX_VALUE;
Good catch
@@ -321,23 +346,29 @@ public void cancel() {

   @Override
   public T peek() {
     if (!priorityQueue.isEmpty()) {
This looks racy after the change (checking isEmpty before calling peek) - does Reactor promise this won't be concurrent code in practice?
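For illustration, a minimal sketch of the check-then-act window in question (hypothetical code, not the PR's exact class shape):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical two-queue peek: another consumer may drain priorityQueue
// between the isEmpty() check and the peek() call, so the guard proves
// nothing by the time peek() runs.
final class RacyPeek<T> {
  final Queue<T> priorityQueue = new ConcurrentLinkedQueue<>();
  final Queue<T> queue = new ConcurrentLinkedQueue<>();

  T peek() {
    if (!priorityQueue.isEmpty()) { // check
      return priorityQueue.peek();  // act: may return null if a concurrent
    }                               // poll() emptied the queue in between
    return queue.peek();
  }

  // Race-free alternative: a single read per queue, no separate emptiness check.
  T peekSafe() {
    T head = priorityQueue.peek();
    return head != null ? head : queue.peek();
  }
}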
I guess this code is never used in Reactor - it's just forced on us by the Queue API. But indeed, it is racy, and I guess there is nothing I can do about that.
I meant Reactor in the sense that this is a FluxProcessor and a Fuseable.QueueSubscription - are there guarantees within the Reactor framework that indicate the threading model is safe here? It sounds like you are saying it's known to be safe within our rsocket project.
So, by default it is:

interface QueueSubscription<T> extends Queue<T>, Subscription {

  String NOT_SUPPORTED_MESSAGE = "Although QueueSubscription extends Queue it is purely internal" +
      " and only guarantees support for poll/clear/size/isEmpty." +
      " Instances shouldn't be used/exposed as Queue outside of Reactor operators.";

  ...

  @Override
  @Nullable
  default T peek() {
    throw new UnsupportedOperationException(NOT_SUPPORTED_MESSAGE);
  }
}
and as the error message says, only poll is supposed to be used. So I don't think it matters at all, and the best I can do is remove the peek operation entirely.
@simonbasle @smaldini can you please correct me if I'm wrong?
you are correct @OlegDokuka
Akka has a ControlAwareMailbox for this too.
Any news about this feature?
@linux-china why do you need this? It is flawed - and RSocket as a protocol has everything needed to guarantee the outgoing queue does not grow unbounded.
@mostroverkhov We use metadataPush to push critical information, such as configuration for a Spring Boot app, broker cluster changes, and app status changes. Metadata push is zero-stream-id based, and it is very good for pushing such critical messages so the app can respond ASAP. With prioritised delivery, this helps both metadataPush and keep-alive checking. For example, we now want to implement a token bucket to control message sending; if the stream-0 messages sit in the same queue, the requester and responder cannot exchange critical information via metadataPush.
Signalling additional capacity to a peer while ignoring existing enqueued messages (whose volume may be estimated from time-to-wire delay) removes the natural negative feedback loop that keeps endpoints stable - responses will start to time out, will likely be retried by the load balancer, and will overwhelm the remaining endpoints.
A keep-alive frame can have data attached to it. Most likely this data will be an RTT, as in the HTTP/2 PING frame. Having the RTT include both RSocket peers' outgoing-queue latencies in addition to network latency gives true information about the RSocket's health. A network-only RTT of 1 ms with prioritized keep-alives is not useful if 1000 ms (instead of a target of, e.g., 5 ms) is spent in outgoing queues - in such a case it is just false information: such an RSocket is not healthy, and I want to aggressively reduce its allowed request permits.
This is a downside of the current implementation and nothing more.
Funny enough, the same HTTP/2 spec says that
So I would rather say this PR makes even more sense than before.
@OlegDokuka you are conflating RSocket keep-alives with connection keep-alives, as pointed out by Steve Gury in rsocket/rsocket#280 (comment). As I said above, a network-only RTT is useless for load estimation, and the prioritized keep-alives introduced by this PR just mask the problem when you have a 1 ms keep-alive RTT but requests time out after 5 sec without even hitting the network.
@mostroverkhov Looking back into the history of the protocol and keepalive development, I found this discussion (rsocket/rsocket#8 (comment)), which states that keepalive is more about identifying connection and RSocket problems than about how many messages are enqueued at the application level. Per what @stevegury said, to measure queueing time it is better to use a simple request-response at the application-logic level. @stevegury correct me if I'm wrong. Also, looking over all the issues related to keepalive, it seems that in every case keepalive is mentioned in the context of the client and the connection, not in the context of the user's application (rsocket/rsocket#58 (comment)). @mostroverkhov feel free to open a ticket in the rsocket-spec repo if you have any concerns.
Motivation
In the current implementation, RSocketRequester sends all the data over the UnboundedProcessor, which in a nutshell is an MpScUnboundedArrayQueue and therefore has no prioritization mechanism. In general, there is no need for such functionality until it comes to delivering critical internal frames/payloads such as the KEEPALIVE or LEASE frames, which SHOULD be delivered as soon as possible. The problem comes when the UnboundedProcessor is overwhelmed by other packets, as the sketch below illustrates:
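For illustration, a self-contained sketch of the failure mode (hypothetical code, not from the PR - a plain FIFO queue stands in for the UnboundedProcessor's internal MpScUnboundedArrayQueue, and the frame strings are made up):

import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical demo: in a single FIFO send queue, a KEEPALIVE enqueued
// after a burst of data frames must wait behind every one of them.
public class FifoDelayDemo {
  public static void main(String[] args) {
    Queue<String> sendQueue = new ArrayDeque<>();

    // A burst of data frames enqueued by ongoing streams...
    for (int i = 0; i < 100_000; i++) {
      sendQueue.offer("DATA[" + i + "]");
    }

    // ...then the keep-alive timer fires; its frame joins the back of the line.
    sendQueue.offer("KEEPALIVE");

    // The drain loop (the network writer) reaches the KEEPALIVE last,
    // so it may hit the wire long after the keep-alive deadline.
    int drained = 0;
    while (!"KEEPALIVE".equals(sendQueue.poll())) {
      drained++;
    }
    System.out.println("KEEPALIVE was delayed behind " + drained + " frames");
  }
}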
The example shows how the client can quickly overwhelm its own queue, so all the other packets such as KEEPALIVE or LEASE are simply stacked at the very end and delivered with a significant delay. In particular, a delayed KEEPALIVE can cause an unwanted cancellation of a connection that is in fact alive.
Proposal
To make sure that all the critical frames are delivered as soon as possible, we can add a kind of priority channel - or, directly speaking, a separate MpScUnboundedArrayQueue inside the UnboundedProcessor - as a way to deliver Zero Stream frames with priority. In turn, under the hood, the Processor will be drained as in the following example:
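A minimal sketch of the intended drain order, with assumed field and method names (ConcurrentLinkedQueue stands in for MpScUnboundedArrayQueue to keep the snippet self-contained): every poll prefers the priority queue, so stream-0 frames jump ahead of any buffered data frames.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical two-queue sender: stream-0 frames (KEEPALIVE, LEASE,
// metadata push) go through the priority queue and are drained first.
final class PrioritizedSender<T> {
  private final Queue<T> priorityQueue = new ConcurrentLinkedQueue<>();
  private final Queue<T> queue = new ConcurrentLinkedQueue<>();

  void onNext(T frame) {
    queue.offer(frame);
  }

  void onNextPrioritized(T frame) {
    priorityQueue.offer(frame);
  }

  // Called by the drain loop: the priority queue is always emptied
  // before any regular frame is taken.
  T poll() {
    T frame = priorityQueue.poll();
    return frame != null ? frame : queue.poll();
  }
}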
Benchmarks
The benchmark has shown that the performance impact is insignificant (within a couple of percent for standalone UnboundedProcessor measurements), and no difference was observed for the standard end-to-end RSocket test.