Publication counter seemly wrong #1682

Closed
callmerockett opened this issue Nov 8, 2024 · 4 comments

Comments

@callmerockett

callmerockett commented Nov 8, 2024

TL;DR: Should pub-pos be greater than pub-lmt?

Hi, I was testing a combination of tethered and untethered IPC subscriptions, with the following options:

 ...
      -Dagrona.disable.bounds.checks=false
      -Daeron.mtu.length=1024
      -Daeron.ipc.mtu.length=1024
      -Daeron.socket.so_sndbuf=2m
      -Daeron.socket.so_rcvbuf=2m
      -Daeron.rcv.initial.window.length=2m
      -Daeron.term.buffer.length=2m
      -Daeron.ipc.term.buffer.length=2m
      -Daeron.publication.term.window.length=128k
      -Daeron.ipc.publication.term.window.length=128k
...

Using AeronStat, I got these results:

 56:           13,768,608 - pub-pos (sampled): 22 -1336219639 2000 aeron:ipc
 57:           13,767,392 - pub-lmt: 22 -1336219639 2000 aeron:ipc
 58:    1,731,101,471,743 - client-heartbeat: id=21
 59:    1,731,101,471,750 - client-heartbeat: id=24
 60:           13,768,608 - sub-pos: 25 -1336219639 2000 aeron:ipc?tether=false @0
 64:    1,731,101,471,535 - client-heartbeat: id=50
 66:           13,636,320 - sub-pos: 51 -1336219639 2000 aeron:ipc @9303456

On line 56, pub-pos is greater than pub-lmt (line 57). Is this right?
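
The numbers in the AeronStat output above can be checked with a little arithmetic. The sketch below is only an illustration: the class and method names are made up for this example, and the rule "limit = slowest tethered position + term window" is an assumption that happens to match these figures, not a statement about how the driver computes the counter.

```java
// Illustrative arithmetic on the counter values quoted above.
// PubLimitCheck, expectedLimit and overshoot are hypothetical names.
final class PubLimitCheck {
    // Assumed rule: limit = slowest tethered subscriber position + term window length.
    static long expectedLimit(long slowestTetheredSubPos, long termWindowLength) {
        return slowestTetheredSubPos + termWindowLength;
    }

    // Bytes by which pub-pos exceeds pub-lmt (negative when it is below the limit).
    static long overshoot(long pubPos, long pubLmt) {
        return pubPos - pubLmt;
    }

    public static void main(String[] args) {
        long pubPos = 13_768_608L;           // counter 56
        long pubLmt = 13_767_392L;           // counter 57
        long untetheredSubPos = 13_768_608L; // counter 60 (tether=false)
        long tetheredSubPos = 13_636_320L;   // counter 66
        long termWindow = 128 * 1024;        // aeron.ipc.publication.term.window.length=128k

        System.out.println(expectedLimit(tetheredSubPos, termWindow) == pubLmt); // true
        System.out.println(overshoot(pubPos, pubLmt));                           // 1216
        System.out.println(pubPos == untetheredSubPos);                          // true
    }
}
```

In these figures the limit sits exactly one term window ahead of the tethered subscriber, while pub-pos is 1,216 bytes past the limit and equal to the untethered subscriber's position.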

Some context:

  • Aeron client version: 1.35.1
  • Aeron driver version: 1.46.7
@mjpt777
Contributor

mjpt777 commented Nov 11, 2024

It is possible for the publication limit to be clamped back to the max consumer position when subscribers go away and this could be less than the producer position, i.e. the publication position.
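
The clamping described above can be pictured with a toy model. This is a sketch under stated assumptions, not the driver's code: `LimitClampModel` and `publicationLimit` are hypothetical names, and the rule used while consumers are present (slowest consumer position plus term window) is assumed for illustration.

```java
import java.util.List;

// Toy model of a publication limit being clamped when consumers go away.
// Not Aeron's implementation; all names here are hypothetical.
final class LimitClampModel {
    static long publicationLimit(
        List<Long> consumerPositions, long maxConsumerPositionSeen, long termWindowLength) {
        if (consumerPositions.isEmpty()) {
            // No consumers left: clamp the limit back to the max consumer position.
            return maxConsumerPositionSeen;
        }
        long min = Long.MAX_VALUE;
        for (long position : consumerPositions) {
            min = Math.min(min, position);
        }
        // Assumed rule while consumers are present.
        return min + termWindowLength;
    }

    public static void main(String[] args) {
        long window = 128 * 1024;
        long producerPosition = 200_000L;

        long before = publicationLimit(List.of(150_000L), 150_000L, window);
        long after = publicationLimit(List.of(), 150_000L, window);

        System.out.println(before); // 281072, ahead of the producer
        System.out.println(after);  // 150000, behind the producer at 200000
    }
}
```

In this model the producer has already written up to 200,000 when the only consumer disappears, so the clamped limit of 150,000 ends up below the producer position.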

BTW we would not recommend such an extreme version range between client and driver. We recommend no more than 1 year of releases between versions.

Also those term window and buffer lengths are in ratios that I would not find useful. It seems like your system could benefit from some consulting on how to tune it.

@callmerockett
Author

Hi @mjpt777, thanks for the reply.

Still on this point:

> It is possible for the publication limit to be clamped back to the max consumer position when subscribers go away and this could be less than the producer position, i.e. the publication position.

Could this happen even if the subscriptions are still alive? In the sample above, the untethered subscription has the max consumer position while the publication is held back by the slowest tethered consumer, and both are still running. For more context, below are the configured timeouts for the test:

      -Daeron.threading.mode=SHARED
      -Daeron.spies.simulate.connection=true
      -Daeron.publication.connection.timeout=20s
      -Daeron.publication.unblock.timeout=150s
      -Daeron.publication.linger.timeout=15s
      -Daeron.client.liveness.timeout=30s

These parameters were configured intentionally to analyse our system under back-pressure scenarios. I could reproduce this behavior with the driver at version 1.39.0 as well.

Thanks for the recommendations.

@mjpt777
Contributor

mjpt777 commented Nov 11, 2024

Please provide a full example as a test if you wish to have it further investigated.

@callmerockett
Author

The issue was with untethered subscriptions rejoining at an invalid position after resting, addressed in PR #1672. Closing issue.
