ISSUE-437: SendAsync() can stall with large BatchingMaxPublishDelay #136

Closed
sijie opened this issue Jan 8, 2021 · 0 comments

sijie commented Jan 8, 2021

Original Issue: apache#437


Observed behavior

When using SendAsync() together with a large BatchingMaxPublishDelay, such that batch flushing is driven mainly by BatchingMaxMessages, send stalls can occur.

I don't fully understand the cause of these stalls, but increasing MaxPendingMessages seems to make them go away.

It may be relevant that I am producing to a topic with several partitions.

I originally came across this problem after recently moving from the cgo client: with the same configuration, the pure Go client exhibited much worse throughput for high-rate sends, giving me a maximum of ~6k records/sec where the cgo client had been giving me ~50k/sec.

Steps to reproduce

Create a producer with a large BatchingMaxPublishDelay and the other values left at their defaults, e.g.

pulsar.ProducerOptions{
  Topic:                   topic,
  CompressionType:         pulsar.ZLib,
  BatchingMaxPublishDelay: 100 * time.Second,
}
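
For context, here is a minimal sketch of how such a producer might be created; the broker URL and topic name are placeholders rather than values from my setup:

package main

import (
  "time"

  "github.com/apache/pulsar-client-go/pulsar"
)

func newProducer() (pulsar.Client, pulsar.Producer, error) {
  // Connect to the broker; the URL is a placeholder.
  client, err := pulsar.NewClient(pulsar.ClientOptions{
    URL: "pulsar://localhost:6650",
  })
  if err != nil {
    return nil, nil, err
  }

  // Large BatchingMaxPublishDelay with the other batching values left at
  // their defaults, so flushing is driven mainly by BatchingMaxMessages.
  producer, err := client.CreateProducer(pulsar.ProducerOptions{
    Topic:                   "my-partitioned-topic",
    CompressionType:         pulsar.ZLib,
    BatchingMaxPublishDelay: 100 * time.Second,
  })
  if err != nil {
    client.Close()
    return nil, nil, err
  }
  return client, producer, nil
}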

Enable debug logging and produce to a partitioned topic with a reasonable number of partitions (six, in my case) using SendAsync(), with a callback set that does nothing except crash on error; a sketch of such a loop follows below. Note that the debug log frequently stalls after a "Received send request" message and pauses until a flush initiated by the max publish delay occurs.
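
Continuing from the setup sketch above, the produce loop looks roughly like this (payload contents and message count are illustrative):

import (
  "context"
  "fmt"

  "github.com/apache/pulsar-client-go/pulsar"
)

// produce sends n messages with SendAsync(); the callback does nothing except
// crash the process on error, mirroring the reproduction described above.
func produce(producer pulsar.Producer, n int) {
  for i := 0; i < n; i++ {
    producer.SendAsync(context.Background(), &pulsar.ProducerMessage{
      Payload: []byte(fmt.Sprintf("record-%d", i)),
    }, func(id pulsar.MessageID, msg *pulsar.ProducerMessage, err error) {
      if err != nil {
        panic(err)
      }
    })
  }
}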

Increase MaxPendingMessages to 2000 and try again. The stalls now go away.

Increase MaxPendingMessages to 6000 and try again. The stalls are still gone, and throughput appears somewhat better than in the previous case.
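
The only change relative to the original options is MaxPendingMessages, e.g.:

pulsar.ProducerOptions{
  Topic:                   topic,
  CompressionType:         pulsar.ZLib,
  BatchingMaxPublishDelay: 100 * time.Second,
  MaxPendingMessages:      6000, // raised from the default; removes the stalls in my testing
}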

System configuration

Pulsar version: 2.6.1
Client version: 71cc54f (current master)
