chore(influxv2plugin): Increase accepted retry-after header values. #9619

philjb · 2021-08-12T18:02:51Z

Updates the influxdata v2 output plugin, which is used for influx cloud2, to respect larger retry-after header values. The current cap of 60s is well below the values that influx cloud2 may return; the largest value seen so far is 5 minutes.

This PR increases accepted Retry-After header values to 10 minutes.

I know there is a limited cache size for buffered/delayed write requests and that this cache will begin to drop the oldest writes when the cache is full. However, if the server says that it will not accept a failed request until after 4 minutes (for example) in a retry-after, there is no benefit from trying it every 60s. That scenario creates unnecessary network and server load to reject a request that the client already knows will be rejected based on the retry-after header.

The retry-after header is used in cloud2 specifically by rate limiters. If a write/read/delete request is denied because the customer has exceeded a rate limit, the 429 response will include a retry-after header specifying when the resource will be available again.

Required for all PRs:

Updated associated README.md.
Wrote appropriate unit tests.
Pull request title or commits are in conventional commit format (e.g. feat: or fix:)

resolves #

philjb · 2021-08-12T18:37:41Z

plugins/outputs/influxdb_v2/http.go

 	retry := math.Max(backoff, retryAfterHeader)
-	retry = math.Min(retry, defaultMaxWait)
 	return time.Duration(retry) * time.Second


Because the conversion to time.Duration (an int64) here truncates fractional seconds, the first 6 retries have a retryDuration of 0s, meaning there's no backoff at all.

brettbuddin

Looks reasonable to me.

telegraf-tiger · 2021-08-13T21:05:39Z

Looks like new artifacts were built from this PR. Get them here!

ssoroka · 2021-08-16T18:31:24Z

Do we have a use case where 60s is not enough time for the back-off, or is this more academic? I have a lot of reservations about respecting really high retry-after durations. What's the max retry-after that influxdb is going to respond with? How does it calculate the duration?

In high-throughput cases, a 10 minute wait time is pretty unacceptable. 60 seconds is an eternity if you write out 10k records per second.

philjb · 2021-08-17T23:06:27Z

> Do we have a use case where 60s is not enough time for the back-off

I didn't change the max backoff (still 60s) but yes, today, InfluxCloud2 may respond with a retry-after of 299 seconds. While getting near the maximum (5min) is unlikely for any given user, I've seen customers making requests at 4x the rate limit I added recently for delete requests. Such a customer would get a retry-after of ~200s if they attempted that rate today.

Most customers don't use telegraf for deletes (none?) but the same logic applies for the other cloud2 rate limits - the retry-after may be as large as 5 minutes. It's just wasted effort all around to resend a message at 60s if telegraf already knows it's going to be rejected until e.g. 120s have gone by.

reimda

The influxdb service won't return a large Retry-After header value without good reason. It's not the client's place to second guess the reason and ignore the header value. Looks good to me.

philjb · 2021-10-06T18:15:55Z

I believe this PR fixes: #9353

telegraf-tiger bot added the feat label (Improvement on an existing feature such as adding a new setting/mode to an existing plugin) Aug 12, 2021

philjb force-pushed the master branch from 7d08cc7 to d968400 Compare August 12, 2021 18:22

chore(influxv2plugin): Increase accepted retry-after header values.

46cbeff

philjb force-pushed the master branch from d968400 to 46cbeff Compare August 12, 2021 18:23

brettbuddin approved these changes Aug 12, 2021

philjb added 2 commits August 13, 2021 13:25

chore(influxv2plugin): Fix backoff float rounding issue.

ffcf480

chore(influxv2plugin): Fix backoff float rounding issue, add test.

45535a6

philjb force-pushed the master branch from a1ab6ad to 45535a6 Compare August 13, 2021 20:49

reimda approved these changes Aug 18, 2021

ssoroka approved these changes Aug 25, 2021

ssoroka merged commit 8daba8a into influxdata:master Aug 25, 2021

philjb mentioned this pull request Oct 6, 2021

Exponential backoff upon output error #9353

Closed
