Handle rate limiting error when fetching #32
Conversation
I notice you've implemented your own throttling handler. Will close this PR.
@evilmarty Yeah - that's my bad. I saw the email subject, didn't notice it was a PR, and went "oh crap, that's a regression", and went to fix it. Then I went looking for the issue, noticed it was a PR, but that you'd implemented it slightly differently. Your method would switch immediately to the next log_group (if multiple), which could then be throttled again, but would benefit from the set. My method backs off, but stays in the current group until complete, before moving on. However, it doesn't exit cleanly if asked to do so. I'm still not sure which is better. In general I'm open to opinions on this one.
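For illustration, the back-off-in-place idea might look roughly like this sketch (hypothetical helper name and parameters; not code from either PR):

```ruby
# Hypothetical sketch of the "back off but stay on the current group" idea
# described above. The exception class follows the plugin's conventions,
# but this is illustrative only.
def process_group_with_backoff(group, max_wait: 64)
  wait = 1
  begin
    process_group(group)
  rescue Aws::CloudWatchLogs::Errors::ThrottlingException
    sleep(wait)                       # blocks here, so it won't exit cleanly if asked to stop
    wait = [wait * 2, max_wait].min   # exponential back-off, capped
    retry
  end
end
```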
@lukewaite I've updated my PR with a slight change to my approach. Instead of handling the error at the point of call, I have moved the rescue up to the main loop. As you pointed out, we simply wait during the interval pause and try again, albeit with a different log group, to ensure none are neglected. What are your thoughts?
If CloudWatch raises a ThrottlingException it means there are too many requests. This error bubbles up and causes the plugin to crash in Logstash, which then restarts it. Instead we break for the `interval` duration and then start again from the log group where we left off. After a log group is read, we push its priority to the bottom to ensure every log group is inspected.
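A minimal sketch of that loop, assuming the plugin's `find_log_groups`/`process_group` helpers and the usual Logstash `Stud.stoppable_sleep`/`stop?` conventions (illustrative only, not the exact diff):

```ruby
# Illustrative sketch of the approach described above: rescue the throttling
# error around the whole pass, wait out the interval, then start again from
# where we left off. The priority rotation (pushing a finished group to the
# bottom) is assumed to happen inside find_log_groups/process_group.
def run(queue)
  @queue = queue
  until stop?
    begin
      groups = find_log_groups            # neglected groups come back first
      groups.each do |group|
        process_group(group)
      end
    rescue Aws::CloudWatchLogs::Errors::ThrottlingException
      @logger.debug("Reached rate limit, waiting until the next interval")
    end
    Stud.stoppable_sleep(@interval) { stop? }
  end
end
```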
@evilmarty Sorry for the delay in getting back to you on this one. First pass looks good, but I'd like to test locally before merging.
@lukewaite no worries. I've been using this code for over a week without hiccups in both my staging and production environments. Let me know how you go.
@evilmarty I've reverted 1256571 in prep for merging this, as it's not in your fork, and I don't think it's necessary with these changes.
```ruby
process_group(group)
end # groups.each
begin
groups = find_log_groups
```
I'm wondering if I should move this line out of your begin/rescue here and add back in the handling for `find_log_groups` that was part of 1256571.
I think that if we got throttled fetching log groups here, we might get into a situation where we're always being throttled, and just stopping.
It's probably a very rare corner case... but it might be better to handle it differently than how we handle an interruption while processing a group.
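One way to express that separation, sketched with a hypothetical retry wrapper (not the code from 1256571):

```ruby
# Hypothetical sketch of handling a throttled find_log_groups on its own,
# instead of letting it abort the whole pass over the groups.
def find_log_groups_with_retry(attempts: 3)
  attempts.times do
    begin
      return find_log_groups
    rescue Aws::CloudWatchLogs::Errors::ThrottlingException
      Stud.stoppable_sleep(@interval) { stop? }   # wait, then try the listing again
    end
  end
  []  # give up for this iteration rather than crashing the plugin
end
```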
`find_log_groups` hits the API as well, which can be throttled. If it's not handled, it'll simply error and cause the whole plugin to reload (potentially losing position, etc.). With this change, throttling is always handled for all API calls.
@lukewaite Anything further holding up this PR?
No blockers, @evilmarty, other than making the time for it, which I haven't. My apologies. I'll merge in now and tag a patch release.
Tagged now and published to rubygems.
If CloudWatch raises a ThrottlingException it means there are too many requests. This error bubbles up and causes the plugin to crash in Logstash, which then restarts it. Instead we simply return from `process_group` and try again later on, thus losing track since the last fetch (if not persisted).
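For contrast, the earlier approach amounts to roughly this sketch (hypothetical helper name; position persistence not shown):

```ruby
# Illustrative sketch of the earlier approach: swallow the throttling error
# inside process_group and return, so the group is retried on a later pass.
# fetch_events is a hypothetical stand-in for the actual API-calling code.
def process_group(group)
  fetch_events(group)
rescue Aws::CloudWatchLogs::Errors::ThrottlingException
  @logger.debug("Throttled while reading #{group}, will retry next interval")
  nil   # without a persisted position, progress since the last fetch is lost
end
```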