TARDIS-3783-Analyze and Improve Short Polling Implementation #16

Merged: 1 commit from TARDIS-3783-ImproveShortPolling into master on Sep 6, 2019

Conversation

mhamzak008 (Contributor)

No description provided.

@mhamzak008 mhamzak008 requested a review from zfr July 19, 2019 10:26
@@ -77,6 +75,11 @@ def __init__(self, request_id=None, took=0.0, result=None, data=None, url=None):
         else:
             self.url = None

+        if api_client is not None:

This seems a bit unnecessary.

mhamzak008 (Contributor, Author)

Agreed. I am gonna update the pull request.

@mhamzak008 mhamzak008 requested a review from bkaganyildiz July 19, 2019 10:42
@mhamzak008 mhamzak008 force-pushed the TARDIS-3783-ImproveShortPolling branch from c6a543a to 3d69f08 on July 19, 2019 10:42

         while condition:
             if attempt_count > 0:
-                time.sleep(2 * attempt_count * 0.2)
+                sleep_time = random.uniform(0, (0.2 * 2 ** attempt_count))

So the polling delay now starts at 0.1 s and doubles on every attempt, with a uniformly distributed jitter of +/-100%.

Combined with the short_polling_max_retries default value of 10 (and assuming the off-by-one bug I noted gets fixed), that gives an expected maximum aggregate sleep time of 51 seconds and a worst case of 102 seconds.

This plus any delays in calling get_[incident_]request_status would be the maximum blocking time for any client code calling retrieve_result(), directly or indirectly.

Seems reasonable...
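For reference, a minimal sketch of the backoff scheme described above, assuming a hypothetical fetch_status() stand-in for get_[incident_]request_status() and reusing the MAX_NUMBER_OF_RETRIES and 0.2 base factor that appear in the diff:

```python
import random
import time

MAX_NUMBER_OF_RETRIES = 10  # mirrors the short_polling_max_retries default
BASE_DELAY = 0.2            # seconds; matches the 0.2 factor in the diff


def poll_until_done(fetch_status):
    """Poll fetch_status() with exponential backoff and full jitter."""
    attempt_count = 0
    while True:
        result = fetch_status()
        if result is not None:  # the request has completed, stop polling
            return result
        attempt_count += 1
        if attempt_count > MAX_NUMBER_OF_RETRIES:
            raise TimeoutError("result not ready after %d polls" % attempt_count)
        # The cap doubles on every attempt (0.4, 0.8, 1.6, ...); drawing
        # uniformly from [0, cap] gives an expected sleep of half the cap.
        time.sleep(random.uniform(0, BASE_DELAY * 2 ** attempt_count))
```

The factor of two between the expected (51 s) and worst-case (102 s) totals quoted above is exactly this full-jitter property: a uniform draw from [0, cap] averages cap / 2.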


attempt_count += 1

if should_retry and attempt_count < MAX_NUMBER_OF_RETRIES:

Off-by-one error: if attempt_count is incremented before this comparison, it should be attempt_count <= MAX_NUMBER_OF_RETRIES.

mhamzak008 (Contributor, Author)

Good catch. I will fix it right away.
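For illustration, a tiny self-contained sketch of the boundary in question (the retries_allowed helper is hypothetical; the real loop also sleeps and re-polls, which is omitted here):

```python
MAX_NUMBER_OF_RETRIES = 10


def retries_allowed(comparison):
    """Count how many retries happen when attempt_count is incremented
    before the comparison is evaluated."""
    attempt_count, retries = 0, 0
    while True:
        attempt_count += 1  # incremented before the check
        if not comparison(attempt_count, MAX_NUMBER_OF_RETRIES):
            return retries
        retries += 1        # a retry actually happens here


print(retries_allowed(lambda a, m: a < m))   # 9  -> one retry short
print(retries_allowed(lambda a, m: a <= m))  # 10 -> intended behaviour
```

Checking the count before incrementing would let < keep its meaning; with the ordering in the diff, <= is what yields the full ten retries.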

@@ -263,6 +263,7 @@ def request(self, method, url, query_params=None, headers=None,
             exception = exceptions.build_exception(response=r)
             raise exception

+        self.retries = -1

Tracking retry count here seems a bit hacky and error-prone. You've fixed one problem here but I think there are more lurking. For example, what happens when all retries are exhausted? This method raises and doesn't reset retries to -1 (and it has no way to know that retries have been exhausted in order to do the reset). So I think retry_count is going to be wrong until after the next successful request?

Seems like retry counts should be maintained at the level where the retrying is being done, and passed in somehow?

mhamzak008 (Contributor, Author)

Yup, I agree. In this particular pull request, I was more focused on improving the short polling implementation, but then I realized that the retry_count wasn't being reset in the previous implementation. So, off the top of my head, I just reset it there. I am gonna dig a little deeper into it. Thanks for the heads up :))

mhamzak008 (Contributor, Author)

Upon further investigation, I realized that this particular retry_count is only used for metrics publishing and not for the actual retry policy of the SDK. The number of times the SDK retries an action is managed by the retry library and is independent of this variable. I have added this information to an existing internal issue about improving the retry logic of the SDK, and hopefully the implementation will be improved soon.

mhamzak008 (Contributor, Author)

It's fixed in the following pull request: #17
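For what it's worth, a minimal sketch of the reviewer's suggestion above, with the layer that performs the retries owning the counter and the request layer only consuming it for metrics. Everything here (publish_metric, the retry_count keyword argument, the backoff values) is an illustrative assumption, not the SDK's actual interface:

```python
import time


def publish_metric(name, value):
    print("metric %s=%s" % (name, value))  # stand-in for the real metrics sink


def request_with_retries(api_client, method, url, max_retries=3):
    """Own the retry counter here and pass it down on every attempt,
    so nothing has to be reset on the client after success or failure."""
    for retry_count in range(max_retries + 1):  # 0 is the first attempt
        try:
            response = api_client.request(method, url, retry_count=retry_count)
            publish_metric("request.retry_count", retry_count)
            return response
        except Exception:
            if retry_count == max_retries:
                publish_metric("request.retry_count", retry_count)
                raise
            time.sleep(0.2 * 2 ** retry_count)  # same backoff shape as the polling loop
```

Because the count is recomputed per call rather than stored on the client, it cannot go stale when request() raises, which is the failure mode pointed out above.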

@mhamzak008 mhamzak008 force-pushed the TARDIS-3783-ImproveShortPolling branch from 3d69f08 to de10f77 on July 20, 2019 09:45
@mhamzak008 mhamzak008 force-pushed the TARDIS-3783-ImproveShortPolling branch from de10f77 to 601ec75 on July 26, 2019 14:43
@zfr zfr merged commit d0d981a into master on Sep 6, 2019
@zfr zfr deleted the TARDIS-3783-ImproveShortPolling branch on September 6, 2019 13:59