gvfs-helper: auto-retry after network errors, resource throttling, split GET and POST semantics #208
Conversation
@wilbaker Thanks for the pointers.
See microsoft/git#208 for the Git changes. This should make the gvfs-helper more robust to intermittent failures when a single network call fails.
/azp run Microsoft.git
Azure Pipelines successfully started running 1 pipeline(s).
@jeffhostetler after the 503 update, I was unable to repro any network failures in the C# functional tests. (I got a different flaky failure in the C# code, but that's different ;) ) I'm happy to re-test and approve after you do the cleanup to make this not WIP.
Thanks for the confirmation! Almost finished....
/azp run Microsoft.git
Azure Pipelines successfully started running 1 pipeline(s).
Add a robust retry mechanism to automatically retry a request after network errors. This includes retry after:

- transient network problems reported by CURL
- HTTP 429 throttling (with associated Retry-After)
- HTTP 503 server unavailable (with associated Retry-After)

Add voluntary throttling using Azure X-RateLimit-* hints to avoid being soft-throttled (tarpitted) or hard-throttled (429) on later requests.

Add global (outside of a single request) azure-throttle data to track the rate-limit hints from the cache-server and the main Git server independently.

Add exponential retry backoff. This is used for transient network problems when we don't have a Retry-After hint.

Move the call to index-pack earlier in the response/error handling sequence, so that if we receive a 200 but the packfile is truncated or corrupted, we can use the regular retry logic to get it again.

Refactor the way we create tempfiles for packfiles to use <odb>/pack/tempPacks/ rather than working directly in the <odb>/pack/ directory.

Move the code that creates a new tempfile to the start of each single request attempt (initial and retry attempts), rather than the overall start of a request. This gives us a fresh tempfile for each network request attempt, which simplifies the retry mechanism, isolates us from the file-ownership issues hidden within the tempfile class, and avoids the need to truncate previous incomplete results. This was necessary because index-pack was pulled into the retry loop.

Minor: add support for logging X-VSS-E2EID to telemetry on network errors.

Minor: rename variable params.b_no_cache_server --> params.b_permit_cache_server_if_defined. This variable indicates whether we should try to use the cache-server when it is defined; the rename gets rid of the double-negative logic.

Minor: rename variable params.label --> params.tr2_label to clarify that this variable is only used with trace2 logging.

Minor: move the code that automatically maps cache-server 400 responses to normal 401 responses earlier in the response/error handling sequence to simplify the later retry logic.

Minor: decorate trace2 messages with "(cs)" or "(main)" to identify the server in log messages. Add params->server_type to simplify this.

Signed-off-by: Jeff Hostetler <[email protected]>
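To make the retry shape concrete, here is a minimal sketch of the flow described above. It is not the actual gvfs-helper code: `do_one_attempt()`, `MAX_RETRIES`, and the fields of `attempt_result` are hypothetical names; the point is only the order of decisions (hard failures stop immediately, a Retry-After hint wins when present, and otherwise we back off exponentially).

```c
/*
 * Minimal sketch only -- not the actual gvfs-helper code.
 * do_one_attempt(), MAX_RETRIES, and the fields of attempt_result
 * are hypothetical names used to illustrate the retry shape.
 */
#include <unistd.h>

struct attempt_result {
	int ok;                /* 1 on success */
	int transient_error;   /* CURL-level network failure */
	long http_status;      /* e.g. 429 or 503 */
	long retry_after_sec;  /* parsed Retry-After header, 0 if absent */
};

struct attempt_result do_one_attempt(void); /* hypothetical */

#define MAX_RETRIES 6 /* hypothetical retry budget */

int do_request_with_retry(void)
{
	unsigned int backoff = 1;
	int attempt;

	for (attempt = 0; attempt <= MAX_RETRIES; attempt++) {
		struct attempt_result r = do_one_attempt();

		if (r.ok)
			return 0;

		/* only transient errors, 429, and 503 are retryable */
		if (!r.transient_error &&
		    r.http_status != 429 && r.http_status != 503)
			return -1;

		if (r.retry_after_sec > 0) {
			sleep((unsigned int)r.retry_after_sec); /* honor hint */
		} else {
			sleep(backoff);
			backoff *= 2; /* exponential backoff */
		}
	}
	return -1; /* retry budget exhausted */
}
```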
9a844ff to 0519894 (Compare)
}

/*
 * Get exactly 1 object immediately.
 * Ignore any queued objects.
So, we are assuming that any queued objects will get a flush
request eventually? That sounds reasonable.
Yeah, I've split the queued and immediate usage now. The dry-run/pre-scan loops already handle the queue and drain (for missing blobs usually). The main difference now is that any missing trees (or commits) during those loops will be immediately fetched in isolation, but the queue will remain.
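For anyone reading along, here is a rough sketch of how a caller would use the two paths after this split. The prototypes below are guesses for illustration only and may not match the real declarations in gvfs-helper-client.h.

```c
/*
 * Sketch only: these prototypes are guesses for illustration and
 * may not match the real gvfs-helper-client.h declarations.
 */
struct object_id;

/* fetch exactly one object right away ("objects.get"); queue untouched */
int gh_client__get_immediate(const struct object_id *oid);

/* batch path: accumulate oids, then flush them with "objects.post" */
void gh_client__queue_oid(const struct object_id *oid);
int gh_client__drain_queue(void);

int example(const struct object_id *missing_tree,
	    const struct object_id **missing_blobs, int nr)
{
	int i;

	/* a missing tree is fetched immediately, in isolation */
	if (gh_client__get_immediate(missing_tree))
		return -1;

	/* missing blobs stay queued until the drain, then go out as a batch */
	for (i = 0; i < nr; i++)
		gh_client__queue_oid(missing_blobs[i]);
	return gh_client__drain_queue();
}
```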
Expose the differences in the semantics of GET and POST for the "gvfs/objects" API:

- HTTP GET: fetches a single loose object over the network. When a commit object is requested, it returns just that single object.
- HTTP POST: fetches a batch of objects over the network. When the oid-set contains a commit object, all referenced trees are also included in the response.

gvfs-helper is updated to take "get" and "post" command line options. The gvfs-helper "server" mode is updated to take "objects.get" and "objects.post" verbs.

For convenience, the "get" option and the "objects.get" verb do allow more than one object to be requested. gvfs-helper will automatically issue a series of (single object) HTTP GET requests and create a series of loose objects.

The "post" option and the "objects.post" verb perform bulk object fetching using batch-size chunking. An individual HTTP POST request containing more than one object will be created as a packfile; an HTTP POST for a single object will create a loose object.

This commit also contains some refactoring to eliminate the assumption that POST is always associated with packfiles.

In gvfs-helper-client.c, gh_client__get_immediate() now uses the "objects.get" verb and ignores any currently queued objects.

In gvfs-helper-client.c, the OIDSET built by gh_client__queue_oid() is only processed when gh_client__drain_queue() is called. The queue is processed using the "objects.post" verb.

Signed-off-by: Jeff Hostetler <[email protected]>
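Put differently, the loose-vs-packfile behavior described above boils down to a dispatch roughly like the following. All helper names and the chunk size are made up for the example; this is not the real gvfs-helper internals.

```c
/*
 * Sketch of the dispatch described above. http_get_loose_object(),
 * http_post_batch(), and BATCH_SIZE are hypothetical names, not the
 * real gvfs-helper internals.
 */
#include <stddef.h>

#define BATCH_SIZE 4000 /* hypothetical chunk size for POST requests */

int http_get_loose_object(const char *oid_hex);              /* hypothetical */
int http_post_batch(const char *const *oid_hex, size_t nr);  /* hypothetical */

int fetch_objects(const char *const *oid_hex, size_t nr, int use_post)
{
	size_t i;

	if (!use_post) {
		/*
		 * "get": one HTTP GET per oid; each response is written
		 * as a loose object, and a commit oid returns only that
		 * single commit.
		 */
		for (i = 0; i < nr; i++)
			if (http_get_loose_object(oid_hex[i]))
				return -1;
		return 0;
	}

	/*
	 * "post": send the oid-set in chunks. A multi-object response
	 * arrives as a packfile; a single-object response is written as
	 * a loose object. Commit oids also pull in the trees they
	 * reference.
	 */
	for (i = 0; i < nr; i += BATCH_SIZE) {
		size_t chunk = (nr - i < BATCH_SIZE) ? nr - i : BATCH_SIZE;

		if (http_post_batch(oid_hex + i, chunk))
			return -1;
	}
	return 0;
}
```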
a5d67f2 to 5c65e9a (Compare)
@derrickstolee I think I'm done tinkering with this one. 2.20191023.1 has already been through the functional tests and is good. I just did a final squash on my fixups and am running 2.20191023.2 through its paces.
I'll rebase these commits onto tentative/features/sparse-checkout-2.24.0
for #214 after you merge.
Thanks for all your help!
…out-2.23.0
Upgrade to 2.20191023.7-sc, which corresponds to commit microsoft/git@a782a7e and includes the following changes (since 2.20191015.2-sc):
- microsoft/git#208
- microsoft/git#210
Resolves #195. Includes the following updates to `microsoft/git`:
* microsoft/git#208: gvfs-helper: auto-retry after network errors, resource throttling, split GET and POST semantics.
* microsoft/git#215: gvfs-helper: dramatically reduce progress noise.
I'm marking this WIP because I haven't done a cleanup round nor squashed things yet.
But I want to give the CI builds a chance to run tonight.
This series attempts to:
[x] auto-retry after network outages
[x] throttle back when requested (or demanded) by the server.
Questions:
[done] What is the right default for the network retry limit?
[done] Should the throttle back have a time limit? (It's one thing to wait 3 or 4 minutes between
packfiles because we hit it too hard, but another if it says we should wait an hour or two.)
[no] Should the network retry look at the amount of data received and try to resume it?
[no] Should the network retry split large packfile requests if we can tell the user's network is flakey?
Basic testing with 5 concurrent fetches shows that it is pretty easy to get throttled, which
makes me wonder whether we should even bother with multi-threading this. Perhaps we just
limit it to waiting for index-pack in another thread, but only plan to have 1 network thread.
Or maybe that is only when talking to the main server -- we might be able to multi-thread
when talking to the cache-server.