-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimizing (or remove) the provider delay interval for DHT lookups used by Bitswap #8807
Comments
Consider also determining the delay dynamically so this doesn't have to be manually tweaked. Also, ensure the metrics necessary to determine this value are exposed, and the process for determining the value is documented, so it can be tweaked over time. |
@guseggert In the interest of being iterative can we just push the change through (drop this delay) and add the dynamic value support later? cc: @iand |
I'm definitely 100% in favour of what @aschmahmann suggests! We've done some extensive measurement studies and found that the DHT can, in many cases (mostly depending on geographic location of the client/requestor), complete the lookup in less than 1s - see attached fig. AFAIK, the bitswap delay is set to 1s (notice the shift by 1s between the leftmost and the middle fig), which means that it adds more than 100% of delay before the lookup is complete, in case the bitswap broadcast is unsuccessful. This obviously begs the question of how successful is the Bitswap discovery - my guess is that:
Could we have a hybrid of starting the bitswap discovery and the DHT lookup in parallel, until we figure out more details? I believe the config is this one: https://github.com/ipfs/go-bitswap/blob/dbfc6a1d986e18402d8301c7535a0c39b2461cb7/internal/defaults/defaults.go#L10 which needs to be set to 0? The study we plan to carry out in order to make a justified decision is described here: https://github.com/protocol/network-measurements/blob/master/RFMs.md#rfm-16--effectiveness-of-bitswap-discovery-process - any input more than welcome. |
I will take a look at starting bitswap discovery and the DHT lookup in parallel |
Thanks @mrd0ll4r ! Do you happen to have (even rough) estimates on the success rate of the bitswap discovery process? I.e., out of how many bitswap requests do the successful responses you've measured in the graph above come from? |
Definitely, most important thing is to have the instrumentation in place to observe the impact over time. |
We're running an experiment in Thunderdome for various values for the provider delay (0/20/50/100/200/500/1000 ms). I wrote up some early findings on Notion The TTFB was markedly worse for delay values of 0ms, 20m and 50ms. Possibly we have better TTFB around 500 but I'd want to repeat the experiment with more values around that mark to get more confidence in that. See the Notion doc for more graphs. |
Thanks looking into this @iand ! Difficult to explain what's going on from these results :-D Do you have any explanation? Some questions regarding the experiment setup that might help us get closer to what's happening:
We've done some experiments too and indeed a value around 200ms or a bit more would be recommended: if content is found through Bitswap, the node should have an answer after this period of time (roughly). We couldn't draw conclusions though as it turned out that the group of CIDs we were requesting were NFTs and therefore resolved to the nft.storage cluster, so no DHT was ever involved :-D Need to repeat the experiments with a different set of CIDs. |
This experiment is still running so I will have some additional data to add which I will do when I've analysed it a little.
Thunderdome starts new instances of Kubo for each experiment. In this case it's 8 new instances each with a different config setting for provider delay. During the experiment they all receive the same requests simultaneously.
All instances in an experiment get exactly the same traffic. It is a feed from the live gateways that we throttle to the desired rate. In the experiment I mentioned above we limited to 30rps. However that is excessive for these configurations and I re-ran the experiment over the weekend at 20rps.
We don't control the CIDs - they are whatever the live gateways receive. We filter to just path requests though. The goal is to simulate the real load a gateway is subjected to. That includes 404s, repeated requests, bad cids, range requests etc.
They are run in parallel, continuously until we stop the experiment.
No, I would like this too so I'll be looking at the best way to get this data. |
Interesting setup 👌🏽 But I still can't explain where this performance difference comes from. Are these kubo nodes connected to the gateways? I guess they're not. |
@iand also a couple of questions from my side
Disclaimer: I'm not super familiar with the Bitswap internals. I just had a look through the Bitswap code and think that the This I'm probably missing something, so would appreciate it, if you could shed some light on that. I think I just want to draw attention to unintended side-effects that could arise from decreasing that delay. |
Was this closed by mistake @Jorropo @BigLep? Reopening, as (iiuc) #9053 only added an internal configuration option ( It is still set to 1s (see docs from #9526) which means:
Perhaps after Kubo 0.18 ships with |
We got more insights on this 1s timeout from the Effectiveness of Bitswap Discovery Process study. I am fully in favor of setting the default value of Moreover, reducing the number of directly connected peers (lowering The number of CIDs that Bitswap isn't able to discover is expected to increase significantly it will broadcast the request to ~8x less peers. These unsuccessful requests will wait for The number of additional messages generated by starting a DHT walk concurrently to Bitswap is expected to be minimal (~3). I didn't study IPNI requests, but intuitively the number of additional sent message should be minimal too (1?). However, I don't know whether the Indexers have the capacity to serve ALL requests, or they target only the content that isn't found by Bitswap (in this case, the |
Do we have a sense of what multipler of traffic to DHT, IPNI will be associated with removing the search delay? |
We measured that currently 93.65% are successful within the first second after starting Bitswap, implying that only 6.35% of the request make it to the DHT. Starting the DHT concurrently to Bitswap is expected to increase the number of DHT |
cc @masih
|
An easy intermediate solution would be setting the
That would be a solution to reduce Bitswap broadcast while keeping a high discovery success rate, not to reduce the Content Routers (DHT + IPNI) load. Unless we wait for the response from these peers before asking the Content Routers, in which case the situation is similar as the above. More suggestions to limit Bitswap broadcast are described here |
Thanks! In effort to move forward, I've opened #9530 to set Rationale on the PR, tldr is that default settings in Kubo 0.18 maintains fewer connections (iiuc, this decreases number of bitswap lookup spam messages by ~700), so adding ~4 DHT+IPNI lookups sounds like an acceptable trade-off. Would appreciate review – we would like to resolve this before final Kubo 0.18 release (either |
Great stuff here @guillaumemichel ! I really enjoyed reading Effectiveness of Bitswap Discovery Process. Per #9530 (review), I'm in agreement to drop the delay to 0s. Kubo nodes with default configuration are going to experience dramatically less network traffic given #9483 even with the extra DHT / IPNI requests. cid.contact is the one taking the extra read load, but given they can handle the extra traffic, I think we're good to roll this out. Also, can someone confirm that we have a metric in Kubo that will make it clear what proportion of CIDs are satisfied with Bitswap discovery vs. DHT vs. IPNI. That will be the key metric we'll want to look at when this gets deployed to the ipfs.io gateway. @guillaumemichel : do you think it's reworth running some number with smaller connected peer amounts since that is what Kubo is moving towards in 0.18? |
This would be awesome to have!
It would be useful to benchmark the discovery+fetch latency ( |
I don't believe we have that atm (only saw prometheus stats for bandwidth/block count for bitswap itself, no routing metrics) – cc @guseggert / @ajnavarro to confirm. Notes from today's Kubo standup:
Proposal: run tests again and look at TTFB next to CPU and Bandwidth utilization.
This should give us data to see if there are any unexpected CPU/Bandwidth spikes/savings. @guillaumemichel thoughts? I don't know how flexible id your setup / how much work this involved. Is this something probelab will be able to test before we ship 0.18 (next two-three weeks)? Or should we park this and invest time into building tests and adding mentioned metrics to also understand which CIDs are satisfied with Bitswap discovery vs. DHT vs. IPNI? If we can't benchmark this any time soon, we had a mild consensus during Kubo standup about to lower ProviderSearchDelay to conservative 500ms for Kubo 0.18 – imo worth doing, still better than 1s, but lower risk given #8807 (comment) and lack of better data. |
That is a perfect usecase for Thunderdome! @iand can we easily measure CPU and Bandwidth in addition to the TTFB? We won't be able to start the measurements this week, but we can certainly have them done before two-three weeks.
Agree with that if we don't have better results by then. |
@lidel @ajnavarro Would it be easy to set a different Provider Delay for the DHT and for the Indexers? This would allow us to set the Provider Delay to 0 for the DHT, while keeping it to 1s (or 500ms) for the Indexers, limiting the Indexers' load. |
@guillaumemichel yeah, you can set your own routing config and use the ExecuteAfter parameter for the parallel router to execute delegated routing a bit after the DHT if needed: https://github.com/ipfs/kubo/blob/master/docs/config.md#routingrouters-parameters |
Yes we measure CPU already. Bandwidth can be added, though I might need to grab those metrics from AWS separately |
cross posting from #9530 since i missed it here. no need to not increase indexer usage at the same time as DHTs. Presumably any increase in queries is an equal concern for DHTs as for indexers, so we are okay with the rollout happening together. we'd like to be involved in understanding when those steps happen so we have the same visibility as the stewards. |
|
The Thunderdome Experiment Summary: provdelayrouting measurements and report were performed by @iand 😉 |
Note that this work is blocked on "go-libp2p-kad-dht refactor work" (is there a tracking issue for this @guillaumemichel )? Once that is completed, this can be attempted again. |
We have had a delay inside of go-bitswap where we wait for Bitswap broadcasts before we do a content routing lookup.
This has existed for a long time, and DHT performance has gotten much better in the meanwhile. It seems likely that we can drop this number without causing much more load on major use cases, although some analysis might be needed.
If we find analysis is a pain here, we could at least expose this as a Bitswap.Internal config.
The option is: https://github.com/ipfs/go-bitswap/blob/b6f0cc7c83aaa27a39cc7e1b16ee34bba2d8b5b8/bitswap.go#L73
The text was updated successfully, but these errors were encountered: