interleave docstore fetch #3226
Conversation
Storages are wrapped in a DebounceStorage that catches colliding in-flight fetch requests and debounces them. The commit message is good though.
@trinity-1686a If you need to discuss the debouncer, I think it was implemented by @PSeitz
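For readers unfamiliar with the debouncing pattern mentioned above, here is a minimal, hypothetical sketch of coalescing concurrent fetches of the same path into a single in-flight future. Names like `DebouncedFetcher` and `fetch_from_storage` are illustrative only and are not Quickwit's actual `DebounceStorage` API:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use futures::future::{BoxFuture, FutureExt, Shared};

/// Hypothetical sketch: concurrent requests for the same path share one
/// in-flight future, so the underlying storage fetch runs only once.
#[derive(Clone, Default)]
struct DebouncedFetcher {
    in_flight: Arc<Mutex<HashMap<String, Shared<BoxFuture<'static, Arc<[u8]>>>>>>,
}

impl DebouncedFetcher {
    async fn get(&self, path: &str) -> Arc<[u8]> {
        let shared_fut = {
            let mut in_flight = self.in_flight.lock().unwrap();
            in_flight
                .entry(path.to_string())
                .or_insert_with(|| {
                    let path = path.to_string();
                    // `fetch_from_storage` is a stand-in for the real storage call.
                    async move { fetch_from_storage(&path).await }.boxed().shared()
                })
                .clone()
        };
        // Cleanup of finished entries is omitted for brevity.
        shared_fut.await
    }
}

async fn fetch_from_storage(_path: &str) -> Arc<[u8]> {
    // Placeholder: pretend we read bytes from object storage.
    Arc::from(vec![0u8; 16])
}
```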
On local disk, actually loading the block is very fast, but decompressing it takes a few hundred µs. I did not actually check whether the block was getting loaded twice from storage; what I observed, however, is that this line in tantivy is reached twice, each time taking those few hundred µs, for the same value. I'll add a few logs to make sure.
Ah, understood. I had misread. So the decompression is really the problem here. Also, can you move the decompression out of the tokio loop?
I thought about it, and had a hard time coming up with something that would work okay in both sync and async without being too complex for the benefit.
I plan on doing that at some point, but it won't change the time a single request takes, only how a request impacts the rest of the system.
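For reference, moving CPU-bound decompression off the tokio reactor could look roughly like the sketch below, using `tokio::task::spawn_blocking`. Here `decompress_block` is a hypothetical stand-in, not the actual docstore code:

```rust
use tokio::task;

// Hypothetical stand-in for the docstore block decompression
// (a few hundred µs of CPU work).
fn decompress_block(compressed: Vec<u8>) -> Vec<u8> {
    compressed
}

async fn fetch_and_decompress(compressed: Vec<u8>) -> Vec<u8> {
    // Run the CPU-bound work on tokio's blocking thread pool so it does not
    // stall the reactor threads that drive other in-flight requests.
    task::spawn_blocking(move || decompress_block(compressed))
        .await
        .expect("decompression task panicked")
}
```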
We discussed this a bit with @fulmicoton. This gives good results, but is a bit too "magical"; it at least had the benefit of proving a possible gain. Instead we will add some …
Description
While analyzing traces, I found many situations where the same block of the docstore was decompressed multiple times. This happens when two or more documents stored in the same block are fetched at the same time: each gets a cache miss, loads the block, decompresses it, and writes it back to the cache.
This PR attempts to limit this behavior by interleaving access to the docstore, so that two requests for the same block are more likely to happen one after the other rather than concurrently, while making sure they are not so far apart that the cache evicts the entry in between.
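To illustrate the idea, here is a simplified sketch (not the PR's actual `interleave_sequence_generator`): given `len` documents sorted by address and a `stride` close to the fetch concurrency, a strided visiting order keeps neighbouring documents, which likely share a block, from being in flight at the same time, while keeping them close enough that the cache still holds the block:

```rust
/// Simplified, hypothetical interleaving order: yields 0, stride, 2*stride, ...,
/// then 1, 1+stride, ..., so that indices that are neighbours (and likely live
/// in the same docstore block) end up `stride` positions apart in the fetch
/// order instead of adjacent.
fn interleave_sequence(len: usize, stride: usize) -> impl Iterator<Item = usize> {
    assert!(stride > 0, "stride must be non-zero");
    (0..stride).flat_map(move |offset| (offset..len).step_by(stride))
}

fn main() {
    // With 10 documents and a stride of 4 (roughly the fetch concurrency),
    // documents 0 and 1 are 4 positions apart instead of being fetched together.
    let order: Vec<usize> = interleave_sequence(10, 4).collect();
    assert_eq!(order, vec![0, 4, 8, 1, 5, 9, 2, 6, 3, 7]);
}
```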
How was this PR tested?
A unit test was added to verify that `interleave_sequence_generator` does what it's supposed to do.

Additionally, some performance tests were run on a small index of 2,681,920 docs in 6 splits (a day of gh-archive), for the query `org.login:kubernetes AND repo.name:kubernetes`, fetching 20, 100 and 1000 documents. Results are an average of 100 runs, in microseconds, per value of `max_hits`.

The impact is negligible for a small `max_hits`, but becomes significant when more documents are requested. The impact would be significantly smaller on a dataset with less locality at ingestion time.