Suboptimal cache flush policy when processing datasets #4762
yes, and not an easy one to fix. Basically the design of the global block cache isn't compatible with multi-threaded usage where there is at least one writer. Let's consider that thread T1 reads dataset D1 and thread T2 writes dataset D2. If thread T1 needs to evict blocks from the block cache and it chooses to evict a dirty block that belongs to D2, then it could happen that it does so while T2 is in a method of D2.
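For context, a minimal sketch of the two-thread pattern being described, assuming two GeoTIFF files and Float32 buffers (the file names, sizes and exact I/O calls are my own illustration, not code from this issue): T1 only reads D1, yet a read that needs room in the shared block cache may end up flushing a dirty block of D2 while T2 is inside a D2 method.

```cpp
// Hedged illustration of the two-thread pattern described above: T1 reads D1,
// T2 writes D2, both through the shared global block cache. File names, sizes
// and data types are assumptions made for this sketch.
#include "gdal_priv.h"
#include <thread>
#include <vector>

int main()
{
    GDALAllRegister();

    GDALDataset *poD1 =
        static_cast<GDALDataset *>(GDALOpen("d1.tif", GA_ReadOnly));
    GDALDataset *poD2 =
        static_cast<GDALDataset *>(GDALOpen("d2.tif", GA_Update));
    if (poD1 == nullptr || poD2 == nullptr)
        return 1;

    std::thread t1([poD1]() {
        // T1 only reads D1, but each RasterIO() may have to make room in the
        // global block cache. If the block it evicts is a dirty block of D2,
        // T1 flushes D2 data while T2 may be inside a method of D2.
        std::vector<float> line(poD1->GetRasterXSize());
        for (int iLine = 0; iLine < poD1->GetRasterYSize(); ++iLine)
            poD1->GetRasterBand(1)->RasterIO(
                GF_Read, 0, iLine, poD1->GetRasterXSize(), 1, line.data(),
                poD1->GetRasterXSize(), 1, GDT_Float32, 0, 0);
    });

    std::thread t2([poD2]() {
        // T2 keeps creating dirty blocks of D2 in the same shared cache.
        std::vector<float> line(poD2->GetRasterXSize(), 1.0f);
        for (int iLine = 0; iLine < poD2->GetRasterYSize(); ++iLine)
            poD2->GetRasterBand(1)->RasterIO(
                GF_Write, 0, iLine, poD2->GetRasterXSize(), 1, line.data(),
                poD2->GetRasterXSize(), 1, GDT_Float32, 0, 0);
    });

    t1.join();
    t2.join();
    GDALClose(poD1);
    GDALClose(poD2);
    return 0;
}
```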
Would it be possible that, when deciding to evict a block from the global cache, it first tries to lock it (with a timeout of 0), whatever dataset it belongs to, so that if the lock fails the block is considered in use and is skipped from eviction (trying another block instead)?
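To make the proposal concrete, here is a self-contained toy model (plain C++, no GDAL types; `Block`, `g_lru` and `PickVictim()` are names invented for this sketch): the eviction loop walks the global LRU list from oldest to newest, tries to claim each candidate without waiting, and skips any block that is currently held.

```cpp
// Toy model of the proposed "try-lock before evict" policy. Block, g_lru and
// PickVictim() are invented names for illustration; they are not GDAL types.
#include <atomic>
#include <iostream>
#include <list>

struct Block
{
    int               id;
    bool              dirty;
    std::atomic<bool> inUse{false};  // set while the owning dataset works on it
};

// Global LRU list: front = oldest, back = most recently used.
static std::list<Block *> g_lru;

// Walk from oldest to newest; skip any block we cannot claim immediately.
Block *PickVictim()
{
    for (Block *blk : g_lru)
    {
        bool expected = false;
        // Equivalent of "lock with a timeout of 0": claim the block only if
        // nobody holds it right now, otherwise move on to the next candidate.
        if (!blk->inUse.compare_exchange_strong(expected, true))
            continue;
        return blk;  // caller flushes it if dirty, then clears inUse
    }
    return nullptr;  // every candidate is currently busy
}

int main()
{
    Block a{1, true}, b{2, false};
    g_lru = {&a, &b};

    a.inUse = true;  // simulate a writer currently using the oldest block

    if (Block *victim = PickVictim())
    {
        std::cout << "evicting block " << victim->id << "\n";  // prints 2
        victim->inUse = false;
    }
    return 0;
}
```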
My conclusion from all my past attempts to fix this is that it won't work. One reason is that there's no way to lock another dataset. An I/O mutex has been introduced in GDALDataset for explicit I/O operations, but a dataset object could be in other methods (GetSpatialRef(), etc.) that modify its internal state. I'm afraid there would be other issues too. I believe that for write scenarios only a per-dataset (or per-thread) block cache would be reliable.
...and I assume that it is far from trivial to migrate to a per-dataset or per-thread block cache, since the "size" of the cache becomes a nightmare to track. New idea:
I would try a pull request, but I don't quite fully understand the code of GDALRasterBlock::Internalize().
Under which condition would you allow flushing a cached block of a dataset other than the current one? See
I would like to try my proposal just above of
I'm not a big fan of exposing such low-level internals to users.
I understand your point of view.
I wouldn't push the merge button myself. I have the feeling that block caching for write scenarios needs to be seriously re-designed.
Ok, so you think that adding such a method is doomed and would just become obsolete after a cache redesign, leading to backwards API compatibility problems. So let's stop there, but I would leave this issue open until a cache redesign.
Since I am really concerned by this cache limitation, I did not give up. Here are some pros:
Here are some cons:
What do you think?
Sorry, but I would still be unenthusiastic about merging this... This might work, but good luck documenting it (I have a hard time grasping this myself), and it will add more confusion for anyone trying to understand the current code. This is where open source shines: you can experiment with that in your own fork until a hopefully cleaner solution emerges some day...
I will do that with this proposal, because I personally think it is a far better solution than my previous ideas. It is not so hard to document, though: Thanks for sharing your thoughts anyway.
Just in case: #4852 (it will certainly be refused for now)
I do not understand some of the design choices of GDALRasterBlock::Internalize()
Consider the following scenario (a sketch of this workflow follows the list):
- I create dataset1 and dataset2
- I fill dataset1
- I read dataset1 and write it to dataset2 while performing some processing of the data in a temporary buffer
- Note that the processing is spatially local but requires all the bands for each pixel; that is why the dataset is browsed line by line, handling all bands of each pixel of a row
- To better expose the problem, the cache limit is set slightly above the total size of one dataset
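A sketch of that workflow, under assumed file names, a Float32 pixel-interleaved buffer and placeholder per-pixel processing (none of this comes from the issue itself):

```cpp
// Sketch of the scenario above. File names, raster layout, Float32 buffers
// and the per-pixel processing are assumptions made for illustration only.
#include "gdal_priv.h"
#include <vector>

int main()
{
    GDALAllRegister();

    GDALDataset *poSrc =
        static_cast<GDALDataset *>(GDALOpen("dataset1.tif", GA_ReadOnly));
    GDALDataset *poDst =
        static_cast<GDALDataset *>(GDALOpen("dataset2.tif", GA_Update));
    if (poSrc == nullptr || poDst == nullptr)
        return 1;

    const int nXSize = poSrc->GetRasterXSize();
    const int nYSize = poSrc->GetRasterYSize();
    const int nBands = poSrc->GetRasterCount();

    // Cache limit slightly above the total size of one dataset (Float32
    // samples assumed), as in the scenario described above.
    const GIntBig nOneDataset = static_cast<GIntBig>(nXSize) * nYSize * nBands *
                                static_cast<GIntBig>(sizeof(float));
    GDALSetCacheMax64(nOneDataset + nOneDataset / 10);

    // One full row of all bands at a time: the processing is spatially local
    // but needs every band of each pixel.
    std::vector<float> line(static_cast<size_t>(nXSize) * nBands);

    for (int iLine = 0; iLine < nYSize; ++iLine)
    {
        // Read all bands of this row, pixel-interleaved in the buffer.
        poSrc->RasterIO(GF_Read, 0, iLine, nXSize, 1, line.data(), nXSize, 1,
                        GDT_Float32, nBands, nullptr,
                        sizeof(float) * nBands, 0, sizeof(float));

        // Placeholder per-pixel, all-bands processing.
        for (int i = 0; i < nXSize; ++i)
            for (int b = 0; b < nBands; ++b)
                line[static_cast<size_t>(i) * nBands + b] *= 2.0f;

        // Write the processed row to dataset2.
        poDst->RasterIO(GF_Write, 0, iLine, nXSize, 1, line.data(), nXSize, 1,
                        GDT_Float32, nBands, nullptr,
                        sizeof(float) * nBands, 0, sizeof(float));
    }

    GDALClose(poSrc);
    GDALClose(poDst);
    return 0;
}
```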
What happens is that the current cache policy will spend a lot of time flushing/recalling the blocks of dataset2 instead of making room by dropping cached blocks of dataset1, resulting in very inefficient I/O.
(I found the comment
// In this first pass, only discard dirty blocks of this dataset...
which explains why, but it just does not fit my usage.) Instead of dropping the oldest blocks, the choice is made to drop blocks of the current dataset.
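Expressed as a toy model, my reading of that first-pass restriction is roughly the following (this is a paraphrase, not the actual GDALRasterBlock::Internalize() code; all names and the cache bookkeeping are invented):

```cpp
// Toy paraphrase of the eviction order suggested by the quoted comment; this
// is how I read the policy, not the actual GDALRasterBlock::Internalize()
// code. CachedBlock and the cache bookkeeping below are invented.
#include <iostream>
#include <list>

struct CachedBlock
{
    int         id;
    bool        dirty;
    const void *dataset;  // owner dataset of this block
};

static std::list<CachedBlock> g_lru;  // front = oldest block
static size_t g_cacheUsed = 0;
static size_t g_cacheMax = 2;

static void Evict(std::list<CachedBlock>::iterator it)
{
    if (it->dirty)
        std::cout << "flushing dirty block " << it->id << " to disk\n";
    g_lru.erase(it);
    --g_cacheUsed;
}

// Make room for one new block owned by 'currentDataset'.
static void MakeRoom(const void *currentDataset)
{
    // Pass 0 skips dirty blocks of *other* datasets; pass 1 allows them.
    for (int pass = 0; pass < 2 && g_cacheUsed >= g_cacheMax; ++pass)
    {
        for (auto it = g_lru.begin();
             it != g_lru.end() && g_cacheUsed >= g_cacheMax;)
        {
            auto cur = it++;
            if (pass == 0 && cur->dirty && cur->dataset != currentDataset)
                continue;  // the quoted "first pass" restriction
            Evict(cur);
        }
    }
}

int main()
{
    int dataset1 = 0, dataset2 = 0;  // stand-ins for two dataset objects

    // The oldest cached block is a dirty block of dataset1; the newer one is
    // a dirty block of dataset2, the dataset currently being written.
    g_lru = {{1, true, &dataset1}, {2, true, &dataset2}};
    g_cacheUsed = 2;

    // Prints "flushing dirty block 2 to disk": the block of the *current*
    // dataset is dropped even though it is not the oldest one.
    MakeRoom(&dataset2);
    return 0;
}
```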
Did I miss something, or is this a real problem?