
Suboptimal cache flush policy when processing datasets #4762

Open
chacha21 opened this issue Nov 4, 2021 · 15 comments

@chacha21
Contributor

chacha21 commented Nov 4, 2021

I do not understand some of the design choices of GDALRasterBlock::Internalize().

Consider the following scenario:
  • I create dataset1 and dataset2
  • I fill dataset1
  • I read dataset1 and write it to dataset2, performing some processing of the data in a temporary buffer

Please note that the processing is spatially local but requires all the bands of each pixel; that is why the dataset is traversed line by line, handling all bands of each pixel on a row.

To make the problem easier to observe, the cache limit is set slightly above the total size of one dataset.

What happens is that the current cache policy spends a lot of time flushing/re-reading the blocks of dataset2 instead of making room by dropping the cached blocks of dataset1, resulting in very inefficient I/O.
(I found the comment //In this first pass, only discard dirty blocks of this dataset... explaining why, but it just does not fit my usage.)
Instead of dropping the oldest blocks, the choice is made to drop blocks of the current dataset.

Did I miss something, or is this a real problem?

#include <cstdint>
#include <numeric>
#include <vector>

#include "cpl_conv.h"
#include "cpl_error.h"
#include "gdal_priv.h"

//minimal stubs so the sample is self-contained
static void MyCPLErrorHandler(CPLErr, CPLErrorNum, const char*) { /* log errors here */ }
static void doSomeProcessing(std::vector<std::uint16_t>&) { /* spatially local, all-bands processing */ }

int main()
{
  GDALAllRegister();
  GDALDriverManager* gdalDriverManager = GetGDALDriverManager();
  CPLSetErrorHandlerEx(&MyCPLErrorHandler, nullptr);
  CPLSetConfigOption("CPL_DEBUG", "ON");
  GDALDriver* enviDriver = !gdalDriverManager ? nullptr : gdalDriverManager->GetDriverByName("ENVI");

  const int spatialWidth = 1024;
  const int spatialHeight = 1024;
  const int bandsCount = 256;
  const GDALDataType dataType = GDT_UInt16;
  typedef std::uint16_t pixel_t; //portable replacement for MSVC's unsigned __int16
  std::vector<pixel_t> buffer(spatialWidth*bandsCount);

  //cache limit slightly above the total size of one dataset
  GDALSetCacheMax64(spatialHeight*spatialWidth*bandsCount*sizeof(pixel_t)+10*spatialWidth*bandsCount*sizeof(pixel_t));

  GDALDataset* srcDataset = (GDALDataset*)GDALCreate(enviDriver, "src.bip", spatialWidth, spatialHeight, bandsCount, dataType, nullptr);
  srcDataset->MarkSuppressOnClose();

  GDALDataset* dstDataset = (GDALDataset*)GDALCreate(enviDriver, "dst.bip", spatialWidth, spatialHeight, bandsCount, dataType, nullptr);
  dstDataset->MarkSuppressOnClose();

  //initialize srcDataset
  for(int row = 0 ; row<spatialHeight ; ++row)
  {
    std::iota(buffer.begin(), buffer.end(), static_cast<pixel_t>(row*bandsCount));
    srcDataset->RasterIO(GDALRWFlag::GF_Write, 0, row, spatialWidth, 1,
      buffer.data(), spatialWidth, 1, dataType,
      bandsCount, nullptr,
      bandsCount*sizeof(pixel_t),              //pixel spacing
      spatialWidth*bandsCount*sizeof(pixel_t), //line spacing
      sizeof(pixel_t),                         //band spacing (pixel-interleaved)
      nullptr);
  }

  //copy srcDataset -> dstDataset
  for(int row = 0 ; row<spatialHeight ; ++row)
  {
    srcDataset->RasterIO(GDALRWFlag::GF_Read, 0, row, spatialWidth, 1,
      buffer.data(), spatialWidth, 1, dataType,
      bandsCount, nullptr,
      bandsCount*sizeof(pixel_t),
      spatialWidth*bandsCount*sizeof(pixel_t),
      sizeof(pixel_t),
      nullptr);
    doSomeProcessing(buffer);
    dstDataset->RasterIO(GDALRWFlag::GF_Write, 0, row, spatialWidth, 1,
      buffer.data(), spatialWidth, 1, dataType,
      bandsCount, nullptr,
      bandsCount*sizeof(pixel_t),
      spatialWidth*bandsCount*sizeof(pixel_t),
      sizeof(pixel_t),
      nullptr);
  }

  srcDataset->ReleaseRef();
  dstDataset->ReleaseRef();

  return 0;
}

@rouault
Member

rouault commented Nov 4, 2021

Did I miss something, or is this a real problem?

Yes, and not an easy one to fix. Basically, the design of the global block cache isn't compatible with multi-threaded usage where there is at least one writer. Consider that thread T1 reads dataset D1 and thread T2 writes dataset D2: if T1 needs to evict blocks from the block cache and chooses to evict a dirty block that belongs to D2, it could do so while T2 is inside a method of D2.

@chacha21
Contributor Author

chacha21 commented Nov 4, 2021

Would it be possible, when deciding to evict a block from the global cache, to first try to lock it (with a timeout of 0), whatever dataset it belongs to? If the lock fails, the block would be considered in use and skipped from eviction (trying another block instead).

@rouault
Member

rouault commented Nov 4, 2021

first try to lock it (with a timeout of 0), whatever dataset it belongs to? If the lock fails, the block would be considered in use and skipped from eviction (trying another block instead)

My conclusion from all my past attempts to fix this is that it won't work. One reason is that there's no way to lock another dataset. An I/O mutex has been introduced in GDALDataset for explicit I/O operations, but a dataset object could be inside other methods (GetSpatialRef(), etc.) that modify its internal state. I'm afraid there would be other issues too. I believe that for write scenarios, only a per-dataset (or per-thread) block cache would be reliable.

@chacha21
Contributor Author

chacha21 commented Nov 4, 2021

my conclusion of all my past attempts to fix this is that it won't work. [...] I believe that for write scenarios only a per-dataset (or per-thread) block cache would be reliable

...and I assume it is far from trivial to migrate to a per-dataset or per-thread block cache, since the "size" of the cache becomes a nightmare to track.

A new idea:
add a method std::unique_ptr<foo> GDALAbstractBandBlockCache::makeDatasetPurgeable(GDALDataset* d) that registers the dataset as "purgeable" (its blocks may be evicted). The user takes responsibility for ensuring this won't cause a multi-threading mess. Returning a std::unique_ptr<foo> means the registration is temporary: as soon as it goes out of scope, the "foo" automatically unregisters the dataset to avoid further trouble.

@chacha21
Contributor Author

I would try a pull request, but I don't fully understand the code of GDALRasterBlock::Internalize().
I am pretty sure a minor modification would allow flushing the cache of a dataset other than the current one, but I can't identify it. Can you help?

@rouault
Member

rouault commented Nov 10, 2021

Under which condition would you allow the flush of a cached block of a dataset other than the current one? See
2cef454. The current strategy allows the eviction of non-dirty blocks of other datasets, which is safe.

@chacha21
Contributor Author

I would like to try my proposal just above, std::unique_ptr<foo> GDALAbstractBandBlockCache::makeDatasetPurgeable(GDALDataset* d) (or something like it), to locally make the user responsible for guaranteeing "there is no concurrent R/W".

@rouault
Member

rouault commented Nov 10, 2021

I would like to try my proposal just above, std::unique_ptr<foo> GDALAbstractBandBlockCache::makeDatasetPurgeable(GDALDataset* d) (or something like it), to locally make the user responsible for guaranteeing "there is no concurrent R/W"

I'm not a big fan of exposing such low-level internals to users.

@chacha21
Contributor Author

I understand your point of view.
That's why, instead of adding a method to GDALDataset, I thought it more relevant to add a method to the cache manager: only users who need to tune cache usage would encounter and understand it.
But if you really think the pull request would be denied, I won't push it.

@rouault
Member

rouault commented Nov 10, 2021

But if you really think that the pull request would be denied

I wouldn't push the merge button myself. I have the feeling that block caching for write scenarios needs to be seriously redesigned.

@chacha21
Contributor Author

OK, so you think that adding such a method is doomed: it would just become obsolete after a cache redesign, leading to backward API compatibility problems.
That makes sense.

So let's stop there, but I will leave this issue open until a cache redesign.

@chacha21
Contributor Author

chacha21 commented Nov 17, 2021

Since I am really concerned by this cache limitation, I did not give up.
I have a new proposal that you might find acceptable.
What about an installable global callback bool canFlushFrom(const GDALDataset* datasetBeingProcessed, const GDALDataset* datasetCandidateForFlushing, void* userContext) that would be called from GDALRasterBlock::Internalize() whenever the oldest cache block belongs to a dataset other than the one being processed?

Here are some pros:

  • By default, if the callback is not installed, the answer is obviously "no", which ensures backward compatibility
  • It is up to the (power) user to ensure that the answer can safely be "yes" in a multiple R/W context
  • It is a very minor API modification: just the callback installer
  • It is a very minor GDAL source code modification: just the callback installer and its call site
  • If no callback is installed, it is almost a no-op (no performance regression)
  • It is very easy to deprecate in the future: just stop calling the callback once a cache refactoring makes it unnecessary; this stays user-source-code compatible as long as the installer is kept (it would just become a no-op)

Here are some cons:

  • the user callback implementation must be time-efficient so as not to degrade performance, since it might be called a lot
  • if the callback signature is not well designed, it could be useless by not providing enough context for the user to answer "yes"

What do you think ?

@rouault
Member

rouault commented Nov 17, 2021

Sorry, but I would still be unenthusiastic to merge this... It might work, but good luck documenting it (I have a hard time grasping this myself), and it would add more confusion for anyone trying to understand the current code. This is where open source shines: you can experiment with it in your own fork until a hopefully cleaner solution emerges some day...

@chacha21
Contributor Author

chacha21 commented Nov 17, 2021

This is where open source shines: you can experiment with it in your own fork until a hopefully cleaner solution emerges some day...

I will, with this proposal, because I personally think it is a far better solution than my previous ideas.

It is not so hard to document, though:
"By default, when a dataset being processed needs to allocate a new cache block and the global cache is already full, it is not allowed to flush the cache of other datasets to make room, in order to prevent problems in multi-threading scenarios. This callback asks the user whether it is safe to flush a given dataset, giving a chance to free some cache blocks. Doing so may relieve the pressure on the cache and avoid I/O congestion. It is up to the user to ensure that answering "yes" in a multi-threading scenario will not produce unexpected results when the other dataset is flushed."

Thanks for sharing thoughts anyway.

@chacha21
Contributor Author

Just in case: #4852 (it will certainly be declined for now)
