
Suboptimal cache flush policy when processing datasets #4762

Open
chacha21 opened this issue Nov 4, 2021 · 15 comments

@chacha21
Contributor

chacha21 commented Nov 4, 2021

I do not understand some of the design choices of GDALRasterBlock::Internalize().

Consider the following scenario:
  • I create dataset1 and dataset2
  • I fill dataset1
  • I read dataset1 and write it to dataset2, performing some processing of the data in a temporary buffer

Please note that the processing is spatially local but requires all the bands of each pixel; that is why the dataset is traversed line by line, handling all bands of each pixel on a row.

To make the problem easier to observe, the cache limit is set slightly above the total size of one dataset.

What happens is that the current cache policy spends a lot of time flushing/re-reading the blocks of dataset2 instead of making room by dropping the cached blocks of dataset1, resulting in very inefficient I/O.
(I found the comment //In this first pass, only discard dirty blocks of this dataset... explaining why, but it just does not fit my usage.)
Instead of dropping the oldest blocks, the choice is made to drop blocks of the current dataset.

Did I miss something, or is this a real problem?

#include <cstdint>
#include <numeric>
#include <vector>

#include "cpl_conv.h"
#include "cpl_error.h"
#include "gdal_priv.h"

//minimal stubs so the sample is self-contained
static void MyCPLErrorHandler(CPLErr, CPLErrorNum, const char*) { /* log errors here */ }
static void doSomeProcessing(std::vector<std::uint16_t>&) { /* spatially local, all-bands processing */ }

int main()
{
  GDALAllRegister();
  GDALDriverManager* gdalDriverManager = GetGDALDriverManager();
  CPLSetErrorHandlerEx(&MyCPLErrorHandler, nullptr);
  CPLSetConfigOption("CPL_DEBUG", "ON");
  GDALDriver* enviDriver = !gdalDriverManager ? nullptr : gdalDriverManager->GetDriverByName("ENVI");

  const int spatialWidth = 1024;
  const int spatialHeight = 1024;
  const int bandsCount = 256;
  const GDALDataType dataType = GDT_UInt16;
  typedef std::uint16_t pixel_t; //portable replacement for MSVC's unsigned __int16
  std::vector<pixel_t> buffer(spatialWidth*bandsCount);

  //cache limit slightly above the total size of one dataset
  GDALSetCacheMax64(spatialHeight*spatialWidth*bandsCount*sizeof(pixel_t)+10*spatialWidth*bandsCount*sizeof(pixel_t));

  GDALDataset* srcDataset = (GDALDataset*)GDALCreate(enviDriver, "src.bip", spatialWidth, spatialHeight, bandsCount, dataType, nullptr);
  srcDataset->MarkSuppressOnClose();

  GDALDataset* dstDataset = (GDALDataset*)GDALCreate(enviDriver, "dst.bip", spatialWidth, spatialHeight, bandsCount, dataType, nullptr);
  dstDataset->MarkSuppressOnClose();

  //initialize srcDataset
  for(int row = 0 ; row<spatialHeight ; ++row)
  {
    std::iota(buffer.begin(), buffer.end(), static_cast<pixel_t>(row*bandsCount));
    srcDataset->RasterIO(GDALRWFlag::GF_Write, 0, row, spatialWidth, 1,
      buffer.data(), spatialWidth, 1, dataType,
      bandsCount, nullptr,
      bandsCount*sizeof(pixel_t),              //pixel spacing
      spatialWidth*bandsCount*sizeof(pixel_t), //line spacing
      sizeof(pixel_t),                         //band spacing (pixel-interleaved)
      nullptr);
  }

  //copy srcDataset -> dstDataset
  for(int row = 0 ; row<spatialHeight ; ++row)
  {
    srcDataset->RasterIO(GDALRWFlag::GF_Read, 0, row, spatialWidth, 1,
      buffer.data(), spatialWidth, 1, dataType,
      bandsCount, nullptr,
      bandsCount*sizeof(pixel_t),
      spatialWidth*bandsCount*sizeof(pixel_t),
      sizeof(pixel_t),
      nullptr);
    doSomeProcessing(buffer);
    dstDataset->RasterIO(GDALRWFlag::GF_Write, 0, row, spatialWidth, 1,
      buffer.data(), spatialWidth, 1, dataType,
      bandsCount, nullptr,
      bandsCount*sizeof(pixel_t),
      spatialWidth*bandsCount*sizeof(pixel_t),
      sizeof(pixel_t),
      nullptr);
  }

  srcDataset->ReleaseRef();
  dstDataset->ReleaseRef();

  return 0;
}

@rouault
Member

rouault commented Nov 4, 2021

Did I miss something, or is this a real problem?

Yes, and not an easy one to fix. Basically, the design of the global block cache isn't compatible with multi-threaded usage where there is at least one writer. Consider that thread T1 reads dataset D1 and thread T2 writes dataset D2: if T1 needs to evict blocks from the block cache and chooses to evict a dirty block that belongs to D2, it could do so while T2 is inside a method of D2.

@chacha21
Contributor Author

chacha21 commented Nov 4, 2021

Would it be possible, when deciding to evict a block from the global cache, to first try to lock it (with a timeout of 0), whatever dataset it belongs to? If the lock fails, the block would be considered in use and skipped from eviction (trying another block instead).

@rouault
Member

rouault commented Nov 4, 2021

first try to lock it (with a timeout of 0), whatever dataset it belongs to? If the lock fails, the block would be considered in use and skipped from eviction (trying another block instead)

My conclusion from all my past attempts to fix this is that it won't work. One reason is that there's no way to lock another dataset. An I/O mutex has been introduced in GDALDataset for explicit I/O operations, but a dataset object could be inside other methods (GetSpatialRef(), etc.) that modify its internal state. I'm afraid there would be other issues too. I believe that for write scenarios, only a per-dataset (or per-thread) block cache would be reliable.

@chacha21
Contributor Author

chacha21 commented Nov 4, 2021

my conclusion of all my past attempts to fix this is that it won't work. [...] I believe that for write scenarios only a per-dataset (or per-thread) block cache would be reliable

...and I assume it is far from trivial to migrate to a per-dataset or per-thread block cache, since the "size" of the cache becomes a nightmare to track.

A new idea:
add a method std::unique_ptr<foo> GDALAbstractBandBlockCache::makeDatasetPurgeable(GDALDataset* d) that registers the dataset as "purgeable" (its blocks may be evicted). The user takes responsibility for ensuring this won't cause a multi-threading mess. Returning a std::unique_ptr<foo> means the registration is temporary: as soon as it goes out of scope, the "foo" automatically unregisters the dataset to avoid further trouble.

@chacha21
Contributor Author

I would try a pull request, but I don't fully understand the code of GDALRasterBlock::Internalize().
I am pretty sure a minor modification would allow flushing the cache of a dataset other than the current one, but I can't identify it. Can you help?

@rouault
Member

rouault commented Nov 10, 2021

Under which condition would you allow the flush of a cached block of a dataset other than the current one? See
2cef454. The current strategy allows the eviction of non-dirty blocks of other datasets, which is safe.

@chacha21
Contributor Author

I would like to try my proposal just above, std::unique_ptr<foo> GDALAbstractBandBlockCache::makeDatasetPurgeable(GDALDataset* d) (or something like it), to locally make the user responsible for guaranteeing "there is no concurrent R/W".

@rouault
Member

rouault commented Nov 10, 2021

I would like to try my proposal just above, std::unique_ptr<foo> GDALAbstractBandBlockCache::makeDatasetPurgeable(GDALDataset* d) (or something like it), to locally make the user responsible for guaranteeing "there is no concurrent R/W"

I'm not a big fan of exposing such low-level internals to users.

@chacha21
Contributor Author

I understand your point of view.
That's why, instead of adding a method to GDALDataset, I thought it more relevant to add a method to the cache manager: only users who need to tune cache usage would encounter and understand it.
But if you really think the pull request would be denied, I won't push it.

@rouault
Member

rouault commented Nov 10, 2021

But if you really think that the pull request would be denied

I wouldn't push the merge button myself. I have the feeling that block caching for write scenarios needs to be seriously redesigned.

@chacha21
Contributor Author

OK, so you think that adding such a method is doomed: it would just become obsolete after a cache redesign, leading to backward API compatibility problems.
That makes sense.

So let's stop there, but I will leave this issue open until a cache redesign.

@chacha21
Contributor Author

chacha21 commented Nov 17, 2021

Since I am really concerned by this cache limitation, I did not give up.
I have a new proposal that you might find acceptable.
What about an installable global callback bool canFlushFrom(const GDALDataset* datasetBeingProcessed, const GDALDataset* datasetCandidateForFlushing, void* userContext) that would be called from GDALRasterBlock::Internalize() whenever the oldest cache block belongs to a dataset other than the one being processed?

Here are some pros:

  • By default, if the callback is not installed, the answer is obviously "no", which ensures backward compatibility
  • It is up to the (power) user to ensure that the answer can safely be "yes" in a multiple R/W context
  • It is a very minor API modification: just the callback installer
  • It is a very minor GDAL source code modification: just the callback installer and its call site
  • If no callback is installed, it is almost a no-op (no performance regression)
  • It is very easy to deprecate in the future: just stop calling the callback once a cache refactoring makes it unnecessary; this stays user-source-code compatible as long as the installer is kept (it would just become a no-op)

Here are some cons:

  • the user callback implementation must be time-efficient so as not to degrade performance, since it might be called a lot
  • if the callback signature is not well designed, it could be useless by not providing enough context for the user to answer "yes"

What do you think ?

@rouault
Member

rouault commented Nov 17, 2021

Sorry, but I would still be unenthusiastic to merge this... It might work, but good luck documenting it (I have a hard time grasping this myself), and it would add more confusion for anyone trying to understand the current code. This is where open source shines: you can experiment with it in your own fork until a hopefully cleaner solution emerges some day...

@chacha21
Contributor Author

chacha21 commented Nov 17, 2021

This is where open source shines: you can experiment with it in your own fork until a hopefully cleaner solution emerges some day...

I will, with this proposal, because I personally think it is a far better solution than my previous ideas.

It is not so hard to document, though:
"By default, when a dataset being processed needs to allocate a new cache block and the global cache is already full, it is not allowed to flush the cache of other datasets to make room, in order to prevent problems in multi-threading scenarios. This callback asks the user whether it is safe to flush a given dataset, giving a chance to free some cache blocks. Doing so may relieve the pressure on the cache and avoid I/O congestion. It is up to the user to ensure that answering "yes" in a multi-threading scenario will not produce unexpected results when the other dataset is flushed."

Thanks for sharing thoughts anyway.

@chacha21
Contributor Author

Just in case: #4852 (it will certainly be declined for now)
