Memory not being freed with Image.fromarray #8549

jmspereira · 2024-11-11T18:47:04Z

What did you do?

Hey everyone,
I have an application that uses Pillow to encode NumPy arrays as JPEGs, but I am seeing strange behavior in that application's memory usage.

What did you expect to happen?

All allocated memory to be freed.

What actually happened?

Some memory is not freed.

What are your OS, Python and Pillow versions?

OS: ubuntu 22.04
Python: 3.10.12
Pillow: 11.0.0

--------------------------------------------------------------------
Pillow 11.0.0
Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
--------------------------------------------------------------------
--- PIL CORE support ok, compiled for 11.0.0
--- TKINTER support ok, loaded 8.6
--- FREETYPE2 support ok, loaded 2.13.2
--- LITTLECMS2 support ok, loaded 2.16
--- WEBP support ok, loaded 1.4.0
--- JPEG support ok, compiled for libjpeg-turbo 3.0.4
--- OPENJPEG (JPEG2000) support ok, loaded 2.5.2
--- ZLIB (PNG/ZIP) support ok, loaded 1.2.11
--- LIBTIFF support ok, loaded 4.6.0
--- RAQM (Bidirectional Text) support ok, loaded 0.10.1, fribidi 1.0.8, harfbuzz 10.0.1
*** LIBIMAGEQUANT (Quantization method) support not installed
--- XCB (X protocol) support ok
--------------------------------------------------------------------

Code that reproduces the problem:

import time
import numpy as np
from io import BytesIO
from PIL import Image


def open_pillow_image():
    random_image = (np.random.rand(720, 1280, 3) * 255).astype(np.uint8)

    with BytesIO() as output, Image.fromarray(random_image) as pillow_image:
        pillow_image.save(output, format="jpeg")


def main():
    print("before")
    # Memory here is around 60 MB...
    time.sleep(10)
    open_pillow_image()

    # Memory here is around 65 MB...
    print("after")
    time.sleep(1000)


if __name__ == '__main__':
    main()


Yay295 · 2024-11-11T20:32:26Z

Does anything change if you add

import gc
gc.collect()

after open_pillow_image()?

radarhere · 2024-11-12T00:59:17Z

#7935 (comment)

Pillow's memory allocator doesn't necessarily release the memory in the pool back as soon as an image is destroyed, as it uses that memory pool for future allocations. See Storage.c (https://github.com/python-pillow/Pillow/blob/main/src/libImaging/Storage.c#L310) for the implementation.

jmspereira · 2024-11-12T08:53:28Z

@Yay295, calling the garbage collector explicitly does not make any difference.

@radarhere according to the documentation:

"There is now a memory pool to contain a supply of recently freed blocks, which can then be reused without going back to the OS for a fresh allocation. This caching of free blocks is currently disabled by default (...)" (https://pillow.readthedocs.io/en/stable/reference/block_allocator.html)

It appears that the caching of free blocks should be disabled by default, and tweaking PILLOW_BLOCKS_MAX, as mentioned in the issue that you referenced, does not make any difference.
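One way to check whether free blocks are actually being cached is to read the allocator's own counters. The sketch below assumes the introspection helpers described on the block allocator documentation page (Image.core.get_stats(), Image.core.reset_stats() and Image.core.get_blocks_max()) are available in the installed Pillow. The function name allocator_stats_around_image is made up for illustration, and the import guard is only there so the snippet still loads where Pillow is missing.

```python
# Sketch: snapshot Pillow's block-allocator counters around one image lifetime.
try:
    from PIL import Image
except ImportError:  # lets the snippet load where Pillow is not installed
    Image = None


def allocator_stats_around_image(size=(1280, 720)):
    """Return (before, during, after) snapshots of Image.core.get_stats()."""
    if Image is None:
        return None
    Image.core.reset_stats()
    before = Image.core.get_stats()
    im = Image.new("RGB", size)  # allocates the image storage immediately
    during = Image.core.get_stats()
    del im  # CPython releases the storage once the last reference is gone
    after = Image.core.get_stats()
    return before, during, after


if __name__ == "__main__":
    snapshots = allocator_stats_around_image()
    if snapshots is None:
        print("Pillow is not installed")
    else:
        print("blocks_max:", Image.core.get_blocks_max())
        for label, stats in zip(("before", "during", "after"), snapshots):
            print(label, stats)
```

If caching is disabled, the freed-block counter in the last snapshot would be expected to catch up with the allocated-block counter. These counters only describe Pillow's own bookkeeping, though, and say nothing about what the C library does with the memory afterwards.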

radarhere · 2024-11-12T10:19:56Z

I see, "caching of free blocks" refers to

Pillow/src/libImaging/Storage.c

Lines 315 to 338 in 5bff2f3

    
memory_get_block(ImagingMemoryArena arena, int requested_size, int dirty) {
    ImagingMemoryBlock block = {NULL, 0};
    if (arena->blocks_cached > 0) {
        // Get block from cache
        arena->blocks_cached -= 1;
        block = arena->blocks_pool[arena->blocks_cached];
        // Reallocate if needed
        if (block.size != requested_size) {
            block.ptr = realloc(block.ptr, requested_size);
        }
        if (!block.ptr) {
            // Can't allocate, free previous pointer (it is still valid)
            free(arena->blocks_pool[arena->blocks_cached].ptr);
            arena->stats_freed_blocks += 1;
            return block;
        }
        if (!dirty) {
            memset(block.ptr, 0, requested_size);
        }
        arena->stats_reused_blocks += 1;
        if (block.ptr != arena->blocks_pool[arena->blocks_cached].ptr) {
            arena->stats_reallocated_blocks += 1;
        }

By default, the following is used instead.

Pillow/src/libImaging/Storage.c

Lines 339 to 349 in 5bff2f3

    
    } else {
        if (dirty) {
            block.ptr = malloc(requested_size);
        } else {
            block.ptr = calloc(1, requested_size);
        }
        arena->stats_allocated_blocks += 1;
    }
    block.size = requested_size;
    return block;
}
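The two branches quoted above can be modelled in a few lines of Python to see how the counters move with and without a cache. This is a toy sketch, not Pillow's implementation: ArenaModel and run are hypothetical names, bytearrays stand in for malloc'd blocks, zero-filling and allocation failure are left out, and return_block is an assumed counterpart that does not appear in the quoted excerpt.

```python
from dataclasses import dataclass, field


@dataclass
class ArenaModel:
    """Toy model of the arena; bytearrays stand in for malloc'd blocks."""
    blocks_max: int = 0                       # 0 models caching disabled
    pool: list = field(default_factory=list)  # cached free blocks
    allocated: int = 0
    reused: int = 0
    freed: int = 0

    def get_block(self, size):
        if self.pool:
            block = self.pool.pop()           # take a block from the cache
            if len(block) != size:
                block = bytearray(size)       # stands in for realloc()
            self.reused += 1
            return block
        self.allocated += 1                   # fresh malloc()/calloc()
        return bytearray(size)

    def return_block(self, block):
        # Assumed counterpart of get_block; not part of the quoted excerpt.
        if len(self.pool) < self.blocks_max:
            self.pool.append(block)           # keep the block for later reuse
        else:
            self.freed += 1                   # models free()


def run(blocks_max, rounds=3, size=4096):
    """Get and return one block per round; report (allocated, reused, freed)."""
    arena = ArenaModel(blocks_max=blocks_max)
    for _ in range(rounds):
        arena.return_block(arena.get_block(size))
    return arena.allocated, arena.reused, arena.freed


if __name__ == "__main__":
    print("caching off:", run(0))  # (3, 0, 3)
    print("caching on: ", run(1))  # (1, 2, 0)
```

With the cache off, every block is handed straight back to free(), so in this model any memory that stays resident afterwards is held below the arena (for example by the C library's allocator), not by the pool.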

Testing further, I think the issue occurs when saving, not when loading the array.

radarhere · 2024-11-12T10:49:18Z

If I suggest that calling JpegImagePlugin directly improves the situation, do you agree?

from PIL import JpegImagePlugin
with BytesIO() as output, Image.fromarray(random_image) as pillow_image:
    pillow_image.encoderinfo = {}
    JpegImagePlugin._save(pillow_image, output, "filename")

jmspereira · 2024-11-12T11:19:50Z

Hmm, it doesn't seem to make any difference.

radarhere · 2024-11-12T11:30:49Z

Do you agree that saving is the problem? As in, I think this code should be fine.

with BytesIO() as output, Image.fromarray(random_image) as pillow_image:
    pass

jmspereira · 2024-11-12T11:35:57Z

Hmm, I do not think so. If I run this:

import time
from io import BytesIO

import numpy as np
from PIL import Image


def open_pillow_image():
    random_image = (np.random.rand(720, 1280, 3) * 255).astype(np.uint8)

    with BytesIO() as output, Image.fromarray(random_image) as pillow_image:
        pass


def main():
    print("before")
    time.sleep(10)
    open_pillow_image()
    print("after")
    time.sleep(1000)


if __name__ == '__main__':
    main()

The memory used by the script is larger after opening the image.

radarhere · 2024-11-12T21:01:39Z

Just to be sure, if you remove Pillow, does the problem go away?

import time
from io import BytesIO

import numpy as np


def open_pillow_image():
    random_image = (np.random.rand(720, 1280, 3) * 255).astype(np.uint8)

    with BytesIO() as output:
        pass


def main():
    print("before")
    time.sleep(10)
    open_pillow_image()
    print("after")
    time.sleep(1000)


if __name__ == '__main__':
    main()

jmspereira · 2024-11-13T08:34:29Z

Yes, the problem does not exist without Pillow.
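The before/after figures in these comparisons can also be captured from inside the script instead of being read off an external monitor. A minimal sketch, assuming a Linux system where /proc/self/statm is readable (as on the Ubuntu machine reported above); rss_bytes and rss_delta are illustrative helper names, not part of Pillow or NumPy.

```python
import os


def rss_bytes():
    """Current resident set size in bytes, or None where /proc is unavailable."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])  # second field: resident pages
    except OSError:
        return None
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def rss_delta(work):
    """Run work() and return (before, after) RSS readings in bytes."""
    before = rss_bytes()
    work()
    after = rss_bytes()
    return before, after


if __name__ == "__main__":
    def allocate_and_drop():
        buf = bytearray(720 * 1280 * 3)  # same byte count as the thread's test array
        del buf

    before, after = rss_delta(allocate_and_drop)
    if before is None:
        print("RSS is not available on this platform")
    else:
        print(f"before: {before / 2**20:.1f} MiB, after: {after / 2**20:.1f} MiB")
```

Passing each variant of open_pillow_image as work would give directly comparable numbers. RSS includes loaded code and memory the allocator has not returned to the OS, so a higher reading after the call does not by itself indicate a leak.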

wiredfool · 2024-11-26T21:16:47Z

I've run this under massif, starting with the first example. I've also run with 100 loops, commenting out the write and using smaller images, and passing the random value in, not writing the jpeg. Valgrind/massif ascii art to follow.

Running loops doesn't change the memory, i.e., there don't appear to be leaks. This is running 10 iterations, with a slow loop of +=1 between the trials.



    MB
31.48^                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #                                         
     |                              #:::::::::::::::::::::::@@@@::: ::: :::   
     |                              #:  :   :   :   :   :   @   :   :   :     
     |                              #:  :   :   :   :   :   @   :  ::  ::  @  
     |                              #:  :   :   :   :   :   @   :  ::  ::  @  
     |                        @:@@@@#:  :   :   :   :   :   @   :  ::  ::  @@ 
     |                  :@::::@:@   #:  :   :   :   :   :   @   :  ::  ::  @@@
     |               :@::@::::@:@   #:  :   :   :   :   :   @   :  ::  ::  @@@
   0 +----------------------------------------------------------------------->Gi
     0                                                                   3.609

The peak usage is coming from the numpy manipulation of the array. This is the same run, without any Pillow calls. Oddly, the numpy data remains fully in memory here, whereas loading from it appears to unload a large portion. Perhaps it's lazily evaluated?

    MB
31.47^                                    ##################################  
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                             @::::::#                                 : 
     |                         @@:@@:::   #                                 ::
     |                     :@@:@@:@@:::   #                                 :@
   0 +----------------------------------------------------------------------->Gi
     0                                                                   3.857

It's really hard to get valgrind to sample during a sleep, but running tight loops can make it work: a sleep doesn't get sampled, but for _ in range(10000): i+=1 does.
There's a large ramp of memory that looks a lot like numpy being loaded into the system. This is with a 1x1 image.

    MB
5.714^                                 #####################################  
     |                                 #                                    : 
     |                                 #                                    : 
     |                              @@:#                                    @ 
     |                              @@:#                                    @ 
     |                             @@@:#                                    @:
     |                             @@@:#                                    @:
     |                          @:@@@@:#                                    @@
     |                          @:@@@@:#                                    @@
     |                        @@@:@@@@:#                                    @@
     |                        @@@:@@@@:#                                    @@
     |                       :@@@:@@@@:#                                    @@
     |                       @@@@:@@@@:#                                    @@
     |                      :@@@@:@@@@:#                                    @@
     |                     @:@@@@:@@@@:#                                    @@
     |                     @:@@@@:@@@@:#                                    @@
     |  @::::::::::::::::::@:@@@@:@@@@:#                                    @@
     | :@:              : :@:@@@@:@@@@:#                                    @@
     |::@:              : :@:@@@@:@@@@:#                                    @@
     |::@:              : :@:@@@@:@@@@:#                                    @@
   0 +----------------------------------------------------------------------->Gi
     0                                                                   3.579

My suspicion here is that it's actually the code that's being loaded. 5MB is in the realm of the size I'd expect.
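A rough way to see that a first-time import leaves memory behind is to trace Python-level allocations across it. This sketch uses only the standard library; import_cost is a made-up helper name. tracemalloc does not see memory-mapped shared libraries or allocations made by C code outside Python's allocators, so for compiled extensions such as numpy or Pillow's core it would under-report the real footprint.

```python
import importlib
import sys
import tracemalloc


def import_cost(module_name):
    """Return (already_loaded, bytes of traced Python memory kept after the import)."""
    already_loaded = module_name in sys.modules
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    importlib.import_module(module_name)
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return already_loaded, after - before


if __name__ == "__main__":
    # Any module name can be passed; a repeated import should cost close to nothing.
    for name in ("xml.dom.minidom", "xml.dom.minidom"):
        loaded, delta = import_cost(name)
        print(f"{name}: already loaded={loaded}, retained={delta} bytes")
```

Memory retained this way stays for the life of the process, which would be consistent with a one-off step in usage that does not grow over repeated iterations.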

This is the massif run from the original code, minus the trailing 1000-second sleep. It has all of the significant allocations in the process, at a few snapshots.
massif_run.zip

radarhere · 2024-11-28T06:16:08Z

@jmspereira did that answer your question?

github-actions · 2024-12-07T07:51:35Z

Closing this issue as no feedback has been received.

radarhere added the Memory label Nov 11, 2024

radarhere added the Awaiting OP Action label Nov 30, 2024

github-actions bot added the Stale label Dec 7, 2024

github-actions bot closed this as not planned (Won't fix, can't repro, duplicate, stale) Dec 7, 2024
