Memory leak in fetch #675

Closed
dsanghan opened this issue Nov 15, 2019 · 9 comments

@dsanghan

We're using exchangelib for a large number of Exchange accounts. Last month we updated our fork to the latest master and started seeing a massive spike in virtual memory consumption (several GB over a few hours). Profiling isolated it to fetch. We wrote a small test case to verify that this is indeed the case and profiled it with memory_profiler using the latest release of exchangelib. The leak grows in line with the number of emails fetched.

With 10 emails:

Line #    Mem usage    Increment   Line Contents
================================================
    20     44.6 MiB     44.6 MiB   @profile
    21                             def run_test():
    22     44.6 MiB      0.0 MiB       config = Configuration(service_endpoint='https://outlook.office365.com/ews/exchange.asmx', credentials=credentials)
    23     48.2 MiB      3.6 MiB       exchange = Account(primary_smtp_address=user, config=config, autodiscover=False, access_type=DELEGATE)
    24     48.2 MiB      0.0 MiB       ids = []
    25     48.4 MiB      0.2 MiB       for saved in exchange.inbox.all().order_by('-datetime_received').only('id', 'changekey')[:10]:
    26     48.4 MiB      0.0 MiB           ids.append((saved.id, saved.changekey))
    27     51.2 MiB      2.8 MiB       msgs = list(exchange.fetch(ids, only_fields=['mime_content']))
    28     50.9 MiB      0.0 MiB       del msgs, ids

With 20 emails:

Line #    Mem usage    Increment   Line Contents
================================================
    20     44.6 MiB     44.6 MiB   @profile
    21                             def run_test():
    22     44.6 MiB      0.0 MiB       config = Configuration(service_endpoint='https://outlook.office365.com/ews/exchange.asmx', credentials=credentials)
    23     49.0 MiB      4.3 MiB       exchange = Account(primary_smtp_address=user, config=config, autodiscover=False, access_type=DELEGATE)
    24     49.0 MiB      0.0 MiB       ids = []
    25     49.1 MiB      0.1 MiB       for saved in exchange.inbox.all().order_by('-datetime_received').only('id', 'changekey')[:20]:
    26     49.1 MiB      0.0 MiB           ids.append((saved.id, saved.changekey))
    27     69.5 MiB     20.4 MiB       msgs = list(exchange.fetch(ids, only_fields=['mime_content']))
    28     69.5 MiB      0.0 MiB       del msgs, ids

To be clear, the spike is in virtual memory and not committed memory, but once you hit something like ~20GB in virtual memory, the scripts crash. Any suggestions?
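
For completeness, this is roughly the test script behind the numbers above (the user and credentials are placeholders):

```python
# Roughly the test script behind the profiles above, run under memory_profiler
# (python -m memory_profiler repro.py). USER and PASSWORD are placeholders.
from memory_profiler import profile
from exchangelib import Account, Configuration, Credentials, DELEGATE

user = 'someone@example.com'                 # placeholder
credentials = Credentials(user, 'PASSWORD')  # placeholder

@profile
def run_test():
    config = Configuration(
        service_endpoint='https://outlook.office365.com/ews/exchange.asmx',
        credentials=credentials)
    exchange = Account(primary_smtp_address=user, config=config,
                       autodiscover=False, access_type=DELEGATE)
    ids = []
    for saved in exchange.inbox.all().order_by('-datetime_received').only('id', 'changekey')[:10]:
        ids.append((saved.id, saved.changekey))
    msgs = list(exchange.fetch(ids, only_fields=['mime_content']))
    del msgs, ids

if __name__ == '__main__':
    run_test()
```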

@dsanghan
Author

Fetching a single email with a 10 MB attachment makes the problem worse:

Line #    Mem usage    Increment   Line Contents
================================================
    21     44.6 MiB     44.6 MiB   @profile
    22                             def run_test():
    23     44.6 MiB      0.0 MiB       config = Configuration(service_endpoint='https://outlook.office365.com/ews/exchange.asmx', credentials=credentials)
    24     48.5 MiB      3.8 MiB       exchange = Account(primary_smtp_address=user, config=config, autodiscover=False, access_type=DELEGATE)
    25     48.5 MiB      0.0 MiB       ids = []
    26     48.6 MiB      0.1 MiB       for saved in exchange.inbox.all().order_by('-datetime_received').only('id', 'changekey')[:1]:
    27     48.6 MiB      0.0 MiB           ids.append((saved.id, saved.changekey))
    28    125.9 MiB     77.3 MiB       msgs = list(exchange.fetch(ids, only_fields=['mime_content']))
    29    125.9 MiB      0.0 MiB       del msgs, ids

@ecederstrand
Owner

ecederstrand commented Nov 15, 2019

Hmm. I don't have any concrete ideas, but of course this needs to be fixed.

It could be caused by some global variable in the exchangelib package ballooning, but I'm hard pressed to see which one it could be.

It would be nice to see how memory is distributed in the exchange object, but I don't know of a Python package that can do that visualization.
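
If it helps, pympler can at least report recursive sizes and a per-attribute breakdown, even if it doesn't draw a picture. A sketch, assuming the `exchange` Account object from the test script above:

```python
# Sketch: use pympler to report recursive sizes of the Account object instead
# of a visualization. Assumes `exchange` is the Account from the test script.
from pympler import asizeof

print(asizeof.asizeof(exchange))                    # total bytes reachable from the object
print(asizeof.asized(exchange, detail=1).format())  # one level of per-attribute breakdown
```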

@dsanghan
Author

Some further insight:

   604     49.9 MiB      0.0 MiB           ret = []
   605     49.9 MiB      0.0 MiB           for i in self._consume_item_service(service_cls=GetItem, items=ids, chunk_size=chunk_size, kwargs=dict(
   606     49.9 MiB      0.0 MiB                   additional_fields=additional_fields,
   607     99.9 MiB     40.7 MiB                   shape=ID_ONLY,
   608                                     )):
   609     90.6 MiB      0.0 MiB               if isinstance(i, Exception):
   610                                             ret.append(i)
   611                                         else:
   612     99.9 MiB      9.3 MiB                   item = validation_folder.item_model_from_tag(i.tag).from_xml(elem=i, account=self)
   613     99.9 MiB      0.0 MiB                   ret.append(item)
   614     99.9 MiB      0.0 MiB           return ret

inside _consume_item_service:

   314     90.6 MiB     40.7 MiB           for i in service_cls(account=self, chunk_size=chunk_size).call(**kwargs):
   315     90.6 MiB      0.0 MiB               ret.append(i)

inside get_item:

    23     49.9 MiB      0.0 MiB           return self._pool_requests(payload_func=self.get_payload, **dict(
    24     49.9 MiB      0.0 MiB               items=items,
    25     49.9 MiB      0.0 MiB               additional_fields=additional_fields,
    26     90.6 MiB     40.7 MiB               shape=shape,
    27                                     ))

inside the call to post_ratelimited:

for api_version in api_versions:
   131     50.5 MiB      0.0 MiB               log.debug('Trying API version %s for account %s', api_version, account)
   132     50.5 MiB      0.0 MiB               r, session = post_ratelimited(
   133     50.5 MiB      0.0 MiB                   protocol=self.protocol,
   134     50.5 MiB      0.0 MiB                   session=self.protocol.get_session(),
   135     50.5 MiB      0.0 MiB                   url=self.protocol.service_endpoint,
   136     50.5 MiB      0.0 MiB                   headers=extra_headers(account=account),
   137     50.5 MiB      0.0 MiB                   data=wrap(content=payload, version=api_version, account=account),
   138     50.5 MiB      0.0 MiB                   allow_redirects=False,
   139     81.1 MiB     30.6 MiB                   stream=self.streaming,
   140                                         )
   141     81.1 MiB      0.0 MiB               if self.streaming:
   142                                             # Let 'requests' decode raw data automatically
   143                                             r.raw.decode_content = True
   144                                         else:
   145                                             # If we're streaming, we want to wait to release the session until we have consumed the stream.
   146     81.1 MiB      0.0 MiB                   self.protocol.release_session(session)
   147     81.1 MiB      0.0 MiB               try:
   148     90.8 MiB      9.6 MiB                   header, body = self._get_soap_parts(response=r, **parse_opts)
   149                                         except ParseError as e:
   150                                             raise SOAPError('Bad SOAP response: %s' % e)

Maybe requests has a memory leak?

psf/requests#4601
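
The linked issue is about stream=True responses that are never fully consumed or closed. A generic illustration of that pattern, not exchangelib code (the endpoint and payload are stand-ins):

```python
# Generic illustration of the pattern in the linked requests issue: with
# stream=True, the response must be fully consumed or closed, otherwise the
# connection and its buffers stay alive. The URL and payload are stand-ins.
import requests

url = 'https://example.com/ews/exchange.asmx'   # stand-in endpoint
payload = b'<soap/>'                            # stand-in request body

r = requests.post(url, data=payload, stream=True)
try:
    for chunk in r.iter_content(chunk_size=65536):
        pass  # consume the stream (exchangelib parses it instead)
finally:
    r.close()  # or use `with requests.post(...) as r:` to get this for free
```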

@ecederstrand
Owner

That's a possible explanation. A lot is happening in post_ratelimited(), so it could also be an issue buried in there.

@ecederstrand
Owner

@dsanghan Did you get anywhere with this issue? I haven't had time to debug this yet.

@dsanghan
Author

@ecederstrand I tried replacing requests with urllib3 but didn't see any improvement in memory. For now, I've replaced fetching mime_content with fetching just the text content and disabled certain features. It's still leaking, but we're seeing smaller spikes and lower virtual memory consumption, and the micro-service is able to run for a few days before restarting from memory pressure.

Another thing I noticed is that ElementTree seems to use a lot of memory when parsing large SOAP responses, so that might be worth exploring.
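
For comparison, the usual way to keep ElementTree's footprint down on large documents is incremental parsing with element clearing. A generic sketch of the technique, not exchangelib's actual parsing code:

```python
# Generic sketch of incremental XML parsing with ElementTree: handle each
# element as it completes and clear it so the in-memory tree stays small.
# This is not exchangelib's parser, just an illustration of the technique.
import xml.etree.ElementTree as ET

def iter_elements(source, tag):
    for event, elem in ET.iterparse(source, events=('end',)):
        if elem.tag == tag:
            yield elem
            elem.clear()  # drop the element's children once it's been handled
```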

@dsanghan
Author

@ecederstrand Figured it out - it's the thread pool that's leaking. In a server environment running thousands of accounts, the pool creates sessions_per_account * threads_per_session = 16 new threads per account, leaking roughly 150 MB every few minutes as new accounts spool up. Removing the thread_pool from _pool_requests, along with all its references, solves the massive leak (~30 GB) and brings usage down to a few hundred MB in total.

The thousands of threads create their own contexts that don't show up in memory_profiler, heapy, or mprof, so it was fairly hard to debug.

A simple way to reproduce the issue is to use the example above; a single fetch ends up creating 20 threads. Otherwise, I didn't find any major leaks - almost all objects are properly reclaimed, apart from a few cached properties. It might be worth adding an option to not use thread pools.
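
A minimal way to see the effect outside exchangelib (the pool size of 16 mirrors the sessions_per_account * threads_per_session figure above; everything else is a stand-in):

```python
# Minimal stand-in for the effect: a ThreadPool created per account and never
# closed keeps all of its worker threads alive. Each live thread also reserves
# stack address space, which shows up as virtual rather than committed memory.
import threading
from multiprocessing.pool import ThreadPool

pools = []
for account in range(100):                    # stand-in for accounts spooling up
    pool = ThreadPool(16)
    pool.map(lambda x: x, range(16))          # trivial work
    pools.append(pool)                        # no pool.close()/pool.join()
    print(account, threading.active_count())  # thread count keeps climbing
```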

@ecederstrand
Owner

Great work, @dsanghan!

I'm reopening since this will affect a lot of users. We need to find a solution for this problem in exchangelib.

@ecederstrand ecederstrand reopened this Feb 13, 2020
@ecederstrand
Owner

ecederstrand commented Feb 13, 2020

The ThreadPool memory leak may actually have been fixed in Python 3.8.0: python/cpython@3766f18
