bug: memory leak in PgStream #3419

Closed
OliverNChalk opened this issue Aug 8, 2024 · 1 comment · Fixed by #3467
Labels
bug db:postgres Related to PostgreSQL

Comments

@OliverNChalk

Bug Description

  • 2024-08-08T19:10:15.458367Z: PgListener returns Error::Protocol(msg).
  • 1s later, our app's memory usage jumps from 0.033 GB to 1.32 GB according to netdata.
  • Memory stays at this level until the next Error::Protocol, at which point it jumps again. Each occurrence adds roughly 1.2 GB, and this repeats on every protocol error until the process is OOM-killed.

This is the protocol error msg:

unknown message type: '\\0'

I captured this memory flamegraph with heaptrack:

[image: heaptrack memory flamegraph]

This implies that all of the 1.2 GB is allocated by the PgConnection and never released. When we encounter a protocol error, we drop the PgListener instance and create a new one. Perhaps the Drop implementation of PgListener fails to clean up because the protocol error has left the PgListener in an invalid state?

Minimal Reproduction

I have not been able to produce a minimal reproduction, because the leak only appears after the protocol error.

Info

  • SQLx version: v0.7.4
  • SQLx features enabled: [
    "chrono",
    "postgres",
    "runtime-tokio",
    "time",
    "tls-rustls",
    "bigdecimal",
    ]
  • Database server and version: Postgres 16.3 (with Timescale plugin)
  • Operating system: 22.04.1-Ubuntu
  • rustc --version: rustc 1.76.0 (07dca489a 2024-02-04)
@abonander
Collaborator

abonander commented Aug 27, 2024

What I suspect is happening is that a cancellation of PgListener::recv() is leaving the connection in an invalid state.

A cancellation between these two .awaits in PgStream::recv_unchecked() can leave the stream pointing at non-header data:

let mut header: Bytes = self.inner.read(5).await?;
let format = BackendMessageFormat::try_from_u8(header.get_u8())?;
let size = (header.get_u32() - 4) as usize;
let contents = self.inner.read(size).await?;

Then a subsequent call reads that data as if it were the header of a new message, reads some arbitrary bytes and interprets them as the size of the message it should read, and attempts to allocate and read a message of that size.
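To make the desynchronization concrete, here is a minimal sketch in plain Rust with no sqlx types. Everything in it (encode_frame, read_header, header_seen_after_cancel, try_decode) is a hypothetical helper written for illustration, not sqlx code. It assumes the usual Postgres framing of a 1-byte message type followed by a 4-byte big-endian length that counts itself, and a notification-style body that begins with a 4-byte process ID. If the header is consumed and the read is then cancelled, the next header read lands on the body, and a small process ID yields a leading zero byte, which would be consistent with the "unknown message type: '\\0'" error reported above.

```rust
// Hypothetical model of a length-prefixed stream; names are illustrative only.

/// Builds one frame: tag byte, big-endian length (including itself), body.
fn encode_frame(tag: u8, body: &[u8]) -> Vec<u8> {
    let mut out = vec![tag];
    out.extend_from_slice(&(body.len() as u32 + 4).to_be_bytes());
    out.extend_from_slice(body);
    out
}

/// Consumes a 5-byte header at `pos` and returns (tag, body length).
/// Mirrors the first `.await` in the snippet above: the cursor advances
/// even though the body has not been read yet.
fn read_header(buf: &[u8], pos: &mut usize) -> Option<(u8, usize)> {
    let end = (*pos).checked_add(5)?;
    let h = buf.get(*pos..end)?;
    let len = u32::from_be_bytes([h[1], h[2], h[3], h[4]]);
    let body_len = len.checked_sub(4)? as usize;
    *pos = end;
    Some((h[0], body_len))
}

/// Returns what a second header read sees when the first read was cancelled
/// after the header was consumed but before the body was.
fn header_seen_after_cancel(wire: &[u8]) -> Option<(u8, usize)> {
    let mut pos = 0;
    let _real_header = read_header(wire, &mut pos)?;
    // Cancellation point: the body is never read, so `pos` now sits inside
    // the frame instead of at a frame boundary.
    read_header(wire, &mut pos)
}

/// One cancel-safe alternative: look at the buffered bytes without consuming
/// them, and only report a frame (tag, body, bytes used) once all of it is
/// present. Dropping the caller mid-way leaves the buffer untouched.
fn try_decode(buf: &[u8]) -> Option<(u8, &[u8], usize)> {
    let h = buf.get(..5)?;
    let len = u32::from_be_bytes([h[1], h[2], h[3], h[4]]);
    let total = 5usize.checked_add(len.checked_sub(4)? as usize)?;
    let body = buf.get(5..total)?;
    Some((h[0], body, total))
}
```

Feeding header_seen_after_cancel a frame whose body starts with a process ID such as 20000 returns a tag of 0 and a length far larger than the data actually on the wire, which is the kind of bogus size a subsequent read would try to allocate for. The try_decode variant is only one possible design; it trades an extra buffering step for the guarantee that a cancelled read never leaves the cursor mid-frame.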

I'm not certain about why the memory is leaked. However, I think it's because PgListener uses a PoolConnection<Postgres> internally, and drops that connection if there's an error, but without closing it first. So PoolConnection attempts to return itself to the pool, and spawns a task to flush the data. This could hang forever if the connection is in an invalid state, making it appear to be a memory leak (what I've been calling "live" leaked memory, because it's still tracked somewhere).
