intermittent connection reset by peer errors when using neon.tech #1706

Open
bittermandel opened this issue Mar 1, 2025 · 0 comments
Labels: bug Something is not working.

bittermandel commented Mar 1, 2025

Preflight checklist

Ory Network Project

No response

Describe the bug

We are running upstream Keto in an on-premises Kubernetes cluster. We recently moved our Postgres instance to https://neon.tech instead of running a CNPG instance in the cluster itself.

This has led to very frequent connection reset by peer errors, affecting both Hydra and Keto. I've been in contact with the Neon team, and they suspect something at the application level is causing the issue.

Have you seen this issue before internally, or with other users running on Neon? At this point I am not sure how to debug this.

I did find this issue, though, which seems similar to what we are seeing: jackc/pgx#984.
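
For context, the failure mode described in that pgx issue is a connection that the server (or a proxy in front of it) closes while it sits idle in the client-side pool; the next write on that connection then fails with connection reset by peer. Below is a minimal, self-contained Go sketch of that mitigation using database/sql with the pgx stdlib driver; this is not Keto's actual pool wiring, and the DSN and timeout values are illustrative placeholders:

```go
// Sketch only: shows how capping idle time / lifetime on the client pool
// avoids reusing connections the server has already closed.
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/jackc/pgx/v5/stdlib" // registers pgx as a database/sql driver
)

func main() {
	// Hypothetical DSN; replace with the real Neon endpoint and credentials.
	db, err := sql.Open("pgx", "postgres://user:pass@ep-example.neon.tech:5432/neondb?sslmode=require")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Keep idle time and lifetime well below any server- or proxy-side idle
	// timeout (values here are illustrative, not tuned for Neon).
	db.SetConnMaxIdleTime(30 * time.Second)
	db.SetConnMaxLifetime(5 * time.Minute)
	db.SetMaxIdleConns(4)

	// With a stale connection in the pool, this is where the
	// "connection reset by peer" would surface; with the limits above,
	// idle connections are discarded before they can go stale.
	var one int
	if err := db.QueryRow("SELECT 1").Scan(&one); err != nil {
		log.Fatal(err)
	}
}
```

If Keto exposes equivalent pool limits, setting them below whatever idle timeout Neon applies might avoid handing stale connections to queries, but I haven't verified that yet.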

Reproducing the bug

  1. Run Keto with the default configuration and a PG DSN pointing at a Neon.tech instance without connection pooling (an example DSN follows after these steps).
  2. Ensure there is no active connection open.
  3. Send any read or write request that hits the database; in our case, GetRelationships.
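
For reference, a pool-tuned variant of the DSN in step 1 might look like the excerpt below, assuming the max_conns / max_idle_conns / max_conn_lifetime DSN query parameters documented for other Ory projects also apply to this Keto version; host, database, and credentials are placeholders:

```yaml
# Hypothetical Keto config excerpt; I have not confirmed that
# v0.11.1-alpha.0 honors all of these pool parameters.
dsn: postgres://neondb_owner:<password>@ep-example.neon.tech:5432/neondb?sslmode=require&max_conns=20&max_idle_conns=4&max_conn_lifetime=5m
```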

Relevant log output

Keto logs:

time=2025-02-06T15:08:49Z level=error msg=failed to look up direct access in db audience=application error=map[message:write failed: write tcp 10.0.72.159:50814->72.144.105.10:5432: write: connection reset by peer] method=checkDirect service_name=Ory Keto service_version=v0.11.1-alpha.0


Neon-side logs:

2025-02-28T13:17:21.220670Z  WARN connect_request{protocol=tcp session_id=3890068a-cae8-483d-a1b0-37ebb16f77a0 conn_info=65.109.68.175 role=neondb_owner ep=ep-flat-glitter-a9n862jk}: per-client task finished with an error: peer closed connection without sending TLS close_notify: https://docs.rs/rustls/latest/rustls/manual/_03_howto/index.html#unexpected-eof

2025-02-28T13:17:21.220602Z  INFO connect_request{protocol=tcp session_id=3890068a-cae8-483d-a1b0-37ebb16f77a0 conn_info=65.109.68.175 role=neondb_owner ep=ep-flat-glitter-a9n862jk}:{user="neondb_owner" db=Some("neondb") app=None}: forwarding error to user kind="clientdisconnect" error=peer closed connection without sending TLS close_notify: https://docs.rs/rustls/latest/rustls/manual/_03_howto/index.html#unexpected-eof msg="Internal error"

2025-02-28T13:17:21.220576Z  WARN connect_request{protocol=tcp session_id=3890068a-cae8-483d-a1b0-37ebb16f77a0 conn_info=65.109.68.175 role=neondb_owner ep=ep-flat-glitter-a9n862jk}:authenticate{allow_cleartext=false}: error processing scram messages error=Io(Custom { kind: UnexpectedEof, error: "peer closed connection without sending TLS close_notify: https://docs.rs/rustls/latest/rustls/manual/_03_howto/index.html#unexpected-e

Relevant configuration

Version

v0.11.1-alpha.0

On which operating system are you observing this issue?

Linux

In which environment are you deploying?

Kubernetes

Additional Context

No response
