-
The error that you are getting is a SQLAlchemy error, not related to Flask-SocketIO. It means that your database connection pool is too small for the number of concurrent connections you need. You should review your usage of database sessions and make sure you do not hold on to sessions when you don't need them. If you still get errors after that, then you can increase the size of the SQLAlchemy pool.
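If a larger pool does turn out to be necessary, and assuming Flask-SQLAlchemy is managing the engine, a minimal sketch of how the pool limits can be raised looks like this (the numbers are illustrative placeholders, not recommendations):

```python
# Sketch: raising SQLAlchemy pool limits in a Flask app that uses Flask-SQLAlchemy.
# The values below are placeholders, not tuned recommendations.
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql+psycopg2://user:pass@localhost/dbname"
app.config["SQLALCHEMY_ENGINE_OPTIONS"] = {
    "pool_size": 20,        # connections kept open in the pool (SQLAlchemy default is 5)
    "max_overflow": 30,     # extra connections allowed beyond pool_size (default is 10)
    "pool_timeout": 30,     # seconds to wait for a free connection before raising TimeoutError
    "pool_pre_ping": True,  # check connections are alive before handing them out
}
db = SQLAlchemy(app)
```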
-
I believe I have discovered the issue. Greenlets are non-blocking as long as gevent is left to handle the main event loop, hence the need to monkey patch. But libraries written in C, like psycopg2 (which we are using along with SQLAlchemy), cannot be monkey patched by gevent or eventlet. So if there is, for example, a long-running database query (like in my case) and gevent is in control of the event loop, it can switch to another greenlet that is ready to execute and resume the original greenlet once the slow operation completes. But since psycopg2 is not monkey patched, its operations are all synchronous, so no other greenlet can run until that long, slow query finishes. There is a 'psycogreen' package that apparently alleviates this issue, and it appears to have been resolved in psycopg3. I will investigate whether either of these solves my issue and report back.
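For reference, psycogreen exposes a patch function per async framework that makes psycopg2 cooperate with the event loop via its wait callback. A minimal sketch for gevent (the patch has to run before any database connections are created, ideally right after monkey patching):

```python
# Sketch: making psycopg2 cooperative under gevent with psycogreen.
# Run this as early as possible, before SQLAlchemy creates any connections.
from gevent import monkey
monkey.patch_all()

from psycogreen.gevent import patch_psycopg
patch_psycopg()  # registers a gevent-aware wait callback on psycopg2
```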
-
I'll try to keep this brief, but I have a lot of code examples to show. Please let me know if you need more context!
We recently updated our Flask server to handle WebSocket connections using Flask-SocketIO. There was a major performance hit (we use k6 to load test the prod server, and have CloudWatch logs in Elasticsearch for individual APIs), which made sense because the server can now only use a single worker (we used 4 previously).
To solve this, we adjusted the nginx configuration to load balance across multiple Socket.IO servers. This helped, and overall performance certainly improved. But certain pages take anywhere from 15 to 45 seconds to load. This happens on pages that hit a large number of APIs (50+), and I'm guessing it is related to this error in Sentry, which we never saw before adding Flask-SocketIO
Based on this error, it seems that when too many requests are made at once and the QueuePool limit is reached, the server hangs, causing these extremely slow load times. So increasing the pool size and max overflow should help to alleviate the problem.
Question: How can I optimize our Flask-SocketIO setup to eliminate the QueuePool overflow errors and regain our previous performance levels? Are there architectural changes or advanced techniques beyond increasing resource limits and adding server nodes that I should consider?
Here is the nginx conf
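(The actual conf isn't reproduced in this extract; the block below is a hypothetical sketch of what load balancing several Socket.IO servers behind nginx generally looks like. Socket.IO needs sticky sessions, so `ip_hash` and the WebSocket upgrade headers are the important parts; the upstream ports are made up.)

```nginx
# Hypothetical sketch: nginx load balancing three Socket.IO servers.
# ip_hash provides sticky sessions, which Socket.IO requires; ports are placeholders.
upstream socketio_nodes {
    ip_hash;
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
    server 127.0.0.1:5002;
}

server {
    listen 80;

    location /socket.io {
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_pass http://socketio_nodes;
    }

    location / {
        proxy_set_header Host $host;
        proxy_pass http://socketio_nodes;
    }
}
```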
Here is the server service
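(Likewise not the actual service definition; just a hypothetical sketch of the gunicorn invocation a Flask-SocketIO service typically wraps when using gevent: a single worker per process, with the bind port varying per instance. `module:app` and the port are placeholders.)

```bash
# Hypothetical sketch: one Flask-SocketIO process under gunicorn with a single gevent worker.
gunicorn --worker-class gevent --workers 1 --bind 127.0.0.1:5000 module:app
```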
And here in the deployment is where we spin up three servers
If it helps, here is the Flask-SocketIO related logic
The socket instance
In app.py
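(The original snippets aren't included in this extract. For a multi-node setup like the one described, the relevant piece is usually that the SocketIO instance is created with a shared message queue so the load-balanced servers can deliver events to each other's clients. A hypothetical sketch, assuming gevent and Redis:)

```python
# Hypothetical sketch of the Socket.IO instance for a multi-server deployment.
# The Redis URL is a placeholder; any message queue supported by Flask-SocketIO works.
from flask import Flask
from flask_socketio import SocketIO

socketio = SocketIO()

def create_app():
    app = Flask(__name__)
    # message_queue lets the load-balanced server processes share events;
    # async_mode="gevent" matches running under gunicorn's gevent worker.
    socketio.init_app(app, message_queue="redis://localhost:6379/0", async_mode="gevent")
    return app
```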
Let me know if any more context is needed (e.g. k6 summaries, gunicorn logs)