global connection pool? #2

andrenam · 2017-12-01T14:43:38Z

Hi Maurits,

I just read your blog post on https://tqdev.com/2017-porting-php-crud-api-to-python
Don't you need a global to hold your connection pool to mysql?
For example, the flask framework offers flask.g to hold all globals (see http://flask.pocoo.org/docs/0.12/api/#application-globals) like database connections.

To share data that is valid for one request only from one function to another, a global variable is not good enough because it would break in threaded environments.

I don't know what the "correct" solution is here, but a simple move link = mysql.connector.connect(..) to a global variable gave me a performance boost of 3x.

The text was updated successfully, but these errors were encountered:

mevdschee · 2017-12-01T16:33:28Z

Hi @andrenam,

Thank you for contacting me and for making this suggestion. I'm a bit skeptical about the improvement this would bring, as I am already using connection pooling. The manual says:

To create a connection pool implicitly: Open a connection and specify one or more pool-related arguments (pool_name, pool_size).

I'm using implicit connection pooling, as you can see here:

https://github.com/mevdschee/py-crud-api/blob/master/api.py#L19

You are suggesting explicit connection pooling, which does not make a difference in my tests. If you would be able to do a PR then I would know exactly what you are talking about (maybe I'm doing it wrong). I would be very grateful as I would love to prove you right as a 3x improvement sounds great to me!

Kind regards,

Maurits

andrenam · 2017-12-01T18:26:34Z

Hi Maurits,

but maybe the cost of creating the connection object and the cursor with each request is too high?

I just played around and using a global variable to hold the link and cursor and a second variant using Singleton to hold the link (not cursor):

I used gunicorn with 4 threads, but a remote mysql server on a different machine on my network. so the cost of creating the connection is probably a little higher than the regular case.

original:            881.09 requests/sec
global variable:    2509.86 requests/sec
DB Singleton:       1735.19 requests/sec

See commit andrenam@b5cb20a
I added two new copies instead of overwriting the api.py to make it easier to compare them..

But I'm not sure whats the right way to do it, I don't have the expierence in python webprogramming..

mevdschee · 2017-12-01T20:19:31Z

Hi @andrenam,

That is a very interesting approach. You are not doing connection pooling, but you are using a single database connection, which does does make some sense (as we are also using a single thread). This way I am also able to get a higher performance. I never tried this approach before (on any implementation), but it will most likely also help the other implementations (such as nodejs). Great discovery!! I will update my blog post to reflect this interesting finding.

Thank you very much!

Maurits

NB: I wondered about the thread safety in the gunicorn case, but it seems not to be a problem there as it uses picoev and is thus single-threaded on every worker.

mevdschee · 2017-12-01T22:03:52Z

I still wonder about the importance of the things a connection pool does (or might do) to get you a "clean" connection as we are obviously skipping those as well (see also: http://blog.ulf-wendel.de/2011/using-mysql-with-php-mysqli-connections-options-pooling/).

MySQL does a mysql_change_user() call when getting a connection from the pool, which:

It always performs a ROLLBACK of any active transactions, closes and drops all temporary tables, and unlocks all locked tables. Session system variables are reset to the values of the corresponding global system variables. Prepared statements are released and HANDLER variables are closed. Locks acquired with GET_LOCK() are released. These effects occur even if the user did not change.

(source)

For most applications it would probably not be needed to do all these things, so maybe it is a good thing to just re-use the connection without using connection pooling as it is expensive (in Python).

andrenam · 2017-12-02T11:19:04Z

Glad to give you some input. :-)

mevdschee · 2017-12-02T12:07:52Z

I also wondering how (idle timeout) disconnects are handled when using a single connection. I guess this needs some further investigation :-)

mevdschee self-assigned this Dec 1, 2017

mevdschee added the question label Dec 1, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

global connection pool? #2

global connection pool? #2

andrenam commented Dec 1, 2017

mevdschee commented Dec 1, 2017 •

edited

Loading

andrenam commented Dec 1, 2017

mevdschee commented Dec 1, 2017 •

edited

Loading

mevdschee commented Dec 1, 2017 •

edited

Loading

andrenam commented Dec 2, 2017

mevdschee commented Dec 2, 2017 •

edited

Loading

global connection pool? #2

global connection pool? #2

Comments

andrenam commented Dec 1, 2017

mevdschee commented Dec 1, 2017 • edited Loading

andrenam commented Dec 1, 2017

mevdschee commented Dec 1, 2017 • edited Loading

mevdschee commented Dec 1, 2017 • edited Loading

andrenam commented Dec 2, 2017

mevdschee commented Dec 2, 2017 • edited Loading

mevdschee commented Dec 1, 2017 •

edited

Loading

mevdschee commented Dec 1, 2017 •

edited

Loading

mevdschee commented Dec 1, 2017 •

edited

Loading

mevdschee commented Dec 2, 2017 •

edited

Loading