Race in the pool using PgPool #10198

Closed
viniciusfcf opened this issue Jun 23, 2020 · 9 comments
Labels
kind/bug Something isn't working triage/out-of-date This issue/PR is no longer valid or relevant

Comments

@viniciusfcf
Contributor

Describe the bug
Running a small load test using jaxrs, mutiny and reactive-pg-client, the throughput was worse (6x) than when using only jaxrs and panache.

Expected behavior
Throughput should be better using mutiny and reactive-pg-client.

To Reproduce
Steps to reproduce the behavior:

Packaging and running the application

  1. git clone github.com/viniciusfcf/quarkus-benchmark/
  2. docker run -p 5432:5432 viniciusfcf/postgres:latest (Database with 100.000 rows on table "user")
  3. cd <PROJECT_NAME> (ex:jaxrs-reactive)
  4. java -jar target/<PROJECT_NAME>-1.0-SNAPSHOT-runner.jar

Running the benchmark

  1. rm -rf report-jaxrs* /tmp/jmeter*log
  2. jmeter -n -t select.jmx -p jmeter.properties -l /tmp/jmeter-<PROJECT_NAME>.log -e -o report-<PROJECT_NAME>
  3. Open report-<PROJECT_NAME>/index.html file

Screenshots

JAXRS project Screenshots:
image
image

JAXRS-REACTIVE project Screenshots:
image
image

image

Environment (please complete the following information):

  • Output of uname -a or ver: Linux G3 5.3.0-59-generic #53~18.04.1-Ubuntu SMP Thu Jun 4 14:58:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  • Output of java -version: openjdk version "11.0.7" 2020-04-14
    OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.7+10)
    OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.7+10, mixed mode)

  • GraalVM version (if different from Java):

  • Quarkus version or git rev: 1.5.2

  • Build tool (ie. output of mvnw --version or gradlew --version): 3.6.3

John O'Hara's response by mail:

I tried running the benchmark and *sometimes* I see a single postgres process running, other times I see multiple postgres processes running.  This looks like it might be a race in the pool.

The default maximum number of connections in PgPool is 4; you can configure this with -Dquarkus.datasource.reactive.max-size= on the command line, or add the property to application.properties.

After increasing the pool size, I see an increase in throughput, but it is not CPU bound. PgPool does not perform as well as Agroal under heavy contention, and in this test the pool is heavily contended.

Could you please open an issue with the details in this thread? I think we need to look at how PgPool performs in this scenario.

thanks

@viniciusfcf viniciusfcf added the kind/bug Something isn't working label Jun 23, 2020
@gsmet
Member

gsmet commented Jun 24, 2020

/cc @johnaohara @tsegismont

@viniciusfcf
Contributor Author

Still happens in version 1.6.1

@johnaohara
Member

Apologies, I have not had any time to investigate this yet. @barreiro have you observed this behaviour when load testing with the Vert.x PgPool?

@tsegismont
Contributor

tsegismont commented Aug 18, 2020

@viniciusfcf @johnaohara I looked into the pool implementation, as I was surprised that only one connection was used with jax-rs reactive + PgPool.

It turns out this is due to the implementation of "one shot queries" on PgPool (and other reactive SQL clients actually). As of 3.9.2 (the benchmark depends on 3.9.1) the pool releases the connection immediately after having scheduled the SQL command. As a consequence, all requests are pipelined on a single connection instead of using the max-size number of connections.

I changed the User#findAll impl to:

		// Acquire a dedicated connection so that each request holds its own pooled connection
		return client.getConnection().onItem().produceMulti(conn -> {
			return conn.preparedQuery("SELECT id, age, creation, firstname, id, points, salary, verified FROM public.\"user\" ORDER BY firstname ASC LIMIT $1 OFFSET $2")
				.execute(Tuple.of(pageSize, pageIndex * pageSize))
				// Hand the connection back to the pool once the query has succeeded or failed
				.onItemOrFailure().invoke((rows, throwable) -> conn.close())
				.onItem().produceMulti(set -> Multi.createFrom().iterable(set))
				.onItem().apply(User::from);
		});

Then all connections are used:

postgres=# select count(*) from pg_stat_activity where application_name like '%vertx%';
 count 
-------
    20
(1 row)
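
To make the difference between the two acquisition styles concrete, here is a toy model in plain Java. It is only a sketch under simplifying assumptions and is not the Vert.x pool code: the class `PoolModel` and the method `distinctConnectionsUsed` are hypothetical names, all requests are assumed to arrive before any query completes, and the pool size of 20 simply mirrors the count shown above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model of a bounded connection pool; NOT the Vert.x implementation. */
class PoolModel {

    /**
     * Counts how many distinct connections end up serving a burst of requests
     * that all arrive before any query has completed.
     *
     * @param releaseOnSchedule true: the connection goes back to the pool as soon
     *                          as the command has been queued on it (one-shot query);
     *                          false: the caller keeps it until the query is done,
     *                          which in this model is after the whole burst.
     */
    public static int distinctConnectionsUsed(boolean releaseOnSchedule, int maxSize, int requests) {
        Deque<Integer> idle = new ArrayDeque<>(); // connections that can be acquired right now
        Set<Integer> used = new HashSet<>();      // connections that received at least one command
        int created = 0;

        for (int r = 0; r < requests; r++) {
            int conn;
            if (!idle.isEmpty()) {
                conn = idle.pop();                // reuse an idle connection first
            } else if (created < maxSize) {
                conn = created++;                 // otherwise open a new one, up to maxSize
            } else {
                continue;                         // pool exhausted: the request waits and will
                                                  // later reuse an already counted connection
            }
            used.add(conn);
            if (releaseOnSchedule) {
                idle.push(conn);                  // immediately available again: next request pipelines on it
            }
        }
        return used.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctConnectionsUsed(true, 20, 100));  // prints 1
        System.out.println(distinctConnectionsUsed(false, 20, 100)); // prints 20
    }
}
```

In this model, releasing on schedule means the first connection is always idle again when the next request arrives, so everything is pipelined onto it. Holding the connection until completion forces the pool to open new connections up to its maximum size.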

The throughput for reactive-jaxrs becomes much better. I will file a bug on the Vert.x SQL client repo and keep you informed.

Note that the throughput of reactive-jaxrs is still (significantly) lower in this benchmark. It's not very wise to comment on a benchmark where injector, server and database all run on the same machine. But keep in mind that in scenarios where the server's work for a single request consists only of sending a single request to the DB and relaying the results, the blocking impl has a good chance of giving better results (unless pipelining is used on all connections).
Also, in the current implementation of PgPool, all connections in the pool are handled by the event loop thread that was created by (or provided to) the pool. So a single core handles all the IO, whereas in the blocking scenario the IO is shared between several threads.

@johnaohara
Member

@tsegismont Thank you for investigating this issue.

@tsegismont
Contributor

@cescoffier
Member

@tsegismont should we consider this one closed?

@tsegismont
Contributor

@cescoffier yes, the issue has been fixed in Vert.x SQL Client 3.9.3

@cescoffier
Member

Thanks @tsegismont

@cescoffier cescoffier added the triage/out-of-date This issue/PR is no longer valid or relevant label Jan 11, 2021