[DocDB] Turning on pg_client_use_shared_memory causes regression in scan workloads #23999

Closed
1 task done
spolitov opened this issue Sep 18, 2024 · 0 comments
Assignees
Labels
2024.2 Backport Required area/docdb YugabyteDB core features kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@spolitov
Contributor

spolitov commented Sep 18, 2024

Jira Link: DB-12886

Description

Currently, the shared memory segment used for pg client communication is created with the page size.
By default, that is just 4KB.
When a response does not fit into the segment, it is transferred via RPC instead.
This fallback increases latency by nearly 10ms per 1000 requests.

During a scan, we fetch data in chunks of 1000 rows.
So nearly all scan read responses do not fit into 4KB, and we fall back to RPC every time.
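
To make the size argument concrete, here is a minimal sketch of the fit-or-fall-back decision described above. It is an illustration only, not the actual YugabyteDB code: the function names and the 8 bytes per row figure are assumptions; only the 4KB segment size and the 1000-row chunk come from this issue.

```python
# Illustrative model of the fallback described in this issue.
# All names here are hypothetical; this is not the YugabyteDB implementation.

PAGE_SIZE = 4096   # default segment size mentioned in the issue (4KB)
SCAN_CHUNK_ROWS = 1000  # rows fetched per scan chunk, per the issue


def transport_for(response_size: int, segment_size: int = PAGE_SIZE) -> str:
    """Return which path a response of the given size would take."""
    if response_size <= segment_size:
        return "shared_memory"
    return "rpc"  # response does not fit, so it falls back to RPC


def scan_response_size(rows: int = SCAN_CHUNK_ROWS, bytes_per_row: int = 8) -> int:
    """Rough lower bound on a scan response; bytes_per_row is an assumption."""
    return rows * bytes_per_row


if __name__ == "__main__":
    size = scan_response_size()
    print(size, transport_for(size))
```

Even with an assumed payload of only 8 bytes per row, a 1000-row chunk is about 8000 bytes, which already exceeds a 4KB segment.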

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@spolitov spolitov added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage labels Sep 18, 2024
@spolitov spolitov self-assigned this Sep 18, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Sep 18, 2024
spolitov added a commit that referenced this issue Sep 26, 2024
Summary:
Currently, the shared memory segment used for pg client communication is created with the page size.
By default, that is just 4KB.
When a response does not fit into the segment, it is transferred via RPC instead.
This fallback increases latency by nearly 10ms per 1000 requests.

During a scan, we fetch data in chunks of 1000 rows.
So nearly all scan read responses do not fit into 4KB, and we fall back to RPC every time.

This diff introduces intermediate shared memory buffers that are larger in size
and can be reused by different postgres connections.
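
The idea of larger, reusable intermediate buffers can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code from the diff: `BigBufferPool`, `choose_transport`, the 1MB buffer size, and the final fallback to RPC when no buffer is free are all hypothetical choices made for illustration.

```python
# Simplified model of a pool of large buffers shared across connections.
# Every name and size below is hypothetical, chosen only for illustration.
from typing import List, Optional, Tuple


class BigBufferPool:
    """A fixed set of large buffers that any connection may borrow."""

    def __init__(self, buffer_size: int, count: int) -> None:
        self.buffer_size = buffer_size
        self._free: List[bytearray] = [bytearray(buffer_size) for _ in range(count)]

    def acquire(self, needed: int) -> Optional[bytearray]:
        """Hand out a free buffer, or None if it is too small or none is free."""
        if needed > self.buffer_size or not self._free:
            return None
        return self._free.pop()

    def release(self, buf: bytearray) -> None:
        """Return a buffer so another connection can reuse it."""
        self._free.append(buf)


def choose_transport(
    response_size: int, pool: BigBufferPool, segment_size: int = 4096
) -> Tuple[str, Optional[bytearray]]:
    """Pick the per-connection segment, a pooled big buffer, or RPC."""
    if response_size <= segment_size:
        return "segment", None
    buf = pool.acquire(response_size)
    if buf is not None:
        return "big_buffer", buf
    return "rpc", None  # assumed last resort in this sketch


if __name__ == "__main__":
    pool = BigBufferPool(buffer_size=1 << 20, count=1)  # 1MB is an assumption
    kind, buf = choose_transport(8000, pool)
    print(kind)
    if buf is not None:
        pool.release(buf)
```

In this model, a response that is too large for the 4KB segment borrows a pooled buffer for the duration of the transfer and returns it afterwards, so a small number of large buffers can serve many connections.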

Read time changes in newly added test (PgSingleTServerTest.ScanOneColumn):
Don't use shared memory at all: 1.17s
Only 4KB segments: 1.35s
With intermediate big buffers: 1.06s
Jira: DB-12886

Test Plan: PgSingleTServerTest.ScanOneColumn

Reviewers: rthallam, esheng

Reviewed By: esheng

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D38149
fizaaluthra pushed a commit that referenced this issue Sep 27, 2024
Summary:
 12b2c40 [#23999] DocDB: Big shared memory segments
 b1e6329 [PLAT-15279] Add gzip compression to core dumps from DB.
 06472d5 [#24050] docdb: Fix re-packing rows after alter table add column with default value
 9009d11 [#23837] YSQL: Temporarily disable some tests with Connection Manager enabled
 11acca7 [#23325][#23326] yugabyted: Support for adding new databases for xCluster replication (Phase 2)
 96703da [PLAT-15465][PLAT-15466] Minor fixes in YNP
 c5aca3b [PLAT-14924][PLAT-12829][PLAT-15446] - ui bugs and improvements
 6e82692 [#23770] [#23797] YSQL: Stabilise some test failures with Connection Manager enabled
 b50bd1b [PLAT-15279] Adjusting the core pattern to create the cores with the core_ prefix for collect cores to catch it
 f692a60 [PLAT-14045] UBI-8 images don't have hostname
 d6a19da [PLAT-15377] Adding a global uncaught exception handler to yugaware
 acbb1ba [PLAT-15225] Verify there is no running master on nodes selected for master replacement
 Excluded: 3e93354 [#23686] YSQL: Build relcache foreign key list from YB catcache

Test Plan: Jenkins: rebase: pg15-cherrypicks

Reviewers: tfoucher, fizaa, telgersma

Differential Revision: https://phorge.dev.yugabyte.com/D38503
spolitov added a commit that referenced this issue Oct 8, 2024
Summary:
Currently, the shared memory segment used for pg client communication is created with the page size.
By default, that is just 4KB.
When a response does not fit into the segment, it is transferred via RPC instead.
This fallback increases latency by nearly 10ms per 1000 requests.

During a scan, we fetch data in chunks of 1000 rows.
So nearly all scan read responses do not fit into 4KB, and we fall back to RPC every time.

This diff introduces intermediate shared memory buffers that are larger in size
and can be reused by different postgres connections.

Read time changes in newly added test (PgSingleTServerTest.ScanOneColumn):
Don't use shared memory at all: 1.17s
Only 4KB segments: 1.35s
With intermediate big buffers: 1.06s
Original commit: 12b2c40/D38149
Jira: DB-12886

Test Plan: PgSingleTServerTest.ScanOneColumn

Reviewers: rthallam, esheng

Reviewed By: esheng

Subscribers: ybase, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D38684