
Return bools from archive/delete_batch #95

Merged
ChuckHend merged 18 commits into tembo-io:main on Sep 3, 2023

Conversation

craigpastro
Contributor

Closes #89.

This is an attempt at resolving #89. I also refactored archive in terms of archive_batch. Please let me know if you want me to revert that part.

It has been quite a while since I have written any Rust. I get the feeling I should be doing more map-ing and and_then-ing and using less mut 🙂

Before

pgmq=# select * from pgmq_archive('poop', array[1, 3, 5]);
 pgmq_archive
--------------
 t
(1 row)

pgmq=# select * from pgmq_archive('poop', array[1, 3, 5]);
 pgmq_archive
--------------
 t
(1 row)

After

pgmq=# select * from pgmq_archive('poop', array[1, 3, 5]);
 pgmq_archive 
--------------
 t
 t
 f
(3 rows)

pgmq=# select * from pgmq_archive('poop', array[1, 3, 5]);
 pgmq_archive 
--------------
 f
 f
 f
(3 rows)
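The refactor of archive in terms of archive_batch can be sketched roughly like this. This is a hypothetical standalone model, not the actual pgrx/SPI-based implementation; the in-memory Vec stands in for the queue table:

```rust
// Hypothetical sketch: expressing the single-message archive in terms
// of the batch version so both share one code path. The real pgmq code
// runs SQL through pgrx/SPI; a Vec<i64> stands in for the queue here.
fn archive_batch(queue: &mut Vec<i64>, msg_ids: &[i64]) -> Vec<bool> {
    // Return one bool per requested id: true if the message existed
    // (and was archived), false otherwise.
    msg_ids
        .iter()
        .map(|id| {
            if let Some(pos) = queue.iter().position(|m| m == id) {
                queue.remove(pos);
                true
            } else {
                false
            }
        })
        .collect()
}

fn archive(queue: &mut Vec<i64>, msg_id: i64) -> bool {
    // A single archive is just a batch of size one.
    archive_batch(queue, &[msg_id])[0]
}

fn main() {
    // Mirrors the After example: two of the three ids exist.
    let mut queue = vec![1, 2, 3];
    assert_eq!(archive_batch(&mut queue, &[1, 3, 5]), vec![true, true, false]);
    // A second call with the same ids finds none of them.
    assert_eq!(archive_batch(&mut queue, &[1, 3, 5]), vec![false, false, false]);
    assert!(archive(&mut queue, 2));
    println!("ok");
}
```

This mirrors the After output above: one row per requested id instead of a single aggregate bool.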

@craigpastro craigpastro marked this pull request as draft August 30, 2023 18:26
@craigpastro craigpastro marked this pull request as ready for review August 30, 2023 18:47
@ChuckHend
Member

ChuckHend commented Aug 30, 2023

First, thank you for contributing!

This is an attempt at resolving #89. I also refactored archive in terms of archive_batch. Please let me know if you want me to revert that part.

I think it's ok to overload the functions so that batch and single operations use the same API. Underneath though, do you know how much perf difference there will be from treating the single operations the same as a batch? e.g. a batch of size 1 vs a single? I have not benchmarked it yet.

@craigpastro
Contributor Author

First, thank you for contributing!

This is an attempt at resolving #89. I also refactored archive in terms of archive_batch. Please let me know if you want me to revert that part.

I think it's ok to overload the functions so that batch and single operations use the same API. Underneath though, do you know how much perf difference there will be from treating the single operations the same as a batch? e.g. a batch of size 1 vs a single? I have not benchmarked it yet.

Excellent question. Let me see if I can run one with pgbench and I'll check back in.

Another thing to try I suppose is to add a quick benchmark test (based on https://doc.rust-lang.org/unstable-book/library-features/test.html) in another PR, and then we can compare them that way too.

@craigpastro
Contributor Author

So I wrote a small script that looks like

select pgmq_create('q');
select * from pgmq_send('q', '{"foo":"bar"}');
select * from pgmq_archive('q', 1);
... (a bunch more)
select pgmq_drop_queue('q');

and ran pgbench with that. The results are below. There doesn't seem to be much of a difference, but I am not an expert on pgbench or Postgres benchmarks. Any suggestions? Thanks!

Current

$ pgbench -h localhost -p 28815 -U craig --client=1 --jobs=1 -t 500 -n --file=pgmq_bench.sql pgmq
pgbench (15.3 (Homebrew))
transaction type: pgmq_bench.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 500
number of transactions actually processed: 500/500
number of failed transactions: 0 (0.000%)
latency average = 8.198 ms
initial connection time = 5.505 ms
tps = 121.983560 (without initial connection time)

This PR

$ pgbench -h localhost -p 28815 -U craig --client=1 --jobs=1 -t 500 -n --file=pgmq_bench.sql pgmq
pgbench (15.3 (Homebrew))
transaction type: pgmq_bench.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 500
number of transactions actually processed: 500/500
number of failed transactions: 0 (0.000%)
latency average = 8.246 ms
initial connection time = 4.762 ms
tps = 121.263713 (without initial connection time)

@craigpastro
Contributor Author

craigpastro commented Aug 31, 2023

There is no #[pg_bench] attribute in pgrx so I don't think something like https://doc.rust-lang.org/unstable-book/library-features/test.html would work. Ah, but maybe an integration test could work. Let me check.

Update: I don't see how it can work with #[tokio::test].

@ChuckHend
Member

The IN is about 0.01 ms slower than the = according to this bench, just comparing the deletes. I'm curious if there are situations that could make that better/worse.

pgbench postgres://user:abc@localhost:28815/pgmq -P 1 -r -T 60 -f in.sql -f eq.sql
transaction type: multiple scripts
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 102134
number of failed transactions: 0 (0.000%)
latency average = 0.587 ms
latency stddev = 0.120 ms
initial connection time = 3.283 ms
tps = 1702.194694 (without initial connection time)
SQL script 1: in.sql
 - weight: 1 (targets 50.0% of total)
 - 51290 transactions (50.2% of total, tps = 854.813929)
 - number of failed transactions: 0 (0.000%)
 - latency average = 0.593 ms
 - latency stddev = 0.146 ms
 - statement latencies in milliseconds and failures:
         0.593           0  delete from bencher where record_id in (select max(record_id) from bencher)
SQL script 2: eq.sql
 - weight: 1 (targets 50.0% of total)
 - 50844 transactions (49.8% of total, tps = 847.380764)
 - number of failed transactions: 0 (0.000%)
 - latency average = 0.582 ms
 - latency stddev = 0.086 ms
 - statement latencies in milliseconds and failures:
         0.582           0  delete from bencher where record_id = (select max(record_id) from bencher)

@craigpastro
Contributor Author

🤔 It's unlikely, but archive_batch also has to go through some string construction and manipulation, which could add a slight latency. Let me try replacing IN with = ANY to avoid the string manipulation (it is also nicer, I think). Could you then run the benchmark again?
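The difference between the two query-building styles can be sketched like this. Neither function is the real pgmq code; the table and column names are just illustrative, and the point is that = ANY($1) keeps the SQL text constant while IN (...) needs per-id string formatting:

```rust
// Hypothetical sketch of the two styles discussed above.
fn in_list_query(ids: &[i64]) -> String {
    // IN (...) requires formatting every id into the SQL text itself.
    let list: Vec<String> = ids.iter().map(|i| i.to_string()).collect();
    format!("DELETE FROM pgmq_q WHERE msg_id IN ({})", list.join(", "))
}

fn any_query() -> &'static str {
    // = ANY($1) takes the whole array as one bound parameter, so the
    // SQL text is constant regardless of batch size.
    "DELETE FROM pgmq_q WHERE msg_id = ANY($1)"
}

fn main() {
    assert_eq!(
        in_list_query(&[1, 3, 5]),
        "DELETE FROM pgmq_q WHERE msg_id IN (1, 3, 5)"
    );
    assert_eq!(any_query(), "DELETE FROM pgmq_q WHERE msg_id = ANY($1)");
    println!("ok");
}
```

Besides skipping the string manipulation, the parameterized form also avoids interpolating values into SQL text at all.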

@craigpastro
Contributor Author

I tried EXPLAIN ANALYZE just to see what happens with = 1 versus = ANY(array[1]). The plan is the same except for the index condition, but I can't see that having much of an impact. It just so happens here that the former was a bit slower, but those times fluctuate quite a bit.

pgmq=# EXPLAIN ANALYZE SELECT * FROM pgmq_q WHERE msg_id = 1;
                                                     QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------
 Index Scan using pgmq_q_pkey on pgmq_q  (cost=0.15..8.17 rows=1 width=60) (actual time=0.019..0.020 rows=0 loops=1)
   Index Cond: (msg_id = 1)
 Planning Time: 0.236 ms
 Execution Time: 0.130 ms
(4 rows)

pgmq=# EXPLAIN ANALYZE SELECT * FROM pgmq_q WHERE msg_id = ANY(array[1]);
                                                     QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------
 Index Scan using pgmq_q_pkey on pgmq_q  (cost=0.15..8.17 rows=1 width=60) (actual time=0.023..0.024 rows=0 loops=1)
   Index Cond: (msg_id = ANY ('{1}'::integer[]))
 Planning Time: 0.224 ms
 Execution Time: 0.058 ms
(4 rows)

What was kind of interesting is that in (...) gets transformed into = ANY(...), or at least that is how I interpret the plan below:

pgmq=# EXPLAIN ANALYZE SELECT * FROM pgmq_q WHERE msg_id in (1,2,3);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on pgmq_q  (cost=4.47..11.58 rows=3 width=60) (actual time=0.009..0.010 rows=0 loops=1)
   Recheck Cond: (msg_id = ANY ('{1,2,3}'::bigint[]))
   ->  Bitmap Index Scan on pgmq_q_pkey  (cost=0.00..4.47 rows=3 width=0) (actual time=0.005..0.006 rows=0 loops=1)
         Index Cond: (msg_id = ANY ('{1,2,3}'::bigint[]))
 Planning Time: 0.459 ms
 Execution Time: 0.112 ms
(6 rows)

@ChuckHend
Member

Will need to bump core/Cargo.toml and ./Cargo.toml versions.

Also add an empty migration script for this version.

@craigpastro
Contributor Author

👋 @ChuckHend. Bumped minor versions and added empty migration scripts. I added 0.21 to 0.22 too since we are now at 0.22. Is that correct? Thanks!

@ChuckHend
Member

👋 @ChuckHend. Bumped minor versions and added empty migration scripts. I added 0.21 to 0.22 too since we are now at 0.22. Is that correct? Thanks!

Yes! Good call. It's easy to forget those. Wish we had CI checks for them.

@ChuckHend
Member

ChuckHend commented Sep 2, 2023

I think the migration script will need some handling in it.

Upgrading an existing install of 0.21.0 to 0.22.0 fails, I think, because we need to drop the existing functions first, since we've changed the return signature.
https://www.postgresql.org/docs/current/sql-createfunction.html
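A migration along those lines would presumably have to drop the old functions before the new definitions are created. This is only a rough sketch; the exact 0.21.0 signatures are assumptions and would need to be checked:

```sql
-- pgmq--0.21.0--0.22.0.sql (sketch only; the function signatures
-- below are assumptions, not verified against 0.21.0)
DROP FUNCTION IF EXISTS pgmq_archive(text, bigint[]);
DROP FUNCTION IF EXISTS pgmq_delete(text, bigint[]);
-- the pgrx-generated CREATE FUNCTION statements for the new
-- set-returning versions would then follow
```

Postgres requires DROP FUNCTION here because CREATE OR REPLACE FUNCTION cannot change a function's return type.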

Install 0.21.0

git checkout main
cargo pgrx run

pgmq=# drop extension pgmq; create extension pgmq; select pgmq_create('test_queue');

Then checkout this PR and update the extension.

gh pr checkout 95
cargo pgrx run

pgmq=# alter extension pgmq update to '0.22.0';
ALTER EXTENSION
pgmq=# select * from  pgmq_delete('test_queue', ARRAY[1::bigint, 2::bigint]);
ERROR:  table-function protocol for value-per-call mode was not followed

That fails, because I think the extension has to be completely dropped when the signature changes. Maybe pgrx can be configured to do this?

However, if we completely drop the extension (which drops all the objects), it works as I'd expect it to.

pgmq=# drop extension pgmq; create extension pgmq; select pgmq_create('test_queue');
DROP EXTENSION
CREATE EXTENSION
 pgmq_create 
-------------
 
(1 row)
pgmq=# select * from  pgmq_delete('test_queue', ARRAY[1::bigint, 2::bigint]);
 pgmq_delete 
-------------
 f
 f
(2 rows)

So I don't think there's anything wrong with the code, just the migration.

@craigpastro
Contributor Author

🤔 I spent a bunch of time trying to get this to work before realizing it doesn't really work like I thought it would. This issue pgcentralfoundation/pgrx#121 seems to indicate that alter extension pgmq update will never work for this.

@@ -157,7 +157,7 @@ async fn test_ext_send_read_delete() {
.expect("failed to delete");
assert!(deleted);

-    // try to delete a message that doesnt exist
+    // try to delete a message that doesn't exist
Member


This is triggering CI to fail on pgmq-rs, but only because it's telling us that the version in pgmq-rs/Cargo.toml already exists on crates.io. We can ignore that CI error.

@ChuckHend ChuckHend merged commit a4bd31c into tembo-io:main Sep 3, 2023
@craigpastro craigpastro deleted the return-bools-from-archive-batch branch September 11, 2023 17:22
Successfully merging this pull request may close these issues.

Return value of the batch values of pgmq_archive and pgmq_delete