pg_sampletolog is a PostgreSQL extension which allows to sample statements and/or transactions to logs. It adds in PostgreSQL (from 9.4 to 11) same kind of statement sampling added in PostgreSQL 12 (currently not released).
pg_sampletolog allows to:
- Log a sample of statements
- Log a sample of transactions
- Log before or after execution (in order to be compatible with pgreplay)
- Log all
DDL
orMOD
statements, same as log_statement - Log statement's queryid if pg_stat_statements is installed
The module can be built using the standard PGXS infrastructure. For this to work, the pg_config
program must be available in your $PATH. Instruction to install follows:
git clone https://github.com/anayrat/pg_sampletolog.git
cd pg_sampletolog
make
make install
Extension must me loaded either:
- In local session with
LOAD 'pg_sampletolog';
order. - In
session_preload_libraries
. To be loaded at connection start.
New settings should be visible in pg_settings
view:
select name from pg_settings where name like 'pg_sampletolog%';
name
----------------------------------------
pg_sampletolog.disable_log_duration
pg_sampletolog.log_before_execution
pg_sampletolog.log_level
pg_sampletolog.log_statement
pg_sampletolog.statement_sample_limit
pg_sampletolog.statement_sample_rate
pg_sampletolog.transaction_sample_rate
(6 rows)
pg_sampletolog is controlled by specifing GUC settings. So you can apply them in your postgresql.conf
or change them inside the session with SET
order.
By default, pg_sampletolog is disabled, you have to change theses settings (default values in parenthesis):
- pg_sampletolog.log_level (
LOG
): Control log severity level. Refer to PostgreSQL documentation for their meaning : Message Severity Levels - pg_sampletolog.statement_sample_rate (
0
): Control fraction of statements to report in log. - pg_sampletolog.transaction_sample_rate (
0
): Control fraction of transactions to report in log. - pg_sampletolog.statement_sample_limit (
-1
): Always log statements whose duration exceed this limit. - pg_sampletolog.log_statement
none
): Allow to log all DDL and/or MOD statements. - pg_sampletolog.log_before_execution (
false
): Control to log statements before or after execution. - pg_sampletolog.disable_log_duration (
false
): Disable duration reporting (mainly used for tests).
Examples:
- Log only 10% of statements:
pg_sampletolog.statement_sample_rate = 0.1
pg_sampelog will log arround 10% of statements. It will do computation for each statement by calling random()
function whose cost is negligible. Message output will look like (depending of your log_line_prefix
) :
2019-01-27 12:50:39.361 CET [27047] LOG: duration: 0.014 ms statement: SELECT 1;
- Log only 10% of transactions:
pg_sampletolog.transaction_sample_rate = 0.1
Same as previous, with the difference that, when a transaction starts, pg_sampletolog will choose if the transaction will be logged. When a transaction is logged, all statements of the transaction are logged.
Message output will look like (depending of your log_line_prefix
) :
Transaction : BEGIN; SELECT 1; SELECT 1; COMMIT;
2019-01-27 12:51:40.562 CET [27069] LOG: duration: 0.008 ms statement: SELECT 1;
2019-01-27 12:51:40.562 CET [27069] LOG: duration: 0.005 ms statement: SELECT 1;
- Do not sample statement if execution exceed 100ms:
pg_sampletolog.statement_sample_limit = '100ms'
It can be useful if you do not want sampling (i.e log all occurences) for long running queries. Note that it is possible only if either pg_sampletolog.statement_sample_rate
or pg_sampletolog.transaction_sample_rate
are > 0.
- Log all DDL statements:
pg_sampletolog.log_statement = 'ddl'
:
pg_sampletolog will log all DDL statements, it is similar to log_statement = ddl
parameter in Postgres. It could be useful if you want a sample of read queries but catch all ddl changes.
2019-01-27 12:53:47.564 CET [27103] LOG: statement: CREATE TABLE t1(c1 int);
- Log all data-modifying statements:
pg_sampletolog.log_statement = 'mod'
:
pg_sampletolog will log all data-modifying statements (including ddl), it is similar to log_statement = mod
parameter in Postgres. It could be useful if you want a sample of read queries but catch all write changes.
2019-01-27 12:59:54.043 CET [27160] LOG: duration: 0.246 ms statement: INSERT INTO t1 VALUES(1);
2019-01-27 13:00:16.468 CET [27160] LOG: duration: 126.851 ms statement: CREATE INDEX ON t1(c1);
- Log before execution:
pg_sampletolog.log_before_execution = on
By default, pg_sampletolog, will log after query is executed. It could be useful if you want to use results with pgreplay.
pg_sampletolog works on primary, but also on standby. If you call an SQL function containing several queries, pg_sampletolog will only log function call.
With a simple test with pgbench and log_min_duration = 0
:
pgbench -c8 -j2 -t 30000 -P5 bench
starting vacuum...end.
progress: 5.0 s, 10618.2 tps, lat 0.750 ms stddev 0.170
progress: 10.0 s, 10690.9 tps, lat 0.746 ms stddev 0.168
progress: 15.0 s, 10596.4 tps, lat 0.752 ms stddev 0.149
progress: 20.0 s, 9547.6 tps, lat 0.836 ms stddev 0.176
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 8
number of threads: 2
number of transactions per client: 30000
number of transactions actually processed: 240000/240000
latency average = 0.774 ms
latency stddev = 0.175 ms
tps = 9836.617690 (including connections establishing)
tps = 9837.334350 (excluding connections establishing)
pgbench -c8 -j2 -t 30000 -P5 bench 6.96s user 12.32s system 78% cpu 24.448 total
wc -l logfile
1680009
Same test with pg_sampletolog.statement_sample_rate = 0.1
wc -l logfile
119859
We got less than 10% lines of first test because pg_sampletolog don't log BEGIN
or COMMIT
orders.
Bonus if you have pg_stat_statements
installed, pg_sampletolog will report queryid:
LOG: duration: 0.032 ms statement: /* queryid = 5892081150081336634 */ UPDATE pgbench_tellers SET tbalance = tbalance + -3337 WHERE tid = 326;
It is useful to get query's parameters from a query you identified in pg_stat_statements' view.
No supported:
- PREPARE statements are not supported.
- Multiple statements in a single query will give this output:
select pg_sleep(1)\; select pg_sleep(1);
pg_sleep
----------
(1 row)
2019-06-07 23:07:24.556 CEST [26519] LOG: duration: 1005.733 ms statement: select pg_sleep(1); select pg_sleep(1);
2019-06-07 23:07:25.558 CEST [26519] LOG: duration: 2006.789 ms statement: select pg_sleep(1); select pg_sleep(1);
Tests are in sql/pg_sampletolog.sql
and results in expected/pg_sampletolog.out
.
Use make installcheck
to run them, you should see:
/tmp/pgbuild/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --bindir='/tmp/pgbuild/bin' --dbname=contrib_regression pg_sampletolog
(using postmaster on Unix socket, port 5432)
============== dropping database "contrib_regression" ==============
DROP DATABASE
============== creating database "contrib_regression" ==============
CREATE DATABASE
ALTER DATABASE
============== running regression test queries ==============
test pg_sampletolog ... ok
=====================
All 1 tests passed.
=====================
- Log statements by their queryid
Adrien NAYRAT
- Gilles Darold
- Julien Rouhaud
pg_sampletolog is free software distributed under the PostgreSQL license.
Copyright (c) 2019, Adrien NAYRAT