diff --git a/filebeat/docs/modules/postgresql.asciidoc b/filebeat/docs/modules/postgresql.asciidoc
index 695a30dffdd7..7483be9ac215 100644
--- a/filebeat/docs/modules/postgresql.asciidoc
+++ b/filebeat/docs/modules/postgresql.asciidoc
@@ -26,6 +26,80 @@ The +{modulename}+ module using `.log` was tested with logs from versions 9.5 on

 The +{modulename}+ module using `.csv` was tested using versions 11 and 13 (distro is not relevant here).

+[float]
+=== Supported log formats
+
+This module can collect any logs from PostgreSQL servers, but to be able to
+better analyze their contents and extract more information, they should be
+formatted in a specific way.
+
+There are some settings to take into account for the log format.
+
+Log lines should be prefixed with the timestamp in milliseconds, the process
+id, the user id and the database name. This is the default in most
+distributions, and is translated to this setting in the configuration file:
+
+["source","sh"]
+----------------------------
+log_line_prefix = '%m [%p] %q%u@%d '
+----------------------------
+
+PostgreSQL server can be configured to log statements and their durations and
+this module is able to collect this information. To be able to correlate each
+duration with its statement, they must be logged in the same line. This
+happens when the following options are used:
+
+["source","sh"]
+----------------------------
+log_duration = 'on'
+log_statement = 'none'
+log_min_duration_statement = 0
+----------------------------
+
+Setting a zero value in `log_min_duration_statement` will log all statements
+executed by a client. You probably want to configure it to a higher value, so it
+logs only slower statements. This value is configured in milliseconds.
+
+When using `log_statement` and `log_duration` together, statements and durations
+are logged in different lines, and {beatname_uc} is not able to correlate both
+values. For this reason, it is recommended to disable `log_statement`.
+
+NOTE: The PostgreSQL module of Metricbeat is also able to collect information
+about all statements executed in the server. You may choose which one is better
+for your needs. An important difference is that the Metricbeat module
+collects aggregated information when the statement is executed several times,
+but cannot know when each statement was executed. This information can be
+obtained from logs.
+
+Other logging options that you may consider enabling are the following:
+
+["source","sh"]
+----------------------------
+log_checkpoints = 'on'
+log_connections = 'on'
+log_disconnections = 'on'
+log_lock_waits = 'on'
+----------------------------
+
+Both `log_connections` and `log_disconnections` can cause a lot of events if you
+don't have persistent connections, so enable with care.
+
+[float]
+=== Using CSV logs
+
+Since the PostgreSQL CSV log file is a well-defined format,
+there is almost no configuration to be done in {beatname_uc}, just the filepath.
+
+On the other hand, it's necessary to configure PostgreSQL to emit `.csv` logs.
+The recommended parameters are:
+
+["source","sh"]
+----------------------------
+logging_collector = 'on'
+log_destination = 'csvlog'
+----------------------------
+
+
 include::../include/configuring-intro.asciidoc[]

 The following example shows how to set paths in the +modules.d/{modulename}.yml+
@@ -69,38 +143,14 @@ The first dashboard is for regular logs.
 [role="screenshot"]
 image::./images/filebeat-postgresql-overview.png[]

-The second one shows the slowlogs of PostgreSQL.
+The second one shows the slowlogs of PostgreSQL. If `log_min_duration_statement`
+is not used, this dashboard will show incomplete or no data.
 [role="screenshot"]
 image::./images/filebeat-postgresql-slowlog-overview.png[]

 :has-dashboards!:

-=== Using CSV logs
-
-Since the PostgreSQL CSV log file is a well-defined format,
-there is almost no configuration to be done in filebeat, just the filepath
-
-On the other hand, it's necessary to configure postgresql to emit `.csv` logs.
-The recommended parameters are:
-
-```
-logging_collector = 'on';
-log_destination = 'csvlog';
-log_statement = 'none';
-log_checkpoints = on;
-log_connections = on;
-log_disconnections = on;
-log_lock_waits = on;
-log_min_duration_statement = 0;
-```
-
-In busy servers, `log_min_duration_statement` can cause contention, so you can assign
-a value greater than 0.
-
-Both `log_connections` and `log_disconnections` can cause a lot of events if you don't have
-persistent connections, so enable with care.
-
 :fileset_ex!:

 :modulename!:
diff --git a/filebeat/module/postgresql/_meta/docs.asciidoc b/filebeat/module/postgresql/_meta/docs.asciidoc
index 840a15ccd823..1d27610bd8f0 100644
--- a/filebeat/module/postgresql/_meta/docs.asciidoc
+++ b/filebeat/module/postgresql/_meta/docs.asciidoc
@@ -21,6 +21,80 @@ The +{modulename}+ module using `.log` was tested with logs from versions 9.5 on

 The +{modulename}+ module using `.csv` was tested using versions 11 and 13 (distro is not relevant here).

+[float]
+=== Supported log formats
+
+This module can collect any logs from PostgreSQL servers, but to be able to
+better analyze their contents and extract more information, they should be
+formatted in a specific way.
+
+There are some settings to take into account for the log format.
+
+Log lines should be prefixed with the timestamp in milliseconds, the process
+id, the user id and the database name. This is the default in most
+distributions, and is translated to this setting in the configuration file:
+
+["source","sh"]
+----------------------------
+log_line_prefix = '%m [%p] %q%u@%d '
+----------------------------
+
+PostgreSQL server can be configured to log statements and their durations and
+this module is able to collect this information. To be able to correlate each
+duration with its statement, they must be logged in the same line. This
+happens when the following options are used:
+
+["source","sh"]
+----------------------------
+log_duration = 'on'
+log_statement = 'none'
+log_min_duration_statement = 0
+----------------------------
+
+Setting a zero value in `log_min_duration_statement` will log all statements
+executed by a client. You probably want to configure it to a higher value, so it
+logs only slower statements. This value is configured in milliseconds.
+
+When using `log_statement` and `log_duration` together, statements and durations
+are logged in different lines, and {beatname_uc} is not able to correlate both
+values. For this reason, it is recommended to disable `log_statement`.
+
+NOTE: The PostgreSQL module of Metricbeat is also able to collect information
+about all statements executed in the server. You may choose which one is better
+for your needs. An important difference is that the Metricbeat module
+collects aggregated information when the statement is executed several times,
+but cannot know when each statement was executed. This information can be
+obtained from logs.
+
+Other logging options that you may consider enabling are the following:
+
+["source","sh"]
+----------------------------
+log_checkpoints = 'on'
+log_connections = 'on'
+log_disconnections = 'on'
+log_lock_waits = 'on'
+----------------------------
+
+Both `log_connections` and `log_disconnections` can cause a lot of events if you
+don't have persistent connections, so enable with care.
+
+[float]
+=== Using CSV logs
+
+Since the PostgreSQL CSV log file is a well-defined format,
+there is almost no configuration to be done in {beatname_uc}, just the filepath.
+
+On the other hand, it's necessary to configure PostgreSQL to emit `.csv` logs.
+The recommended parameters are:
+
+["source","sh"]
+----------------------------
+logging_collector = 'on'
+log_destination = 'csvlog'
+----------------------------
+
+
 include::../include/configuring-intro.asciidoc[]

 The following example shows how to set paths in the +modules.d/{modulename}.yml+
@@ -64,38 +138,14 @@ The first dashboard is for regular logs.
 [role="screenshot"]
 image::./images/filebeat-postgresql-overview.png[]

-The second one shows the slowlogs of PostgreSQL.
+The second one shows the slowlogs of PostgreSQL. If `log_min_duration_statement`
+is not used, this dashboard will show incomplete or no data.

 [role="screenshot"]
 image::./images/filebeat-postgresql-slowlog-overview.png[]

 :has-dashboards!:

-=== Using CSV logs
-
-Since the PostgreSQL CSV log file is a well-defined format,
-there is almost no configuration to be done in filebeat, just the filepath
-
-On the other hand, it's necessary to configure postgresql to emit `.csv` logs.
-The recommended parameters are:
-
-```
-logging_collector = 'on';
-log_destination = 'csvlog';
-log_statement = 'none';
-log_checkpoints = on;
-log_connections = on;
-log_disconnections = on;
-log_lock_waits = on;
-log_min_duration_statement = 0;
-```
-
-In busy servers, `log_min_duration_statement` can cause contention, so you can assign
-a value greater than 0.
-
-Both `log_connections` and `log_disconnections` can cause a lot of events if you don't have
-persistent connections, so enable with care.
-
 :fileset_ex!:

 :modulename!:
diff --git a/metricbeat/docs/modules/postgresql.asciidoc b/metricbeat/docs/modules/postgresql.asciidoc
index ef475199d6a7..995e6854f68a 100644
--- a/metricbeat/docs/modules/postgresql.asciidoc
+++ b/metricbeat/docs/modules/postgresql.asciidoc
@@ -86,6 +86,10 @@ metricbeat.modules:
     # Stats about every PostgreSQL process
     - activity

+    # Stats about every statement executed in the server. It requires the
+    # `pg_stat_statements` library to be configured in the server.
+    #- statement
+
   period: 10s

   # The host must be passed as PostgreSQL URL. Example:
diff --git a/metricbeat/metricbeat.reference.yml b/metricbeat/metricbeat.reference.yml
index 685dac864523..dee8b504d8fa 100644
--- a/metricbeat/metricbeat.reference.yml
+++ b/metricbeat/metricbeat.reference.yml
@@ -734,6 +734,10 @@ metricbeat.modules:
     # Stats about every PostgreSQL process
     - activity

+    # Stats about every statement executed in the server. It requires the
+    # `pg_stat_statements` library to be configured in the server.
+    #- statement
+
   period: 10s

   # The host must be passed as PostgreSQL URL. Example:
diff --git a/metricbeat/module/postgresql/_meta/config.reference.yml b/metricbeat/module/postgresql/_meta/config.reference.yml
index f27874eee36a..3b4ed4579d11 100644
--- a/metricbeat/module/postgresql/_meta/config.reference.yml
+++ b/metricbeat/module/postgresql/_meta/config.reference.yml
@@ -10,6 +10,10 @@
     # Stats about every PostgreSQL process
     - activity

+    # Stats about every statement executed in the server. It requires the
+    # `pg_stat_statements` library to be configured in the server.
+    #- statement
+
   period: 10s

   # The host must be passed as PostgreSQL URL. Example:
diff --git a/metricbeat/module/postgresql/statement/_meta/docs.asciidoc b/metricbeat/module/postgresql/statement/_meta/docs.asciidoc
index 6c188dce2d99..20f295c11707 100644
--- a/metricbeat/module/postgresql/statement/_meta/docs.asciidoc
+++ b/metricbeat/module/postgresql/statement/_meta/docs.asciidoc
@@ -1 +1,41 @@
 This is the `statement` metricset of the PostgreSQL module.
+
+This module collects information from the `pg_stat_statements` view, which keeps
+track of planning and execution statistics of all SQL statements executed by
+the server.
+
+`pg_stat_statements` is provided by an additional module in PostgreSQL. This
+module requires additional shared memory, and is disabled by default.
+
+You can enable it by adding this module to the configuration as a shared
+preloaded library.
+
+["source"]
+-------------------------------------------
+shared_preload_libraries = 'pg_stat_statements'
+pg_stat_statements.max = 10000
+pg_stat_statements.track = all
+-------------------------------------------
+
+NOTE: Preloading this library in your server will increase the memory usage of
+your PostgreSQL server. Use it with care.
+
+Once the server is started with this module, it starts collecting statistics
+about all statements executed. To make these statistics available in the
+`pg_stat_statements` view, the following statement needs to be executed in the
+server:
+
+["source","sql"]
+-------------------------------------------
+CREATE EXTENSION pg_stat_statements;
+-------------------------------------------
+
+You can read more about the available options for this module in the
+https://www.postgresql.org/docs/13/pgstatstatements.html[official documentation].
+
+NOTE: The PostgreSQL module of Filebeat is also able to collect information
+about statements executed in the server from its logs. You may choose which one
+is better for your needs. An important difference is that the Metricbeat
+module collects aggregated information when the statement is executed several
+times, but cannot know when each statement was executed. This information can be
+obtained from logs.
diff --git a/x-pack/metricbeat/metricbeat.reference.yml b/x-pack/metricbeat/metricbeat.reference.yml
index be76277068f6..aa4a071c9164 100644
--- a/x-pack/metricbeat/metricbeat.reference.yml
+++ b/x-pack/metricbeat/metricbeat.reference.yml
@@ -1120,6 +1120,10 @@ metricbeat.modules:
     # Stats about every PostgreSQL process
     - activity

+    # Stats about every statement executed in the server. It requires the
+    # `pg_stat_statements` library to be configured in the server.
+    #- statement
+
   period: 10s

   # The host must be passed as PostgreSQL URL. Example: