Haproxy filebeat module tcp and default formats #8637

198 changes: 143 additions & 55 deletions filebeat/docs/fields.asciidoc

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion filebeat/include/fields.go

Large diffs are not rendered by default.

101 changes: 57 additions & 44 deletions filebeat/module/haproxy/_meta/fields.yml
@@ -7,121 +7,134 @@
type: group
description: >
fields:

- name: destination
description: Destination information
type: group
fields:
- name: port
description: Port of the destination host
type: long

- name: ip
description: IP of the destination host

- name: process_name
description: Name of the process

- name: pid
description: Process ID
description: PID of the process
type: long

- name: client_ip
description: client_ip is the IP address of the client which initiated the TCP connection to haproxy.
- name: client
description: Information about the client doing the request
type: group
fields:
- name: ip
description: IP address of the client which initiated the TCP connection to haproxy.

- name: client_port
description: client_port is the TCP port of the client which initiated the connection.
type: long
- name: port
description: TCP port of the client which initiated the connection.
type: long

- name: frontend_name
description: frontend_name is the name of the frontend (or listener) which received and processed the connection.
description: Name of the frontend (or listener) which received and processed the connection.

- name: backend_name
description: backend_name is the name of the backend (or listener) which was selected to manage the connection to the server.
description: Name of the backend (or listener) which was selected to manage the connection to the server.

- name: server_name
description: server_name is the name of the last server to which the connection was sent.
description: Name of the last server to which the connection was sent.

- name: time_client_req
description: time_client_req is the total time in milliseconds spent waiting for a full HTTP request from the client (not counting body) after the first byte was received.
- name: total_waiting_time_ms
description: Total time in milliseconds spent waiting in the various queues
type: long

- name: time_queue
description: time_queue is the total time in milliseconds spent waiting in the various queues.
- name: connection_wait_time_ms
description: Total time in milliseconds spent waiting for the connection to establish to the final server
type: long

- name: time_backend_connect
description: time_backend_connect is the total time in milliseconds spent waiting for the connection to establish to the final server, including retries.
- name: bytes_read
description: Total number of bytes transmitted to the client when the log is emitted.
type: long

- name: time_server_response
description: time_server_response is the total time in milliseconds spent waiting for the server to send a full HTTP response, not counting data.
- name: time_queue
description: Total time in milliseconds spent waiting in the various queues.
type: long

- name: time_duration
description: time_duration is the time the request remained active in haproxy, which is the total time in milliseconds elapsed between the first byte of the request was received and the last byte of response was sent.
- name: time_backend_connect
description: Total time in milliseconds spent waiting for the connection to establish to the final server, including retries.
type: long

- name: server_queue
description: server_queue is the total number of requests which were processed before this one in the server queue.
description: Total number of requests which were processed before this one in the server queue.
type: long

- name: backend_queue
description: backend_queue is the total number of requests which were processed before this one in the backend's global queue.
description: Total number of requests which were processed before this one in the backend's global queue.
type: long

- name: bind_name
description: bind_name is the name of the listening address which received the connection.
description: Name of the listening address which received the connection.

- name: error_message
description: error_message is the error message logged by HAProxy in case of error.
description: Error message logged by HAProxy in case of error.
type: text

- name: source
type: text
description: The HAProxy source of the log

- name: geoip
type: group
description: >
Contains GeoIP information gathered based on the client_ip field.
geoip contains information gathered based on the client.ip field.
Only present if the GeoIP Elasticsearch plugin is available and
used.
fields:
- name: continent_name
type: keyword
description: >
The name of the continent.
description: Name of the continent.
- name: country_iso_code
type: keyword
description: >
Country ISO code.
description: Country ISO code.
- name: location
type: geo_point
description: >
The longitude and latitude.
description: Represents a geopoint with the longitude and latitude.
- name: region_name
type: keyword
description: >
The region name.
description: Name of the region
- name: city_name
type: keyword
description: >
The city name.
description: City name.
- name: region_iso_code
type: keyword
description: >
Region ISO code.
description: ISO code of the region

- name: termination_state
description: termination_state is the condition the session was in when the session ended.
description: Condition the session was in when the session ended.

- name: connections
description: Contains various counts of connections active in the process.
type: group
fields:
- name: active
description: active is the total number of concurrent connections on the process when the session was logged.
description: Total number of concurrent connections on the process when the session was logged.
type: long

- name: frontend
description: frontend is the total number of concurrent connections on the frontend when the session was logged.
description: Total number of concurrent connections on the frontend when the session was logged.
type: long

- name: backend
description: backend is the total number of concurrent connections handled by the backend when the session was logged.
description: Total number of concurrent connections handled by the backend when the session was logged.
type: long

- name: server
description: server is the total number of concurrent connections still active on the server when the session was logged.
description: Total number of concurrent connections still active on the server when the session was logged.
type: long

- name: retries
description: retries is the number of connection retries experienced by this session when trying to connect to the server.
description: Number of connection retries experienced by this session when trying to connect to the server.
type: long
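
The regrouping in this file (for example `client_ip` becoming `ip` under a `client` group) changes the dotted field names that end up in events. A minimal Python sketch of that flattening, assuming a trimmed-down, hypothetical mirror of the nested layout; the `FIELDS` dict and the `flatten` helper are illustrative only and are not part of Beats, and the `keyword` fallback for untyped fields is an assumption about the default type:

```python
# Hypothetical, trimmed-down mirror of the nested layout in fields.yml.
FIELDS = {
    "name": "haproxy",
    "type": "group",
    "fields": [
        {"name": "client", "type": "group", "fields": [
            {"name": "ip"},
            {"name": "port", "type": "long"},
        ]},
        {"name": "destination", "type": "group", "fields": [
            {"name": "ip"},
            {"name": "port", "type": "long"},
        ]},
        {"name": "frontend_name"},
        {"name": "bytes_read", "type": "long"},
    ],
}


def flatten(field, prefix=""):
    """Yield (dotted_name, type) pairs for every non-group field."""
    path = f"{prefix}.{field['name']}" if prefix else field["name"]
    if field.get("type") == "group":
        for child in field.get("fields", []):
            yield from flatten(child, path)
    else:
        # Assumption: fields without an explicit type are treated as keyword.
        yield path, field.get("type", "keyword")


flat = dict(flatten(FIELDS))
for name, kind in flat.items():
    print(f"{name}: {kind}")
```

The dotted names produced here (`haproxy.client.ip`, `haproxy.client.port`, and so on) are the ones the updated grok patterns in this PR capture into.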



47 changes: 36 additions & 11 deletions filebeat/module/haproxy/log/_meta/fields.yml
@@ -2,25 +2,22 @@
description: Please add description
type: group
fields:

- name: response
description: Fields related to the HTTP response
type: group
fields:
- name: status_code
description: status_code is the HTTP status code returned to the client.
type: long

- name: bytes_read
description: bytes_read is the total number of bytes transmitted to the client when the log is emitted.
description: HTTP status code returned to the client.
type: long

- name: captured_cookie
description: >
captured_cookie is an optional "name=value" entry indicating that the client had this cookie in the response.
Optional "name=value" entry indicating that the client had this cookie in the response.

- name: captured_headers
description: >
captured_response_headers is a list of headers captured in the response due to the presence of the "capture response header" statement in the frontend.
List of headers captured in the response due to the presence of the "capture response header" statement in the frontend.
type: text

- name: request
@@ -29,16 +26,44 @@
fields:
- name: captured_cookie
description: >
captured_cookie is an optional "name=value" entry indicating that the server has returned a cookie with its request.
Optional "name=value" entry indicating that the server has returned a cookie with its request.

- name: captured_headers
description: >
captured_request_headers is a list of headers captured in the request due to the presence of the "capture request header" statement in the frontend.
List of headers captured in the request due to the presence of the "capture request header" statement in the frontend.
type: text

- name: raw_request_line
description: raw_request_line is the complete HTTP request line, including the method, request and HTTP version string.
description: Complete HTTP request line, including the method, request and HTTP version string.
type: text


- name: time_active_ms
description: Time the request remained active in haproxy, which is the total time in milliseconds elapsed between the first byte of the request was received and the last byte of response was sent.
type: long

- name: time_wait_without_data_ms
description: Total time in milliseconds spent waiting for the server to send a full HTTP response, not counting data.
type: long

- name: time_wait_ms
description: Total time in milliseconds spent waiting for a full HTTP request from the client (not counting body) after the first byte was received.
type: long

- name: default
description: Default HAProxy log format https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#8.2.1
type: group
fields:
- name: mode
type: text
description: Mode in which the frontend is operating (TCP or HTTP)
@jsoriano (Member), Oct 19, 2018

I think this can also be common to multiple log formats. Remove it from the default namespace too; it can be set to tcp for the tcp log format.
A possible name for this could be haproxy.protocol, or network.protocol as in ECS.

Contributor

Let's not use network.protocol yet from ECS as it still might change.

Contributor Author

As far as I can read in the HAProxy documentation, mode is specific to the default format (https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#8.2.1). In the TCP format there is no "mode" at all (mode is always TCP).

About the naming, I thought we had already agreed to leave ECS aside in this PR and focus on keeping the naming used by the service, which is the most familiar naming for an HAProxy user of this module.

@jsoriano (Member), Oct 19, 2018

Yes, mode seems to be specified only in the default format. It wouldn't make sense in protocol-specific formats, but it can still be useful in events so users can differentiate logs from tcp and http connections.

We can set haproxy.protocol to the value of mode for the default formats, and for protocol-specific formats set it to the specific protocol depending on the matching pattern. I'm not sure how to set a field depending on the matching pattern, so we can leave that for a future change. In any case I'd use a common name like haproxy.protocol instead of haproxy.default.mode, thinking of future uses of this field.

Contributor Author

I think we are spending too much time and energy on a field that only appears (and gets parsed) in a deprecated log format.

I feel that writing complex conditional parsing for non-straightforward (invented) fields, which don't usually appear in HAProxy, to get something that we haven't measured and that hasn't come from a user requirement, is over-engineering things a bit.

WDYT?

Member

Yes, as I said, we can leave adding this field to other formats for future changes. I was just asking about using a generic name like haproxy.protocol (instead of haproxy.default.mode): no over-engineering, just a more future-proof name. WDYT about this?

@jsoriano (Member), Oct 21, 2018

Or haproxy.mode if you want to keep haproxy naming, but without the default namespacing.


- name: tcp
description: TCP log format
type: group
fields:
- name: processing_time_ms
type: long
description: Total time in milliseconds elapsed between the accept and the last close
- name: connection_waiting_time_ms
type: long
description: Total time in milliseconds spent waiting for the connection to establish to the final server
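
To make the TCP-format fields concrete, here is a rough Python sketch of how a tcplog-style line maps onto the names used in this PR. The regular expression only approximates the TCP grok pattern from pipeline.json (it is not grok), and the sample line is hypothetical rather than taken from the test files:

```python
import re

# Approximation of the TCP grok pattern; plain regex, not grok semantics.
TCP_LINE = re.compile(
    r"^\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} "
    r"(?P<source>\S+) "
    r"(?:(?P<process_name>[^\s\[]+)\[(?P<pid>\d+)\]: )?"
    r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<client_port>\d+) "
    r"\[(?P<request_date>\S+)\] "
    r"(?P<frontend_name>\S+) (?P<backend_name>[^\s/]+)/(?P<server_name>\S+) "
    r"(?P<total_waiting_time_ms>-?\d+)/(?P<connection_wait_time_ms>-?\d+)"
    r"/(?P<processing_time_ms>-?\d+) "
    r"(?P<bytes_read>\d+) (?P<termination_state>\S+) "
    r"(?P<active>\d+)/(?P<frontend>\d+)/(?P<backend>\d+)"
    r"/(?P<server>\d+)/(?P<retries>\d+) "
    r"(?P<server_queue>\d+)/(?P<backend_queue>\d+)$"
)

# Groups that the pipeline converts with the ":int" suffix.
INT_GROUPS = (
    "pid", "client_port", "total_waiting_time_ms", "connection_wait_time_ms",
    "processing_time_ms", "bytes_read", "active", "frontend", "backend",
    "server", "retries", "server_queue", "backend_queue",
)


def parse_tcp_line(line):
    """Return a dict of captured values, or None if the line does not match."""
    match = TCP_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in INT_GROUPS:
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


# Hypothetical sample line, shaped like HAProxy's tcplog output behind syslog.
tcp_sample = (
    "Sep 20 15:44:23 127.0.0.1 haproxy[25457]: 127.0.0.1:40962 "
    "[20/Sep/2018:15:44:23.285] main app/<NOSRV> -1/-1/1 212 SC 0/0/0/0/0 0/0"
)
tcp_fields = parse_tcp_line(tcp_sample)
print(tcp_fields)
```

In this sketch the three slash-separated timers land in `total_waiting_time_ms`, `connection_wait_time_ms` and `processing_time_ms`, which is the order the grok pattern assigns them.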
121 changes: 70 additions & 51 deletions filebeat/module/haproxy/log/ingest/pipeline.json
@@ -1,52 +1,71 @@
{
"description": "Pipeline for parsing HAProxy http logs in their default format. Requires the geoip plugin.",
"processors": [{
"grok": {
"field": "message",
"patterns": [
"(%{NOTSPACE:haproxy.process_name}\\[%{NUMBER:haproxy.pid:int}\\]: )?%{IP:haproxy.client_ip}:%{NUMBER:haproxy.client_port:int} \\[%{NOTSPACE:haproxy.http.request_date}\\] %{NOTSPACE:haproxy.frontend_name} %{NOTSPACE:haproxy.backend_name}/%{NOTSPACE:haproxy.server_name} %{NUMBER:haproxy.time_client_req:int}/%{NUMBER:haproxy.time_queue:int}/%{NUMBER:haproxy.time_backend_connect:int}/%{NUMBER:haproxy.time_server_response:int}/%{NUMBER:haproxy.time_duration:int} %{NUMBER:haproxy.http.response.status_code:int} %{NUMBER:haproxy.http.response.bytes_read:int} %{NOTSPACE:haproxy.http.request.captured_cookie} %{NOTSPACE:haproxy.http.response.captured_cookie} %{NOTSPACE:haproxy.termination_state} %{NUMBER:haproxy.connections.active:int}/%{NUMBER:haproxy.connections.frontend:int}/%{NUMBER:haproxy.connections.backend:int}/%{NUMBER:haproxy.connections.server:int}/%{NUMBER:haproxy.connections.retries:int} %{NUMBER:haproxy.server_queue:int}/%{NUMBER:haproxy.backend_queue:int} \\{%{DATA:haproxy.http.request.captured_headers}\\} \\{%{DATA:haproxy.http.response.captured_headers}\\} \"%{GREEDYDATA:haproxy.http.request.raw_request_line}\"",
"(%{NOTSPACE:haproxy.process_name}\\[%{NUMBER:haproxy.pid:int}\\]: )?%{IP:haproxy.client_ip}:%{NUMBER:haproxy.client_port:int} \\[%{NOTSPACE:haproxy.http.request_date}\\] %{NOTSPACE:haproxy.frontend_name}/%{NOTSPACE:haproxy.bind_name} %{GREEDYDATA:haproxy.error_message}"
],
"ignore_missing": false
}
},
{
"date": {
"field": "haproxy.http.request_date",
"target_field": "@timestamp",
"formats": ["dd/MMM/yyyy:HH:mm:ss.SSS"]
}
},
{
"remove": {
"field": "haproxy.http.request_date"
}
},
{
"geoip": {
"field": "haproxy.client_ip",
"target_field": "haproxy.geoip"
}
},
{
"split": {
"field": "haproxy.http.request.captured_headers",
"separator": "\\|",
"ignore_failure": true
}
},
{
"split": {
"field": "haproxy.http.response.captured_headers",
"separator": "\\|",
"ignore_failure": true
}
}
],
"on_failure" : [{
"set" : {
"field" : "error.message",
"value" : "{{ _ingest.on_failure_message }}"
}
}]
}
"description": "Pipeline for parsing HAProxy http logs in their default format. Requires the geoip plugin.",
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"%{HAPROXY_DATE:haproxy.request_date} %{IPORHOST:haproxy.source} %{PROG:haproxy.process_name}(?:\\[%{POSINT:haproxy.pid}\\])?: %{GREEDYDATA} %{IPORHOST:haproxy.client.ip}:%{POSINT:haproxy.client.port} %{WORD} %{IPORHOST:haproxy.destination.ip}:%{POSINT:haproxy.destination.port} \\(%{WORD:haproxy.frontend_name}/%{WORD:haproxy.default.mode}\\)",

"(%{NOTSPACE:haproxy.process_name}\\[%{NUMBER:haproxy.pid:int}\\]: )?%{IP:haproxy.client.ip}:%{NUMBER:haproxy.client.port:int} \\[%{NOTSPACE:haproxy.request_date}\\] %{NOTSPACE:haproxy.frontend_name} %{NOTSPACE:haproxy.backend_name}/%{NOTSPACE:haproxy.server_name} %{NUMBER:haproxy.http.request.time_wait_ms:int}/%{NUMBER:haproxy.total_waiting_time_ms:int}/%{NUMBER:haproxy.connection_wait_time_ms:int}/%{NUMBER:haproxy.http.request.time_wait_without_data_ms:int}/%{NUMBER:haproxy.http.request.time_active_ms:int} %{NUMBER:haproxy.http.response.status_code:int} %{NUMBER:haproxy.bytes_read:int} %{NOTSPACE:haproxy.http.request.captured_cookie} %{NOTSPACE:haproxy.http.response.captured_cookie} %{NOTSPACE:haproxy.termination_state} %{NUMBER:haproxy.connections.active:int}/%{NUMBER:haproxy.connections.frontend:int}/%{NUMBER:haproxy.connections.backend:int}/%{NUMBER:haproxy.connections.server:int}/%{NUMBER:haproxy.connections.retries:int} %{NUMBER:haproxy.server_queue:int}/%{NUMBER:haproxy.backend_queue:int} \\{%{DATA:haproxy.http.request.captured_headers}\\} \\{%{DATA:haproxy.http.response.captured_headers}\\} \"%{GREEDYDATA:haproxy.http.request.raw_request_line}\"",

"(%{NOTSPACE:haproxy.process_name}\\[%{NUMBER:haproxy.pid:int}\\]: )?%{IP:haproxy.client.ip}:%{NUMBER:haproxy.client.port:int} \\[%{NOTSPACE:haproxy.request_date}\\] %{NOTSPACE:haproxy.frontend_name}/%{NOTSPACE:haproxy.bind_name} %{GREEDYDATA:haproxy.error_message}",

"%{HAPROXY_DATE} %{IPORHOST:haproxy.source} (%{NOTSPACE:haproxy.process_name}\\[%{NUMBER:haproxy.pid:int}\\]: )?%{IP:haproxy.client.ip}:%{NUMBER:haproxy.client.port:int} \\[%{NOTSPACE:haproxy.request_date}\\] %{NOTSPACE:haproxy.frontend_name} %{NOTSPACE:haproxy.backend_name}/%{NOTSPACE:haproxy.server_name} %{NUMBER:haproxy.total_waiting_time_ms:int}/%{NUMBER:haproxy.connection_wait_time_ms:int}/%{NUMBER:haproxy.tcp.processing_time_ms:int} %{NUMBER:haproxy.bytes_read:int} %{NOTSPACE:haproxy.termination_state} %{NUMBER:haproxy.connections.active:int}/%{NUMBER:haproxy.connections.frontend:int}/%{NUMBER:haproxy.connections.backend:int}/%{NUMBER:haproxy.connections.server:int}/%{NUMBER:haproxy.connections.retries:int} %{NUMBER:haproxy.server_queue:int}/%{NUMBER:haproxy.backend_queue:int}"
],
"ignore_missing": false,
"pattern_definitions": {
"HAPROXY_DATE": "(%{MONTHDAY}[/-]%{MONTH}[/-]%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND})|%{SYSLOGTIMESTAMP}"
}
}
},
{
"date": {
"field": "haproxy.request_date",
"target_field": "@timestamp",
"formats": [
"dd/MMM/yyyy:HH:mm:ss.SSS",
"MMM dd HH:mm:ss"
]
}
},
{
"remove": {
"field": "haproxy.request_date"
}
},
{
"remove": {
"field": "message"
}
},
{
"geoip": {
"field": "haproxy.client.ip",
"target_field": "haproxy.geoip"
}
},
{
"split": {
"field": "haproxy.http.request.captured_headers",
"separator": "\\|",
"ignore_failure": true
}
},
{
"split": {
"field": "haproxy.http.response.captured_headers",
"separator": "\\|",
"ignore_failure": true
}
}
],
"on_failure": [
{
"set": {
"field": "error.message",
"value": "{{ _ingest.on_failure_message }}"
}
}
]
}
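
The `date` processor in the new pipeline now accepts two layouts, and the `split` processors break captured headers on `|`. A small Python sketch of the same two steps, under stated assumptions: the helper names are hypothetical, `strptime` directives stand in for the Joda-style formats, and the year for the syslog layout is supplied by the caller because that layout carries none (how the ingest node fills in the year is not covered here):

```python
from datetime import datetime


def parse_request_date(text, default_year):
    """Try the HAProxy accept-date layout first, then the syslog layout."""
    try:
        # Counterpart of "dd/MMM/yyyy:HH:mm:ss.SSS".
        return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S.%f")
    except ValueError:
        pass
    # Counterpart of "MMM dd HH:mm:ss"; the year is an explicit assumption.
    return datetime.strptime(f"{default_year} {text}", "%Y %b %d %H:%M:%S")


def split_captured_headers(raw):
    """Split a captured-headers string on the literal pipe character."""
    return raw.split("|")


print(parse_request_date("20/Sep/2018:15:44:23.285", 2018))
print(parse_request_date("Sep 20 15:42:59", 2018))
print(split_captured_headers("docs.example.internal|curl/7.61.0"))
```

Month abbreviations are parsed with `%b`, which depends on the process locale; the sketch assumes an English or C locale.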
1 change: 1 addition & 0 deletions filebeat/module/haproxy/log/test/default.log
@@ -0,0 +1 @@
Sep 20 15:42:59 1.2.3.4 haproxy[24551]: Connect from 1.2.3.4:40780 to 1.2.3.4:5000 (main/HTTP)
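
For reference, a rough Python sketch of what the first grok pattern is expected to extract from this test line. The regular expression is an approximation written for illustration, not the grok pattern itself, and the integer conversion of `pid` and the ports is a choice made here (the grok pattern captures them with `POSINT` and no type suffix):

```python
import re

# Rough stand-in for the "Connect from" grok pattern; not exact grok semantics.
CONNECT_LINE = re.compile(
    r"^(?P<request_date>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<source>\S+) "
    r"(?P<process_name>[^\s\[\]:]+)(?:\[(?P<pid>\d+)\])?: "
    r".*? "
    r"(?P<client_ip>\S+):(?P<client_port>\d+) "
    r"\w+ "
    r"(?P<destination_ip>\S+):(?P<destination_port>\d+) "
    r"\((?P<frontend_name>\w+)/(?P<mode>\w+)\)$"
)


def parse_connect_line(line):
    """Map a default-format line onto the field names used in this PR."""
    match = CONNECT_LINE.match(line)
    if match is None:
        return None
    g = match.groupdict()
    return {
        "haproxy.request_date": g["request_date"],
        "haproxy.source": g["source"],
        "haproxy.process_name": g["process_name"],
        "haproxy.pid": int(g["pid"]) if g["pid"] else None,
        "haproxy.client.ip": g["client_ip"],
        "haproxy.client.port": int(g["client_port"]),
        "haproxy.destination.ip": g["destination_ip"],
        "haproxy.destination.port": int(g["destination_port"]),
        "haproxy.frontend_name": g["frontend_name"],
        "haproxy.default.mode": g["mode"],
    }


sample = (
    "Sep 20 15:42:59 1.2.3.4 haproxy[24551]: "
    "Connect from 1.2.3.4:40780 to 1.2.3.4:5000 (main/HTTP)"
)
event = parse_connect_line(sample)
print(event)
```

The `haproxy.default.mode` key follows the name in this revision of the PR; the review thread above discusses moving it out of the `default` namespace.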