Add new mode to multiline reader to aggregate constant number of lines #18352

Merged
kvch merged 15 commits into elastic:master from feature-libbeat-counter-multiline on Jun 17, 2020

Conversation

kvch
Contributor

@kvch kvch commented May 7, 2020

What does this PR do?

This PR adds a new mode to the multiline reader of Libbeat (exposed in Filebeat). The new mode lets users aggregate a configured number of lines into a single event.

Example configuration to aggregate 5 lines:

multiline.type: count
multiline.count_lines: 5

This PR also adds a new configuration option skip_newline. If set, Filebeat does not add a newline when two events are concatenated.
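
To make the intended behavior concrete, here is a minimal Go sketch of count-based aggregation. It is not the code from this PR: `aggregateByCount` and its parameters are hypothetical names, and emitting a shorter trailing group when the input runs out is an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// aggregateByCount groups lines into events of countLines lines each.
// Assumption: a trailing group shorter than countLines is emitted as-is.
func aggregateByCount(lines []string, countLines int, skipNewline bool) []string {
	if countLines <= 0 {
		return nil // nothing sensible to do without a positive count
	}
	sep := "\n"
	if skipNewline {
		sep = "" // mirrors the idea of skip_newline: join without a newline
	}
	var events []string
	for start := 0; start < len(lines); start += countLines {
		end := start + countLines
		if end > len(lines) {
			end = len(lines)
		}
		events = append(events, strings.Join(lines[start:end], sep))
	}
	return events
}

func main() {
	lines := []string{"a", "b", "c", "d", "e", "f", "g"}
	for _, event := range aggregateByCount(lines, 5, false) {
		fmt.Printf("%q\n", event)
	}
}
```

With a count of 5 and seven input lines, the sketch produces one event of five lines and one of two. Passing `true` as the last argument joins the lines without a separator.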

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Closes #18038

@kvch kvch added the Filebeat, [zube]: In Review, and Team:Services (Deprecated) labels May 7, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@botelastic botelastic bot added the needs_team label May 7, 2020
@kvch kvch removed the needs_team label May 7, 2020
@elasticmachine
Collaborator

elasticmachine commented May 7, 2020

💔 Build Failed

Build stats

  • Build Cause: [Branch indexing]

  • Start Time: 2020-05-13T15:26:03.120+0000

  • Duration: 66 min 5 sec (3965111)

Steps errors

  • Name: Make check
    • Description: make check

    • Result: FAILURE

    • Duration: 10 min 5 sec

    • Start Time: 2020-05-13T16:21:19.528+0000

Log output

[2020-05-13T16:31:34.235Z] Stage "Packetbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:34.236Z] Stage "dockerlogbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:34.238Z] Stage "Winlogbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:34.239Z] Stage "Winlogbeat Windows x-pack" skipped due to earlier failure(s)
[2020-05-13T16:31:34.241Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:34.244Z] Stage "Journalbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:34.245Z] Stage "Generators" skipped due to earlier failure(s)
[2020-05-13T16:31:34.246Z] Stage "Kubernetes" skipped due to earlier failure(s)
[2020-05-13T16:31:35.003Z] Stage "Heartbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:35.005Z] Stage "Auditbeat oss" skipped due to earlier failure(s)
[2020-05-13T16:31:35.007Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:35.019Z] Stage "Metricbeat x-pack" skipped due to earlier failure(s)
[2020-05-13T16:31:35.028Z] Stage "Packetbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:35.029Z] Stage "dockerlogbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:35.038Z] Stage "Winlogbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:35.040Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:35.041Z] Stage "Journalbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:35.043Z] Stage "Generators" skipped due to earlier failure(s)
[2020-05-13T16:31:38.722Z] Failed in branch Elastic Agent x-pack
[2020-05-13T16:31:38.731Z] Failed in branch Elastic Agent x-pack Windows
[2020-05-13T16:31:38.732Z] Failed in branch Elastic Agent Mac OS X
[2020-05-13T16:31:38.733Z] Failed in branch Filebeat oss
[2020-05-13T16:31:38.734Z] Failed in branch Filebeat x-pack
[2020-05-13T16:31:38.751Z] Failed in branch Filebeat Mac OS X
[2020-05-13T16:31:38.753Z] Failed in branch Filebeat Windows
[2020-05-13T16:31:38.758Z] Failed in branch Auditbeat x-pack
[2020-05-13T16:31:38.759Z] Failed in branch Libbeat x-pack
[2020-05-13T16:31:38.779Z] Failed in branch Metricbeat OSS Unit tests
[2020-05-13T16:31:38.782Z] Failed in branch Metricbeat OSS Integration tests
[2020-05-13T16:31:39.151Z] Failed in branch Metricbeat Python integration tests
[2020-05-13T16:31:39.152Z] Failed in branch Metricbeat crosscompile
[2020-05-13T16:31:39.153Z] Failed in branch Metricbeat Mac OS X
[2020-05-13T16:31:39.153Z] Failed in branch Metricbeat Windows
[2020-05-13T16:31:39.155Z] Failed in branch Winlogbeat Windows x-pack
[2020-05-13T16:31:39.155Z] Failed in branch Kubernetes
[2020-05-13T16:31:40.197Z] Stage "Heartbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:40.200Z] Stage "Auditbeat oss" skipped due to earlier failure(s)
[2020-05-13T16:31:40.202Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:40.210Z] Stage "Metricbeat x-pack" skipped due to earlier failure(s)
[2020-05-13T16:31:40.213Z] Stage "Winlogbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:40.214Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:40.216Z] Stage "Generators" skipped due to earlier failure(s)
[2020-05-13T16:31:40.507Z] Failed in branch Packetbeat
[2020-05-13T16:31:40.508Z] Failed in branch dockerlogbeat
[2020-05-13T16:31:40.509Z] Failed in branch Journalbeat
[2020-05-13T16:31:41.220Z] Stage "Heartbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:41.222Z] Stage "Auditbeat oss" skipped due to earlier failure(s)
[2020-05-13T16:31:41.224Z] Stage "Libbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:41.226Z] Stage "Functionbeat" skipped due to earlier failure(s)
[2020-05-13T16:31:41.272Z] Stage "Generators" skipped due to earlier failure(s)
[2020-05-13T16:31:41.467Z] Failed in branch Metricbeat x-pack
[2020-05-13T16:31:41.469Z] Failed in branch Winlogbeat
[2020-05-13T16:31:42.139Z] Failed in branch Heartbeat
[2020-05-13T16:31:42.140Z] Failed in branch Libbeat
[2020-05-13T16:31:42.141Z] Failed in branch Functionbeat
[2020-05-13T16:31:42.142Z] Stage "Auditbeat oss" skipped due to earlier failure(s)
[2020-05-13T16:31:42.149Z] Stage "Generators" skipped due to earlier failure(s)
[2020-05-13T16:31:42.575Z] Failed in branch Auditbeat oss
[2020-05-13T16:31:42.607Z] Failed in branch Generators
[2020-05-13T16:31:43.166Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18352/src/github.com/elastic/beats
[2020-05-13T16:31:43.881Z] + find . -type f -name TEST*.xml -path */build/* -delete
[2020-05-13T16:31:44.003Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18352/src/github.com/elastic/beats/Lint
[2020-05-13T16:31:45.098Z] + cat
[2020-05-13T16:31:45.099Z] + /usr/local/bin/runbld ./runbld-script
[2020-05-13T16:31:45.099Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-05-13T16:31:51.898Z] runbld>>> runbld started
[2020-05-13T16:31:51.898Z] runbld>>> 1.6.11/a66728ff8f4356963772e6e6d2069392fa06acbe
[2020-05-13T16:31:54.368Z] runbld>>> The following profiles matched the job 'Beats/beats-beats-mbp/PR-18352' in order of occurrence in the config (last value wins).
[2020-05-13T16:31:55.654Z] runbld>>> Debug logging enabled.
[2020-05-13T16:31:55.654Z] runbld>>> Storing result
[2020-05-13T16:31:55.654Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-05-13T16:31:55.654Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200513163155-D97ADDEE
[2020-05-13T16:31:55.654Z] runbld>>> Adding system facts.
[2020-05-13T16:31:56.601Z] runbld>>> Adding vcs info for the latest commit:  5d8a73055efba2b81ad014991f5058bb9ab06280
[2020-05-13T16:31:56.997Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-05-13T16:31:56.997Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-05-13T16:31:56.997Z] Processing JUnit reports with runbld...
[2020-05-13T16:31:56.997Z] + echo 'Processing JUnit reports with runbld...'
[2020-05-13T16:31:57.394Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-05-13T16:31:57.394Z] runbld>>> DURATION: 11ms
[2020-05-13T16:31:57.394Z] runbld>>> STDOUT: 40 bytes
[2020-05-13T16:31:57.394Z] runbld>>> STDERR: 49 bytes
[2020-05-13T16:31:57.394Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-05-13T16:31:57.394Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18352/src/github.com/elastic/beats
[2020-05-13T16:31:58.785Z] runbld>>> Storing build metadata: 
[2020-05-13T16:31:58.785Z] runbld>>> Adding test report.
[2020-05-13T16:31:58.785Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18352/src/github.com/elastic/beats
[2020-05-13T16:31:59.599Z] runbld>>> Found 0 test output files
[2020-05-13T16:31:59.599Z] runbld>>> Test output logs contained: Errors: 0 Failures: 0 Tests: 0 Skipped: 0
[2020-05-13T16:31:59.599Z] runbld>>> Storing result
[2020-05-13T16:32:00.106Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-05-13T16:32:00.106Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200513163155-D97ADDEE
[2020-05-13T16:32:00.106Z] runbld>>> Email notification disabled by environment variable.
[2020-05-13T16:32:00.106Z] runbld>>> Slack notification disabled by environment variable.
[2020-05-13T16:32:06.456Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18352
[2020-05-13T16:32:06.843Z] [INFO] getVaultSecret: Getting secrets
[2020-05-13T16:32:07.006Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-05-13T16:32:08.503Z] + chmod 755 generate-build-data.sh
[2020-05-13T16:32:08.503Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-18352/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-18352/runs/9 FAILURE 3965111
[2020-05-13T16:32:09.414Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-18352/runs/9/steps/?limit=10000 -o steps-info.json

if config.Type == patternMode || config.Type == nil {
	return newMultilinePatternReader(r, separator, maxBytes, config)
} else if config.Type == countMode {
	return newMultilineCountReader(r, separator, maxBytes, config)
Contributor

I don't think it is necessary to introduce a completely new reader. A few small modifications to the existing multiline reader would do exactly what is requested.

Contributor Author

It is not introducing a new type. Only the validation of the config options is separated.
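
As an illustration of what separating the validation per mode could look like, here is a hedged Go sketch. The struct, its field names, and the error messages are assumptions made for this example, not the PR's actual config type. Only the option names `multiline.type`, `multiline.pattern`, `multiline.match`, and `multiline.count_lines` come from Filebeat's configuration.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	patternMode = "pattern"
	countMode   = "count"
)

// multilineConfig is a hypothetical, simplified stand-in for the real config.
type multilineConfig struct {
	Type       string // multiline.type; empty is treated as pattern mode here
	Pattern    string // multiline.pattern
	Match      string // multiline.match: "before" or "after"
	CountLines int    // multiline.count_lines
}

// validate checks only the options that matter for the selected mode.
func (c multilineConfig) validate() error {
	switch c.Type {
	case "", patternMode:
		if c.Pattern == "" {
			return errors.New("multiline.pattern must be set in pattern mode")
		}
		if c.Match != "before" && c.Match != "after" {
			return fmt.Errorf("unknown multiline.match value: %q", c.Match)
		}
	case countMode:
		if c.CountLines <= 0 {
			return errors.New("multiline.count_lines must be greater than zero in count mode")
		}
	default:
		return fmt.Errorf("unknown multiline.type: %q", c.Type)
	}
	return nil
}

func main() {
	configs := []multilineConfig{
		{Type: countMode, CountLines: 5},
		{Type: countMode},
		{Type: patternMode, Pattern: `^\s`, Match: "after"},
	}
	for _, c := range configs {
		fmt.Printf("%+v -> %v\n", c, c.validate())
	}
}
```

The point of the sketch is that a count-mode config does not need a pattern or a match setting, so those checks only run in pattern mode.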

@urso

urso commented Jun 2, 2020

Reading through the state machine, I wonder if the Reader type does too much. The type implements two different state machines and maintains a read buffer used to merge the lines. Its behavior depends on its initial configuration, and only a subset of the type's fields is actually used in each mode. This is what we have the reader.Reader interface for: to allow for some polymorphism.
Common functionality (e.g. merging the lines into the active 'read buffer') could be moved into a separate type.

@@ -163,6 +203,68 @@ func (mlr *Reader) readFirst() (reader.Message, error) {
}
}

func (mlr *Reader) readFirstCount() (reader.Message, error) {

For the pattern matcher we need readFirst, because we don't want to validate the line against the regular expression yet. But I don't think we need readFirstCount. We always check the number of lines and create an event once the configured number of lines has been reached.

@kvch kvch force-pushed the feature-libbeat-counter-multiline branch from 8adf22b to 3592c83 Compare June 9, 2020 08:09
@elasticmachine
Collaborator

elasticmachine commented Jun 9, 2020

💚 Build Succeeded

Build stats

  • Build Cause: [Pull request #18352 updated]

  • Start Time: 2020-06-17T08:56:10.648+0000

  • Duration: 74 min 35 sec

Test stats 🧪

Test Results

  • Failed: 0
  • Passed: 9441
  • Skipped: 1574
  • Total: 11015

urso
urso previously approved these changes Jun 9, 2020
libbeat/reader/multiline/counter.go (resolved)
libbeat/reader/multiline/counter.go (outdated, resolved)
libbeat/reader/multiline/multiline_test.go (resolved)
libbeat/reader/multiline/pattern.go (outdated, resolved)
libbeat/reader/multiline/line_buffer.go (outdated, resolved)
@urso urso dismissed their stale review June 9, 2020 12:21

I meant to comment, sorry.

@kvch kvch force-pushed the feature-libbeat-counter-multiline branch from 3592c83 to eb91372 Compare June 10, 2020 08:38
@kvch kvch force-pushed the feature-libbeat-counter-multiline branch from e4b0bcd to 914129f Compare June 15, 2020 11:33
@kvch kvch requested a review from urso June 15, 2020 16:35
libbeat/reader/multiline/counter.go (outdated, resolved)
libbeat/reader/multiline/counter.go (outdated, resolved)
@kvch kvch force-pushed the feature-libbeat-counter-multiline branch from 06a04dd to c6a5a66 Compare June 16, 2020 08:38
@kvch kvch requested a review from urso June 16, 2020 08:38
@urso urso left a comment

LGTM. Feel free to merge after fixing the panic.

libbeat/reader/multiline/counter.go (outdated, resolved)
@kvch kvch force-pushed the feature-libbeat-counter-multiline branch from 2eaa696 to 7da05b6 Compare June 17, 2020 08:55
@kvch kvch merged commit e3f51ab into elastic:master Jun 17, 2020
kvch added a commit to kvch/beats that referenced this pull request Jun 17, 2020
elastic#18352)

(cherry picked from commit e3f51ab)
@kvch kvch added the v7.9.0 label Jun 17, 2020
kvch added a commit that referenced this pull request Jun 17, 2020
…ate constant number of lines (#19243)

* Add new mode to multiline reader to aggregate constant number of lines (#18352)

(cherry picked from commit e3f51ab)
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
elastic#18352)

Labels
Filebeat, Team:Services (Deprecated), v7.9.0
Development

Successfully merging this pull request may close these issues.

Filebeat: multiline: introduce merge by using max-lines as condition instead of pattern
5 participants