
Support summary and detailed file metrics in filestream input #25045

Closed
wants to merge 2 commits

Conversation

kvch
Contributor

@kvch kvch commented Apr 13, 2021

What does this PR do?

This PR adds support for metrics in the filestream input. The summary of reader metrics is enabled by default, so the same numbers are reported as for the log input.

More detailed metrics are disabled by default. Users can enable them by setting detailed_metrics to true. These metrics are the same as what the log input provides, but as we have received numerous complaints about them not being configurable, I have separated them from the other metrics. Now users can configure whether they would like to collect these metrics or not.
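For illustration only, enabling the new option in a filebeat config might look like this. The detailed_metrics name comes from the PR description; the surrounding keys are ordinary filestream input settings, and the exact placement of the option is an assumption:

```yaml
filebeat.inputs:
  - type: filestream
    paths:
      - /var/log/*.log
    # Summary reader metrics are enabled by default; detailed per-file
    # metrics are opt-in (assumed placement of the new option):
    detailed_metrics: true
```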

Why is it important?

It makes the inner parts of filestream observable. Required for feature parity with the log input.

Checklist

  - [x] My code follows the style guidelines of this project
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have made corresponding changes to the documentation
  - [x] I have made corresponding changes to the default configuration files
  - [ ] I have added tests that prove my fix is effective or that my feature works
  - [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Apr 13, 2021
@kvch kvch changed the title Init reader metrics in filestream Support summary and detailed file metrics in filestream input Apr 13, 2021
@kvch kvch added the Team:Elastic-Agent Label for the Agent team label Apr 13, 2021
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Apr 13, 2021
@elasticmachine
Collaborator

elasticmachine commented Apr 13, 2021

💔 Tests Failed


Build stats

  • Build Cause: Pull request #25045 updated

  • Start Time: 2021-06-03T15:09:17.419+0000

  • Duration: 57 min 45 sec

  • Commit: b202b46

Test stats 🧪

Test Results
Failed 1
Passed 5235
Skipped 644
Total 5880


Test errors 1


Build&Test / filebeat-build / TestFilestreamCloseRemoved – github.com/elastic/beats/v7/filebeat/input/filestream

     === RUN   TestFilestreamCloseRemoved
    coverage: 59.0% of statements
    panic: test timed out after 10m0s
    
    goroutine 32 [running]:
    testing.(*M).startAlarm.func1()
    	/usr/local/go/src/testing/testing.go:1700 +0xe5
    created by time.goFunc
    	/usr/local/go/src/time/sleep.go:180 +0x45
    
    goroutine 1 [chan receive, 9 minutes]:
    testing.(*T).Run(0xc000102780, 0xcf3278, 0x1a, 0xd19300, 0x493d01)
    	/usr/local/go/src/testing/testing.go:1239 +0x2da
    testing.runTests.func1(0xc00025e900)
    	/usr/local/go/src/testing/testing.go:1511 +0x78
    testing.tRunner(0xc00025e900, 0xc0002dfda8)
    	/usr/local/go/src/testing/testing.go:1193 +0xef
    testing.runTests(0xc0005189a8, 0x12f7ea0, 0x31, 0x31, 0xc0265c61c6e68a63, 0x8bb35ebea3, 0x131a420, 0xce9b4f)
    	/usr/local/go/src/testing/testing.go:1509 +0x2fe
    testing.(*M).Run(0xc0000fd500, 0x0)
    	/usr/local/go/src/testing/testing.go:1417 +0x1eb
    main.main()
    	_testmain.go:203 +0x1c5
    
    goroutine 21 [sleep]:
    time.Sleep(0x989680)
    	/usr/local/go/src/runtime/time.go:193 +0xd2
    github.com/elastic/beats/v7/filebeat/input/filestream.(*inputTestingEnvironment).waitUntilEventCount(0xc000512280, 0x1)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/environment_test.go:303 +0x30
    github.com/elastic/beats/v7/filebeat/input/filestream.TestFilestreamCloseRemoved(0xc000102780)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/input_integration_test.go:140 +0x3f0
    testing.tRunner(0xc000102780, 0xd19300)
    	/usr/local/go/src/testing/testing.go:1193 +0xef
    created by testing.(*T).Run
    	/usr/local/go/src/testing/testing.go:1238 +0x2b3
    
    goroutine 26 [semacquire, 9 minutes]:
    sync.runtime_Semacquire(0xc000301fd8)
    	/usr/local/go/src/runtime/sema.go:56 +0x45
    sync.(*WaitGroup).Wait(0xc000301fd0)
    	/usr/local/go/src/sync/waitgroup.go:130 +0x65
    github.com/elastic/go-concert/unison.(*MultiErrGroup).Wait(0xc000301fb0, 0x0, 0x0, 0x0)
    	/go/pkg/mod/github.com/elastic/[email protected]/unison/multierrgroup.go:54 +0x53
    github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).Run(0xc0003d8460, 0xc000072e60, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/prospector.go:232 +0x3a5
    github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.(*managedInput).Run(0xc0003d84b0, 0xc000072e60, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/internal/input-logfile/input.go:79 +0x3cb
    github.com/elastic/beats/v7/filebeat/input/filestream.(*inputTestingEnvironment).startInput.func1(0xdd9818, 0xc0002b2500, 0xdd71f0, 0xc0003d84b0, 0xc000512280, 0xc000512310)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/environment_test.go:99 +0x14c
    created by github.com/elastic/beats/v7/filebeat/input/filestream.(*inputTestingEnvironment).startInput
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/environment_test.go:95 +0x9a
    
    goroutine 28 [chan receive, 9 minutes]:
    github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).Event(0xc0002b24c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/fswatch.go:229 +0x85
    github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).Run.func2(0xc0001a6b68, 0xc000042b80)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/prospector.go:126 +0x17f
    github.com/elastic/go-concert/unison.(*MultiErrGroup).Go.func1(0xc000301fb0, 0xc0002b2580)
    	/go/pkg/mod/github.com/elastic/[email protected]/unison/multierrgroup.go:42 +0x64
    created by github.com/elastic/go-concert/unison.(*MultiErrGroup).Go
    	/go/pkg/mod/github.com/elastic/[email protected]/unison/multierrgroup.go:40 +0x66
    
    goroutine 27 [select, 9 minutes]:
    github.com/elastic/go-concert/timed.Periodic(0x7feb4640c030, 0xc0002b2540, 0x4e94914f0000, 0xc00001aec0, 0x0, 0x0)
    	/go/pkg/mod/github.com/elastic/[email protected]/timed/timed.go:76 +0x118
    github.com/elastic/beats/v7/filebeat/input/filestream.(*fileWatcher).Run(0xc0002b24c0, 0x7feb4640c008, 0xc0002b2540)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/fswatch.go:129 +0x10b
    github.com/elastic/beats/v7/filebeat/input/filestream.(*fileProspector).Run.func1(0x0, 0x0)
    	/go/src/github.com/elastic/beats/filebeat/input/filestream/prospector.go:118 +0x9a
    github.com/elastic/go-concert/unison.(*MultiErrGroup).Go.func1(0xc000301fb0, 0xc0005c82e8)
    	/go/pkg/mod/github.com/elastic/[email protected]/unison/multierrgroup.go:42 +0x64
    created by github.com/elastic/go-concert/unison.(*MultiErrGroup).Go
    	/go/pkg/mod/github.com/elastic/[email protected]/unison/multierrgroup.go:40 +0x66
     
    

Steps errors 2


filebeat-build - mage build test
  • Took 20 min 34 sec
  • Description: mage build test
Error signal
  • Took 0 min 0 sec
  • Description: Error 'hudson.AbortException: script returned exit code 1'

Log output


[2021-06-03T16:05:27.362Z] 8.77s call     x-pack/filebeat/tests/system/test_xpack_modules.py::XPackTest::test_fileset_file_096_oracle
[2021-06-03T16:05:27.362Z] 8.74s call     x-pack/filebeat/tests/system/test_xpack_modules.py::XPackTest::test_fileset_file_382_gcp
[2021-06-03T16:05:27.362Z] ======================= 411 passed in 1737.98s (0:28:57) =======================
[2021-06-03T16:05:27.362Z] >> python test: Integration Testing Complete
[2021-06-03T16:05:30.995Z] Cleaning up /var/lib/jenkins/workspace/PR-25045-16-059d4f09-a11d-4449-b054-faa443f44470
[2021-06-03T16:05:30.995Z] Client: Docker Engine - Community
[2021-06-03T16:05:30.995Z]  Version:           20.10.3
[2021-06-03T16:05:30.995Z]  API version:       1.41
[2021-06-03T16:05:30.995Z]  Go version:        go1.13.15
[2021-06-03T16:05:30.995Z]  Git commit:        48d30b5
[2021-06-03T16:05:30.995Z]  Built:             Fri Jan 29 14:33:13 2021
[2021-06-03T16:05:30.995Z]  OS/Arch:           linux/amd64
[2021-06-03T16:05:30.995Z]  Context:           default
[2021-06-03T16:05:30.995Z]  Experimental:      true
[2021-06-03T16:05:30.995Z] 
[2021-06-03T16:05:30.995Z] Server: Docker Engine - Community
[2021-06-03T16:05:30.995Z]  Engine:
[2021-06-03T16:05:30.995Z]   Version:          20.10.3
[2021-06-03T16:05:30.995Z]   API version:      1.41 (minimum version 1.12)
[2021-06-03T16:05:30.995Z]   Go version:       go1.13.15
[2021-06-03T16:05:30.995Z]   Git commit:       46229ca
[2021-06-03T16:05:30.995Z]   Built:            Fri Jan 29 14:31:25 2021
[2021-06-03T16:05:30.995Z]   OS/Arch:          linux/amd64
[2021-06-03T16:05:30.995Z]   Experimental:     false
[2021-06-03T16:05:30.996Z]  containerd:
[2021-06-03T16:05:30.996Z]   Version:          1.4.4
[2021-06-03T16:05:30.996Z]   GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
[2021-06-03T16:05:30.996Z]  runc:
[2021-06-03T16:05:30.996Z]   Version:          1.0.0-rc93
[2021-06-03T16:05:30.996Z]   GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
[2021-06-03T16:05:30.996Z]  docker-init:
[2021-06-03T16:05:30.996Z]   Version:          0.19.0
[2021-06-03T16:05:30.996Z]   GitCommit:        de40ad0
[2021-06-03T16:05:30.996Z] Change ownership of all files inside the specific folder from root/root to current user/group
[2021-06-03T16:05:30.996Z] Unable to find image 'alpine:3.4' locally
[2021-06-03T16:05:31.576Z] 3.4: Pulling from library/alpine
[2021-06-03T16:05:31.838Z] c1e54eec4b57: Pulling fs layer
[2021-06-03T16:05:32.096Z] c1e54eec4b57: Verifying Checksum
[2021-06-03T16:05:32.096Z] c1e54eec4b57: Download complete
[2021-06-03T16:05:32.096Z] c1e54eec4b57: Pull complete
[2021-06-03T16:05:32.355Z] Digest: sha256:b733d4a32c4da6a00a84df2ca32791bb03df95400243648d8c539e7b4cce329c
[2021-06-03T16:05:32.355Z] Status: Downloaded newer image for alpine:3.4
[2021-06-03T16:05:34.257Z] Change permissions with write access of all files inside the specific folder
[2021-06-03T16:05:35.208Z] Running in /var/lib/jenkins/workspace/PR-25045-16-059d4f09-a11d-4449-b054-faa443f44470/src/github.com/elastic/beats/build
[2021-06-03T16:05:35.501Z] + rm -rf ve
[2021-06-03T16:05:35.501Z] + find . -type d -name vendor -exec rm -r {} ;
[2021-06-03T16:05:35.813Z] + python .ci/scripts/pre_archive_test.py
[2021-06-03T16:05:38.343Z] Copy ./x-pack/filebeat/build into build/x-pack/filebeat/build
[2021-06-03T16:05:38.354Z] Running in /var/lib/jenkins/workspace/PR-25045-16-059d4f09-a11d-4449-b054-faa443f44470/src/github.com/elastic/beats/build
[2021-06-03T16:05:38.368Z] Recording test results
[2021-06-03T16:05:39.481Z] [Checks API] No suitable checks publisher found.
[2021-06-03T16:05:39.892Z] + go clean -modcache
[2021-06-03T16:05:42.727Z] Cleaning up /var/lib/jenkins/workspace/PR-25045-16-059d4f09-a11d-4449-b054-faa443f44470
[2021-06-03T16:05:42.727Z] Change ownership of all files inside the specific folder from root/root to current user/group
[2021-06-03T16:05:52.696Z] Change permissions with write access of all files inside the specific folder
[2021-06-03T16:05:52.718Z] Running in /var/lib/jenkins/workspace/PR-25045-16-059d4f09-a11d-4449-b054-faa443f44470
[2021-06-03T16:05:56.948Z] + gsutil --version
[2021-06-03T16:05:58.383Z] Masking supported pattern matches of $FILE_CREDENTIAL
[2021-06-03T16:05:58.692Z] + gcloud auth activate-service-account --key-file ****
[2021-06-03T16:05:59.261Z] Activated service account credentials for: [[email protected]]
[2021-06-03T16:05:59.571Z] + gsutil -m -q cp -a public-read eC1wYWNrL2ZpbGViZWF0LWJ1aWxkYjIwMmI0Njc5NGVhNzA3OTFlOTMxYTg5MzYyYTQ1Nzc1MGNlMTQ5Mw gs://beats-ci-temp/ci/cache/
[2021-06-03T16:06:01.582Z] Stage "Extended" skipped due to earlier failure(s)
[2021-06-03T16:06:01.615Z] Stage "Packaging" skipped due to earlier failure(s)
[2021-06-03T16:06:01.652Z] Stage "Packaging-Pipeline" skipped due to earlier failure(s)
[2021-06-03T16:06:01.705Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-25045/src/github.com/elastic/beats
[2021-06-03T16:06:02.085Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-25045
[2021-06-03T16:06:02.128Z] [INFO] getVaultSecret: Getting secrets
[2021-06-03T16:06:02.168Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2021-06-03T16:06:02.916Z] + chmod 755 generate-build-data.sh
[2021-06-03T16:06:02.916Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-25045/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-25045/runs/16 FAILURE 3405235
[2021-06-03T16:06:02.916Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-25045/runs/16/steps/?limit=10000 -o steps-info.json
[2021-06-03T16:06:03.827Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-25045/runs/16/tests/?status=FAILED -o tests-errors.json

🐛 Flaky test report

❕ There are test failures but not known flaky tests.


Genuine test errors 1

💔 There are test failures but not known flaky tests, most likely a genuine test failure.

  • Name: Build&Test / filebeat-build / TestFilestreamCloseRemoved – github.com/elastic/beats/v7/filebeat/input/filestream

@kvch kvch added needs_backport PR is waiting to be backported to other branches. needs_docs labels Apr 14, 2021
@urso

urso commented Apr 14, 2021

+1 on metrics. I wonder if we could rather have the prospector provide the metrics? Then we can also report files that are not actively collected (e.g. due to harvester_limit).

Related discussion on organizing metrics for inputs: https://github.com/elastic/obs-dc-team/issues/401

@kvch
Contributor Author

kvch commented Apr 14, 2021

I thought the same thing, but I decided against it, because if we add metrics to the Prospector, they have to be added every time we introduce a new one. This way, it is implemented once in the reader, which is something we are not replacing with other types of readers in filestream. I also considered the HarvesterGroup, but that was too generic for these metrics.

Also, why would we want to report files that we are not collecting? Maybe we should add metrics for the file scanner?

@kvch kvch force-pushed the feature-filebeat-filestream-metrics branch from c4480de to 46d729a Compare April 14, 2021 15:55
@kvch kvch requested a review from urso April 14, 2021 15:55
@kvch kvch marked this pull request as ready for review April 15, 2021 08:51
@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@kvch
Contributor Author

kvch commented Apr 15, 2021

After sleeping on it, I moved global metrics to HarvesterGroup.

@kvch
Contributor Author

kvch commented Apr 15, 2021

Failing tests are unrelated.

@mergify
Contributor

mergify bot commented Apr 19, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feature-filebeat-filestream-metrics upstream/feature-filebeat-filestream-metrics
git merge upstream/master
git push upstream feature-filebeat-filestream-metrics

@urso

urso commented Apr 19, 2021

Also, why would we want to report files that we are not collecting? Maybe we should add metrics for the file scanner?

As user I'm curious to know about the amount of unprocessed data + if it is growing or shrinking.

backlog = sum(f.size for f in watched_files) - sum(f.offset for f in watched_files)

By not taking into account files that we do not harvest, we can't measure the total disk usage of the files we are watching, and we can't tell the exact backlog in case a harvester_limit has been configured. The files we actively read are just a subset of the files we are watching. The measurement should not only contain the files we are actively collecting. As long as a file is not removed from disk (or is still held by FB), it should be accounted for.
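The backlog formula above can be sketched in Go; `watchedFile` and its fields are hypothetical illustrations, not types from the PR:

```go
package main

import "fmt"

// watchedFile is a hypothetical view of one file tracked by the file
// watcher: its current size on disk and the reader's committed offset.
type watchedFile struct {
	size   int64 // bytes currently on disk
	offset int64 // bytes already read and acknowledged
}

// backlogBytes sums the unread bytes across all watched files, including
// files that are not actively harvested (e.g. held back by harvester_limit).
func backlogBytes(files []watchedFile) int64 {
	var total int64
	for _, f := range files {
		if f.size > f.offset { // a truncated file contributes nothing
			total += f.size - f.offset
		}
	}
	return total
}

func main() {
	files := []watchedFile{
		{size: 1000, offset: 400}, // actively harvested
		{size: 500, offset: 0},    // waiting, e.g. because of harvester_limit
	}
	fmt.Println(backlogBytes(files)) // prints 1100
}
```

Counting the waiting file as well is what makes the backlog exact even when harvester_limit keeps some files unread.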

Maybe we should add metrics for the file scanner?

That might be an option. Or the prospector. Essentially the entity that is responsible for filtering files by path.

@kvch
Contributor Author

kvch commented Apr 20, 2021

Also, why would we want to report files that we are not collecting? Maybe we should add metrics for the file scanner?

As user I'm curious to know about the amount of unprocessed data + if it is growing or shrinking.

backlog = sum(f.size for f in watched_files) - sum(f.offset for f in watched_files)

By not taking into account files that we do not harvest, we can't measure the total disk usage of the files we are watching, and we can't tell the exact backlog in case a harvester_limit has been configured. The files we actively read are just a subset of the files we are watching. The measurement should not only contain the files we are actively collecting. As long as a file is not removed from disk (or is still held by FB), it should be accounted for.

So if I understand correctly you want to collect detailed progressMetrics for all files that match the paths configuration, right?

@urso

urso commented Apr 20, 2021

So if I understand correctly you want to collect detailed progressMetrics for all files that match the paths configuration, right?

Yes. All files that match paths and are not filtered out because of ignore_older, exclude_files, include_files, or similar. Well, ignore_older is debatable. If a file is not actively collected, the metrics for the given file should have an indicator why/if it is not collected, so I can properly compute the backlog in bytes at any point in time.

@kvch kvch added the in progress Pull request is currently in progress. label Apr 20, 2021
@kvch kvch force-pushed the feature-filebeat-filestream-metrics branch from 507b602 to de08033 Compare April 21, 2021 12:47
@mergify
Contributor

mergify bot commented Apr 22, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feature-filebeat-filestream-metrics upstream/feature-filebeat-filestream-metrics
git merge upstream/master
git push upstream feature-filebeat-filestream-metrics

@kvch kvch force-pushed the feature-filebeat-filestream-metrics branch from 03d1110 to 367b2bc Compare April 22, 2021 16:58
@kvch kvch removed the in progress Pull request is currently in progress. label Apr 23, 2021
filebeat/_meta/config/filebeat.inputs.reference.yml.tmpl (outdated)
filebeat/input/filestream/input.go (outdated)
inp.detailedMetrics[srcID].updateCurrentSize(fi.Size())
return nil
})
}()

Alternatively to actively probing the file with fstat, we could define a function-based metric to save CPU. The function-based metric would query the size only on demand, and only if the last file size update is >10s old.

Contributor Author

What do you mean by "only if the last file size update is >10s"? The file was updated 10 seconds ago?


I mean that instead of pro-actively updating the metrics over and over again, we could make it lazy via the metrics callback. Instead of spinning up a go-routine, we check the state on disk lazily.
The callback would cache the last size (or maybe even get its update from the last file watcher scan), and only call fstat if the cached value has not been updated within the last 10 seconds.

By making it 'cheaper' when the metrics are not required, we don't really need a setting to enable/disable metrics support. It will just always be there.

inp.detailedMetrics[srcID] = newProgressMetrics(inp.ID, path, srcID, monitorCancel)
go func() {
timed.Periodic(monitorCtx, 30*time.Second, func() error {
fi, err := f.Stat()

This go-routine is not cleaned up when the harvester is shut down. That is, we will likely fail here because the file was already closed. Unfortunately, we shouldn't keep the file open, in order not to exhaust file descriptors. Alternatively, we might want the prospector/file watcher to update the size.

This reminds me: if possible, the reader should also update the internal 'size' once it hits EOF. That is, the reader should not read until EOF, but until the internal size. Once the limit is reached, use Stat to update the size and continue reading if the size has increased. This is e.g. helpful when reading from NFS, as the stat circumvents 0-byte reads by forcing extra roundtrips at the protocol level.

Contributor Author

Why isn't it cleaned up? The canceler is from the input.Context, so if the parent is cancelled, it should be cancelled as well. What did I miss?


Oh, I missed that the monitorCtx is derived from the input context. Nevermind.

There is still a potential race on f, as the overall input shutdown does not wait for the monitor to finish before the file is closed.

filebeat/input/filestream/input.go (outdated)
@kvch kvch force-pushed the feature-filebeat-filestream-metrics branch from 7fae87b to 6ebf0ba Compare May 17, 2021 12:44
@kvch
Contributor Author

kvch commented May 17, 2021

I have updated the PR.

"github.com/elastic/beats/v7/libbeat/common"
"github.com/elastic/beats/v7/libbeat/common/match"
"github.com/elastic/beats/v7/libbeat/reader/readfile"
)

// Config stores the options of a file stream.
type config struct {
ID string `config:"id"`

Can we document how ID is used?

Is it used for registry state?

Is it used for monitoring?

Contributor Author

Added documentation.

@@ -43,6 +45,8 @@ const pluginName = "filestream"

type state struct {
Offset int64 `json:"offset" struct:"offset"`

id string

The state is going to be stored in the registry file. Do we really need the ID here? Does the ID change?

go func() {
m := m
timed.Periodic(monitorCtx, 30*time.Second, func() error {
fi, err := f.Stat()

The (*filestream).Run method will close f on shutdown. This go-routine should be guaranteed to be stopped before Run closes the file.

f, err := inp.openFile(log, path, offset)
if err != nil {
return nil, err
}

inp.startMonitoring(log, canceler, srcID, path, f)

maybe we want to move startMonitoring to Run, in order to make it more obvious which tasks an input is running. This would require us to split open into openFile and newReader.

@@ -48,25 +48,36 @@ type Harvester interface {
// Run is the event loop which reads from the source
// and forwards it to the publisher.
Run(input.Context, Source, Cursor, Publisher) error
// Monitor is required for detailed metrics.
Monitor(input.Context, Source)

Instead of having a Monitor per harvester, I wonder if we should have the file watcher do the reporting. The file watcher would report the size update and we would find and update the actual state.

@mergify
Contributor

mergify bot commented May 18, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feature-filebeat-filestream-metrics upstream/feature-filebeat-filestream-metrics
git merge upstream/master
git push upstream feature-filebeat-filestream-metrics

@kvch kvch force-pushed the feature-filebeat-filestream-metrics branch from 160affa to b202b46 Compare June 3, 2021 15:08
@kvch kvch mentioned this pull request Jun 4, 2021
18 tasks
@mergify
Contributor

mergify bot commented Jun 24, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feature-filebeat-filestream-metrics upstream/feature-filebeat-filestream-metrics
git merge upstream/master
git push upstream feature-filebeat-filestream-metrics

@botelastic

botelastic bot commented Jul 24, 2021

Hi!
We just realized that we haven't looked into this PR in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Jul 24, 2021
@botelastic

botelastic bot commented Aug 23, 2021

Hi!
This PR has been stale for a while and we're going to close it as part of our cleanup procedure.
We appreciate your contribution and would like to apologize if we have not been able to review it, due to the current heavy load of the team.
Feel free to re-open this PR if you think it should stay open and is worth rebasing.
Thank you for your contribution!

@botelastic botelastic bot closed this Aug 23, 2021
@kvch
Contributor Author

kvch commented Dec 8, 2021

I am reopening the PR because metrics support still has to be added to filestream.

@kvch kvch reopened this Dec 8, 2021
@botelastic botelastic bot removed the Stalled label Dec 8, 2021
@mergify
Contributor

mergify bot commented Dec 8, 2021

This pull request does not have a backport label. Could you fix it @kvch? 🙏
To fix up this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v\d.\d.\d is the label to automatically backport to the 7.\d branch, where \d is a digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Dec 8, 2021
@kvch kvch added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Dec 8, 2021
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@jlind23
Collaborator

jlind23 commented Jan 6, 2022

@kvch is this PR pending for review?

@mergify mergify bot assigned kvch Feb 8, 2022
@kvch
Contributor Author

kvch commented Apr 4, 2022

Stalled; we will need time to clean this up.

@kvch kvch closed this Apr 4, 2022