I think the Fluent Bit community should work towards a higher bar for releases, to ensure stability and improve user confidence.
The most common use case for Fluent Bit users is collecting k8s log files. It would be really cool if we had automated testing prior to releases that did the following:
- deploy the release candidate to a k8s node and collect logs
- use the kubernetes filter to decorate the records with metadata
- some of the logs should be multiline
- testing custom parsers would be ideal as well
- as time goes on, we can add other common use cases
- send the logs via some open source, non-vendor output plugin, like forward or http; the destination receiving the logs should validate that every log emitted by the k8s applications arrived, carries k8s metadata, and is in the right format (a config sketch follows this list)
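For illustration, the DaemonSet under test could run something like the sketch below. This is only a hypothetical starting point; the validator hostname, port, and parsers file name are placeholders, not an agreed design.

```
[SERVICE]
    Flush         1
    # exercise custom parsers from a test-owned parsers file
    Parsers_File  custom_parsers.conf

[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Tag               kube.*
    # built-in multiline handling for docker/cri container logs
    multiline.parser  docker, cri

[FILTER]
    Name       kubernetes
    Match      kube.*
    Merge_Log  On
    Keep_Log   Off

[OUTPUT]
    Name   forward
    Match  *
    # placeholder service that receives and validates the test logs
    Host   log-validator.test.svc.cluster.local
    Port   24224
```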
This way, we test each release candidate against real-world use cases before it ships.
We could have two types of tests:
Performance tests: send logs at a reasonably high rate for a short period of time and check that they all end up at the destination. We should set a minimum performance bar for each release. As time goes on, this could be expanded into automated benchmarking for releases: we measure the maximum throughput of each release in some common use case, require it to meet the minimum bar, and publish the final result (which should be above that bar) in the release notes.
Stability tests: run Fluent Bit in the k8s cluster for some non-trivial period of time; the test fails if it crashes or restarts. For patch/bug-fix releases we can use a short window, so these tests can run overnight. For minor releases with new features we would set a higher bar, e.g. Fluent Bit must run without restarts for 3-5 days. (A minimal sketch of both checks follows below.)
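As a concrete starting point, a release gate covering both checks could look roughly like this sketch. The namespace, pod label, throughput bar, and the way the received count and test duration are obtained are all assumptions for illustration, not an agreed design.

```python
#!/usr/bin/env python3
"""Hypothetical release gate: fail if any Fluent Bit pod restarted during the
soak window, or if measured delivery throughput is below a minimum bar."""
import json
import subprocess
import sys

NAMESPACE = "logging"          # assumed test namespace
LABEL = "app=fluent-bit"       # assumed pod label for the DaemonSet under test
MIN_RECORDS_PER_SEC = 10_000   # assumed minimum performance bar

def restart_counts():
    """Total container restarts per Fluent Bit pod, read via kubectl."""
    out = subprocess.check_output(
        ["kubectl", "get", "pods", "-n", NAMESPACE, "-l", LABEL, "-o", "json"]
    )
    pods = json.loads(out)["items"]
    return {
        p["metadata"]["name"]: sum(
            cs.get("restartCount", 0)
            for cs in p["status"].get("containerStatuses", [])
        )
        for p in pods
    }

def gate(received_count, duration_sec):
    failures = []

    # stability: any restart during the soak window fails the gate
    restarted = {name: c for name, c in restart_counts().items() if c > 0}
    if restarted:
        failures.append(f"pods restarted during soak: {restarted}")

    # performance: measured delivery rate must meet the minimum bar
    rate = received_count / duration_sec
    if rate < MIN_RECORDS_PER_SEC:
        failures.append(f"throughput {rate:.0f} rec/s is below bar {MIN_RECORDS_PER_SEC}")

    for failure in failures:
        print("FAIL:", failure)
    return 1 if failures else 0

if __name__ == "__main__":
    # received count and duration would be reported by the destination side of the test
    sys.exit(gate(received_count=int(sys.argv[1]), duration_sec=float(sys.argv[2])))
```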
Agreed, I'm looking at general improvements under this change: #3753
- Staging build automation
- Testing of staging build <-- insert the suggestions here
- Promotion of staging to release
It includes some level of testing for releases, although the tests above are more specifically resilience and performance tests. I agree these should feed in: essentially there is some minimum level of validation for staging builds, and then we trigger these longer-running tests on those staging builds before approving the release.
The actual test cases can also be used as a form of validation for user infrastructure, i.e. run them in-situ to help identify any issues there.
I agree with keeping verification vendor-agnostic, although it would also be useful to include some level of verification for common output plugins. We'll probably need a monotonic counter or similar in the output messages to verify that all messages were received (coping with out-of-order retries too); a sketch of that check follows below.
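As a sketch of that check, the receiver side could assert completeness and metadata roughly as follows. The field names (`emitter`, `seq`, `kubernetes`), the expected count, and the `received.jsonl` file are assumptions for illustration only.

```python
"""Hypothetical completeness check for the test destination: every test log
record carries an emitter id and a monotonically increasing sequence number.
Tracking a set per emitter copes with out-of-order delivery and duplicate
retries; any gap means a record was lost."""
import json
from collections import defaultdict

def verify(records, expected_per_emitter):
    seen = defaultdict(set)
    missing_metadata = 0
    for rec in records:
        # the kubernetes filter should have attached a `kubernetes` map
        if "pod_name" not in rec.get("kubernetes", {}):
            missing_metadata += 1
        seen[rec["emitter"]].add(int(rec["seq"]))
    gaps = {}
    for emitter, seqs in seen.items():
        missing = set(range(expected_per_emitter)) - seqs
        if missing:
            gaps[emitter] = sorted(missing)
    return gaps, missing_metadata

if __name__ == "__main__":
    # one JSON object per line, as written by the forward/http receiver
    with open("received.jsonl") as f:
        records = [json.loads(line) for line in f]
    gaps, missing_metadata = verify(records, expected_per_emitter=100_000)
    assert not gaps, f"missing sequence numbers: {gaps}"
    assert missing_metadata == 0, f"{missing_metadata} records lack k8s metadata"
    print(f"PASS: {len(records)} records verified")
```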
If we have the basic framework in place, it will be easy to evolve it through user-submitted PRs for new test cases, targets, etc., benefiting everyone.