This issue was moved to a discussion. You can continue the conversation there (see discussion #4390).


Open Source Integration Testing as part of upstream releases #4389

Closed
PettitWesley opened this issue Dec 3, 2021 · 4 comments

@PettitWesley (Contributor)

I think the Fluent Bit community should work toward a higher bar for releases, to ensure stability and improve user confidence.

The most common use case for Fluent Bit users is collecting k8s log files. It would be really cool if we had automated testing prior to releases that did the following:

  • deploy the release candidate to a k8s node and collect logs
  • use the kubernetes filter to decorate records with metadata
  • include multiline logs in the test workload
  • ideally, also exercise custom parsers
  • send the logs via an open source, non-vendor output plugin, such as forward or http; the destination receiving the logs should validate that every log emitted by the k8s applications arrived, carries k8s metadata, and is in the right format
  • as time goes on, add other common use cases

This way, we test each release candidate against real-world use cases before releasing it.
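The pipeline described above might look something like the following classic Fluent Bit configuration sketch (the paths, hostname, and port are illustrative assumptions, not part of the proposal):

```ini
[SERVICE]
    Flush        1
    Parsers_File parsers.conf

# Tail container logs on the node; multiline.parser reassembles
# lines split by the container runtime (docker or CRI).
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Tag               kube.*
    multiline.parser  docker, cri

# Decorate each record with pod/namespace/label metadata.
[FILTER]
    Name   kubernetes
    Match  kube.*

# Ship to a vendor-neutral destination that validates the results.
[OUTPUT]
    Name   forward
    Match  kube.*
    Host   validator.test.internal
    Port   24224
```

The validator behind the forward output would then check completeness and metadata, independent of any vendor backend.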

We could have two types of tests:

  1. Performance tests: Send logs at a decently high rate for a short period of time and check that they all end up at the destination. We should set a minimum performance bar for each release. Over time, this could be expanded into automated benchmarking for releases: measure the maximum throughput of each release in some common use case, require it to clear the minimum bar, and publish the final result (which should be above that bar) in the release notes.
  2. Stability tests: Run Fluent Bit in the k8s cluster for some non-trivial period of time; the test fails if it crashes or restarts. For patch/bug-fix releases, we can set a small time frame so the tests can run overnight. For minor version releases with new features, we would set a higher bar, such as requiring Fluent Bit to run without restarts for 3-5 days.
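The performance gate in (1) could be as simple as the destination checking completeness and effective throughput against the agreed minimum bar. A minimal sketch (function name and the numbers in the example are placeholders, not proposed values):

```python
def meets_performance_bar(records_received: int,
                          records_sent: int,
                          duration_secs: float,
                          min_records_per_sec: float) -> bool:
    """Pass only if nothing was lost and throughput clears the bar."""
    if records_received != records_sent:
        return False  # any log loss is an automatic failure
    throughput = records_received / duration_secs
    return throughput >= min_records_per_sec

# Example: 600,000 records in 60s = 10,000 rec/s against a 5,000 rec/s bar.
print(meets_performance_bar(600_000, 600_000, 60.0, 5_000.0))  # True
```

The same receipt counters can drive the stability test in (2): run the workload continuously and fail the release if the pod restart count ever rises.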
@PettitWesley (Contributor, Author)

CC @zhonghui12

We will bring this open source (non-vendor) focused testing idea up in the next Fluent Community meeting.

@agup006 (Member) commented Dec 3, 2021

Adding @patrick-stephens, who's looking at this from the Calyptia side.

@patrick-stephens (Contributor)

Agreed, I'm looking at general improvements under this change: #3753

  • Staging build automation
  • Testing of staging build <-- insert the suggestions here
  • Promotion of staging to release

It includes some level of testing for releases, although the tests above are more specifically resilience and performance tests. I agree these should feed in: there would be some minimum level of validation for staging builds, and then we would trigger these longer-running tests on those staging builds before approving the release.

The actual test cases can also be used as a form of validation for user infrastructure, i.e. run them in-situ to help identify any issues there.

I agree with keeping verification vendor-agnostic, although it would also be useful to include some level of verification for common output plugins. We'll probably need a monotonically increasing count (or similar) in the output messages to verify that all messages were received, while also coping with out-of-order retries.
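The monotonic-count idea could work like this: the test app stamps each record with an increasing sequence number, and the receiver compares the set of numbers it saw against the expected range only at the end of the run, so duplicates and out-of-order retries don't matter. A hypothetical sketch:

```python
def find_missing(received_seqs, expected_total):
    """Return sequence numbers in 0..expected_total-1 never received.

    Duplicates (from retries) and arbitrary arrival order are tolerated
    because only set membership is checked at the end of the run.
    """
    seen = set(received_seqs)
    return sorted(set(range(expected_total)) - seen)

# Out-of-order delivery with a duplicate retry of 2, but 3 was lost:
print(find_missing([4, 0, 2, 1, 2], 5))  # [3]
```

An empty result means every message arrived at least once; a non-empty result pinpoints exactly which records were dropped.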

If we have the basic framework in place, it will be easy to evolve through user-submitted PRs for new test cases, targets, etc., benefiting everyone.

@patrick-stephens patrick-stephens self-assigned this Dec 3, 2021
@agup006 (Member) commented Dec 3, 2021

I'm going to convert this issue to a discussion.

@fluent fluent locked and limited conversation to collaborators Dec 3, 2021
@agup006 agup006 converted this issue into discussion #4390 Dec 3, 2021
