Improve package building and testing. #3753
Comments
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
I have a limited POC going now that does all this in a single repo with an action: it pushes packages to S3 and images to GHCR. These are all staging artifacts; the next stage is to test and "bless" them, i.e. release.
Further discussion with @niedbalski has clarified a few things:
S3 should not be used for releases. Many users and customers have restricted access to S3 buckets and have whitelisted fluentbit domains to allow mirroring the repos locally. We should continue using the native repos.
S3 can handle custom domains, so the domain mapping shouldn't change. Any existing whitelist related to packages.fluentbit.io and apt.fluentbit.io should remain the same; in fact, we are aiming for the release bucket to keep exactly the same layout/structure without changes. Enabling S3 has many benefits for us, including CDN, replication, backup, simpler releases, etc.
The current plan therefore is to use a parallel workflow: we maintain the current process but also start producing the S3 bucket for release so we can evaluate it. We also need to ensure build times are kept low, possibly by using a self-hosted runner.
Here is my take for testing on top of staging:
Agreed, I think for golden config I'll add a In fact, the default config might be fine - it's a shame that the server is not defaulted to running (I know people get tripped up on the helm chart healthchecks by this). It does CPU and |
The staging build is almost there now. I am just resolving some GPG signing issues, but it should present an S3 bucket with all the repos set up correctly. Container images are built, scanned (Trivy + Dockle) and signed (Cosign) before staging to ghcr.io. Container testing as per the above is in place: verify each architecture image locally, then use the Helm chart to verify a K8S deployment (whatever is the default in KIND when run). Package verification is in progress using kitchen-dokken: OS-based images for each target have the package installed and then we verify the service is running.
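The "install the package, then verify the service is running" step can be sketched roughly as below. This is not the kitchen-dokken suite itself, only an illustration of the check: the function name `service_is_active` and its injectable `run` parameter are hypothetical, and the unit name passed in is an assumption that depends on what the package installs.

```python
import subprocess
from typing import Callable


def service_is_active(
    service: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if systemd reports the given unit as active.

    `systemctl is-active --quiet <unit>` exits with status 0 only when the
    unit is active, so inspecting the return code is enough.
    The `run` parameter is a hypothetical hook so the check can be exercised
    without a real systemd present.
    """
    result = run(["systemctl", "is-active", "--quiet", service], check=False)
    return result.returncode == 0
```

Inside a test image this would be called after the RPM/Deb install, for example `service_is_active("fluent-bit")`, where the unit name is assumed rather than taken from this thread.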
We will also look to trigger downstream integration and soak tests in staging to verify more things. @niedbalski We should incorporate the suggestions here: #4389
In regards to integration testing:
[0] https://github.com/calyptia/fluent-bit-ci/blob/main/.github/workflows/main-gcp.yaml#L7
@patrick-stephens As a reference for the build/release to staging workflows.
For 4, that is covered by the private mirror due to the security concerns. |
* Addresses #3753

New workflows added to automate the build and test of releases using the new staging environment. No changes made to the current process, to ensure we can keep using it.

Build & test of packaging
Packages are built to staging in an S3 bucket: https://fluentbit-staging.s3.amazonaws.com
We then verify the packages using kitchen-dokken to spin up OS images as containers, install the relevant RPM/Deb and check the service is running properly. We are testing that the packaging process is correct.
Containers are built to GitHub Container Registry, ghcr.io, using multi-arch manifests. Container tests then verify each architecture runs locally as well as a simple Helm deployment on KIND.
All package and container build definitions are brought into the repo from external sources. Containers were in this repo and packages were not, so the two are now handled identically, and having them together makes them a lot easier to manage and use.

Security
Trivy and Dockle scanning added. It ignores current failures, so these should be reviewed and addressed as needed. Hadolint and Shellcheck really should be used too but this can be a separate PR.
Cosigning of container images if a key is provided, and using the experimental keyless option too. GPG signing of binary packages as well, as normal.

Additional work
Initial promotion from staging to release provided using a new release environment for approval - this needs creating.
Initial multi-arch container image definition and workflow also added.
Follow-up PRs to improve testing, build on self-hosted and cover the promotion to release process. Trying to prevent a big bang and reduce review overhead.

Infra updates
Create release and staging environments.
Create the following secrets:
AWS_S3_BUCKET_STAGING
AWS_S3_BUCKET_RELEASE
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
COSIGN_PRIVATE_KEY
COSIGN_PASSWORD (optional if the private key does not require one)
COSIGN_PUBLIC_KEY
FLUENTBITIO_HOST
FLUENTBITIO_USERNAME
FLUENTBITIO_SSHKEY
GPG_PRIVATE_KEY
We can actually start breaking these secrets up into the two environments.

Signed-off-by: Patrick Stephens <[email protected]>
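As a rough illustration of the secrets listed in the commit message above, a pre-flight check could report which ones are absent before a workflow runs. The helper `missing_secrets` is hypothetical and not part of the workflows. Treating every name other than COSIGN_PASSWORD as required is an assumption of this sketch, since the commit message also says cosigning only happens if a key is provided.

```python
import os
from typing import List, Mapping

# Secret names copied from the commit message; treating them all as
# required is an assumption made for this sketch.
REQUIRED_SECRETS = (
    "AWS_S3_BUCKET_STAGING",
    "AWS_S3_BUCKET_RELEASE",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "COSIGN_PRIVATE_KEY",
    "COSIGN_PUBLIC_KEY",
    "FLUENTBITIO_HOST",
    "FLUENTBITIO_USERNAME",
    "FLUENTBITIO_SSHKEY",
    "GPG_PRIVATE_KEY",
)

# Described as optional when the private key does not require a password.
OPTIONAL_SECRETS = ("COSIGN_PASSWORD",)


def missing_secrets(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required secret names that are absent or empty in `env`."""
    return [name for name in REQUIRED_SECRETS if not env.get(name)]
```

Calling `missing_secrets()` with no argument inspects the process environment; passing a plain dict makes it easy to check the staging and release environments separately once the secrets are split between them.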
Need to add resilience and performance testing: #4390
We also need to support package downgrade, i.e. official --> staging --> official, with the service still working afterwards. More distributions should be tested too.
Working on adding the release promotion job now:
Packages (RPM + Deb) look OK now; working on the container release next.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the |
Is this OK to close?
Problem Description
The current workflow for building packages is mostly manual. We have some automated testing in place, namely this workflow [0].
Publication isn't automated and we don't have a staging repository to test installs and upgrades to the release bucket.
Proposed solution
distributions and architectures. Sanity testing should include:
[0] https://github.com/fluent/fluent-bit/blob/master/.github/workflows/build-release.yaml
Known Limitations