Discussion Wanted: Truly anonymised telemetry for prowler OSS. #6389
Context
This PR contains code showing how I'd like to collect anonymous usage data for prowler.
I wanted to "show my working" with respect to privacy. Metrics and telemetry are a sensitive topic, and I believe I've been careful to collect only non-sensitive data that can help improve the tool while respecting privacy.
Please take a look and feel free to discuss either in this Pull Request or in our community Slack at https://goto.prowler.com/slack 👍
In the PR's code, I've suggested collecting the following data.
Basic execution info:
Aggregated result numbers
Anonymous feature usage:
On top of this, a list of failed checks is collected, containing ONLY the check_id name. The reasoning for this:
To my mind, the check IDs are generic identifiers (like "iam_user_mfa_enabled") and don't contain any sensitive information about the actual resources or findings. Conversely, we do not collect custom check IDs, as these may be named more sensitively.
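To make the distinction concrete, here is a minimal sketch of filtering failed findings down to built-in check IDs only. The `Finding` class, its fields, and `failed_builtin_check_ids` are hypothetical names for illustration and are not taken from the PR's code or prowler's API.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical finding record; not prowler's real data model."""
    check_id: str
    status: str        # "PASS" or "FAIL"
    resource_arn: str  # sensitive, never placed in the telemetry payload


def failed_builtin_check_ids(findings, builtin_check_ids):
    """Return sorted, de-duplicated IDs of failed built-in checks.

    Custom check IDs (anything not in builtin_check_ids) are dropped,
    because their names may reveal something about the user's environment.
    """
    return sorted(
        {
            f.check_id
            for f in findings
            if f.status == "FAIL" and f.check_id in builtin_check_ids
        }
    )


# Usage: only the generic, built-in ID survives the filter.
builtin = {"iam_user_mfa_enabled", "s3_bucket_public_access"}
findings = [
    Finding("iam_user_mfa_enabled", "FAIL", "arn:aws:iam::111111111111:user/a"),
    Finding("iam_user_mfa_enabled", "FAIL", "arn:aws:iam::111111111111:user/b"),
    Finding("s3_bucket_public_access", "PASS", "arn:aws:s3:::bucket"),
    Finding("acme_payroll_db_custom_check", "FAIL", "arn:aws:rds:::db"),
]
reported = failed_builtin_check_ids(findings, builtin)
```

Only the ID strings leave the function; resource identifiers and custom check names never appear in the result.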
Automatic and manual disabling of telemetry.
Continuing the "open, transparent, privacy-first" theme, the code is designed to sacrifice telemetry over anything else:
--no-telemetry will also disable the telemetry.
Description
For easier readability/discussion, the telemetry code is currently in a function in prowler/prowler/main.py, and called at the end of the file before the exit() block. It will be moved to a utils file before a PR is considered for merging.
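As a rough illustration of the "sacrifice telemetry over anything else" design and the `--no-telemetry` flag mentioned above, a fail-silent sender might look like the sketch below. The function names, the `url` parameter, and the timeout value are assumptions for this example and do not reflect the PR's actual code.

```python
import argparse
import json
import urllib.request


def telemetry_enabled(argv):
    """Return False when --no-telemetry appears in argv (hypothetical helper)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--no-telemetry", action="store_true")
    args, _unknown = parser.parse_known_args(argv)
    return not args.no_telemetry


def send_telemetry(payload, url, enabled, timeout=2.0):
    """Best-effort send: returns True if the POST completed, else False.

    Any failure (bad URL, network error, serialisation error) is swallowed,
    so telemetry can never break or delay the actual scan.
    """
    if not enabled:
        return False
    try:
        data = json.dumps(payload).encode("utf-8")
        request = urllib.request.Request(
            url,
            data=data,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except Exception:
        # Telemetry is dropped silently rather than surfacing an error.
        return False
```

Called just before the program exits, a helper like this keeps the scan's exit code and output independent of whether the telemetry endpoint is reachable.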
Tests are being worked on, as is a new documentation page which will be added to this PR.
The "receiving end" of the telemetry will also be open source, so that users can (A) see the global trends of prowler for themselves and (B) confirm, on both the client and server side of the codebase, that no other data is being collected.
Checklist
License
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.