From 44a465148ca5f31a611b0e892611166b0312ded9 Mon Sep 17 00:00:00 2001
From: Anton Pirker
Date: Fri, 27 Jan 2023 10:04:23 +0100
Subject: [PATCH 1/6] rfc(decision): Document sensitive data collected

---
 README.md | 1 +
 .../0070-document-sensitive-data-collected.md | 35 +++++++++++++++++++
 2 files changed, 36 insertions(+)
 create mode 100644 text/0070-document-sensitive-data-collected.md

diff --git a/README.md b/README.md
index e9d2786d..efe56398 100644
--- a/README.md
+++ b/README.md
@@ -30,3 +30,4 @@ This repository contains RFCs and DACIs. Lost?
 - [0047-introduce-profile-context](text/0047-introduce-profile-context.md): Add Profile Context
 - [0048-move-replayid-out-of-tags](text/0048-move-replayid-out-of-tags.md): Plan to replace freight with GoCD
+- [0070-document-sensitive-data-collected](text/0070-document-sensitive-data-collected.md): Document sensitive data collected
diff --git a/text/0070-document-sensitive-data-collected.md b/text/0070-document-sensitive-data-collected.md
new file mode 100644
index 00000000..6dba8316
--- /dev/null
+++ b/text/0070-document-sensitive-data-collected.md
@@ -0,0 +1,35 @@
+- Start Date: 2023-01-27
+- RFC Type: decision
+- RFC PR: https://github.com/getsentry/rfcs/pull/70
+- RFC Status: draft
+
+# Summary
+
+One paragraph explanation of the feature or document purpose.
+
+# Motivation
+
+Why are we doing this? What use cases does it support? What is the expected outcome?
+
+# Background
+
+The reason this decision or document is required. This section might not always exist.
+
+# Supporting Data
+
+[Metrics to help support your decision (if applicable).]
+
+# Options Considered
+
+If an RFC does not know yet what the options are, it can propose multiple options. The
+preferred model is to propose one option and to provide alternatives.
+
+# Drawbacks
+
+Why should we not do this? What are the drawbacks of this RFC or a particular option if
+multiple options are presented.
+
+# Unresolved questions
+
+- What parts of the design do you expect to resolve through this RFC?
+- What issues are out of scope for this RFC but are known?

From 4f4bd7f7508af80c6ed2de94cec6b7eed20af7b2 Mon Sep 17 00:00:00 2001
From: Anton Pirker
Date: Fri, 27 Jan 2023 12:00:26 +0100
Subject: [PATCH 2/6] Initial version of RFC

---
 .../0070-document-sensitive-data-collected.md | 69 +++++++++++++++----
 1 file changed, 57 insertions(+), 12 deletions(-)

diff --git a/text/0070-document-sensitive-data-collected.md b/text/0070-document-sensitive-data-collected.md
index 6dba8316..897b7953 100644
--- a/text/0070-document-sensitive-data-collected.md
+++ b/text/0070-document-sensitive-data-collected.md
@@ -5,31 +5,76 @@
 
 # Summary
 
-One paragraph explanation of the feature or document purpose.
+We need an exact but concise documentation on what sensitive data our SDKs collect. This should be available in the SDKs documentation on docs.sentry.io and be specific to all the integrations that each SDK supports.
+
+This RFC is related to [RFC-0062 Controlling PII and Credentials in SDKs](https://github.com/getsentry/rfcs/pull/62).
 
 # Motivation
 
-Why are we doing this? What use cases does it support? What is the expected outcome?
+We collect a lot of data, and transparency creates trust. This documentation will make it easier for customers to choose Sentry because they know that their users data is in good hands.
+It will also make it easier for our customers to be GDPR compliant. Companies that deal with data related to persons in the european union need to create a record of what data they are processing.
+This documentation will make our customers lifes way easier while creating these records.
+This will probably be a big selling point for larger customers.
 
 # Background
 
-The reason this decision or document is required. This section might not always exist.
+After a data incident and a meeting with legal, we said that we need to take data issues to the next level.
 
-# Supporting Data
+# Options Considered
 
-[Metrics to help support your decision (if applicable).]
+## A) Table in docs of each integration
 
-# Options Considered
+Have a hand written (and maintained) table in the description that shows people in an easy to grasp way what data is collected. It also shows how the data collection is changed when certain options (like `sendDefaultPII`) are changed.
+
+Here a example on how this could look like:
+https://sentry-docs-git-antonpirker-python-fastapi-sensitive-data.sentry.dev/platforms/python/guides/fastapi/#data-collected
+
+The elements in the table can be different for different kinds (frontend, backend, mobile) of SDKS.
+
+Here a list of all sensitive data that is collected:
+
+- HTTP Headers (`event.request.headers`)
+- HTTP Cookies (`event.request.cookies`)
+- HTTP Request Body (`event.request.data`)
+- Log Entry Params (`event.logentry.params`)
+- Logged in User (`event.user`)
+- Breadcrumb Values (`event.breadcrumbs.values -> value.data`)
+- Local vars in Exceptions (`event.exception.values -> value.stacktrace.frames -> frame.vars`)
+- Span Data (`event.spans -> span.data`)
+
+Pros:
+
+- Easy understandable and nice to read documentation
+
+Cons:
+
+- Documentation need to be kept up to date with seperate PR in `sentry-docs` repo when changes to SDK are made
+- Documentation for different versions of the SDK not solved yet
+
+## B) Automatic documentation creation
+
+If we go with _Option B)_ in [RFC-0062 Controlling PII and Credentials in SDKs](https://github.com/getsentry/rfcs/pull/62) we could add doc strings in the code of the implemented `EventScrubber` and then generate documentation from this code to render a table similar to the one in Option A) in this RFC.
+
+Pros:
+
+- Generated from code, so it should be always up to date
+- Possible to render docs for different versions of the SDK
+
+Cons:
+
+- Doc strings in code need to be kept up to date.
+- Need to write tooling for exporting doc string from all SDKs to be able to include the generated documentation into docs.sentry.io
+
+## C) \*\*please suggest\*\*
 
-If an RFC does not know yet what the options are, it can propose multiple options. The
-preferred model is to propose one option and to provide alternatives.
+If you have another idea on how to document this, please and an option here.
 
 # Drawbacks
 
-Why should we not do this? What are the drawbacks of this RFC or a particular option if
-multiple options are presented.
+People tend to forget about documentation and then we end up with outdated documentation, which is kind of worse than having no documentation at all.
 
 # Unresolved questions
 
-- What parts of the design do you expect to resolve through this RFC?
-- What issues are out of scope for this RFC but are known?
+- How do we guarantee, that the documentation stays up to date with the implementation?
+- Do we need documentation tied to different versions of SDKs?
+- We should probably add some checks in CI that make sure that code changes need to be documented as well?

From dee357cb8d87267c2b71dec5d866721842a84bfd Mon Sep 17 00:00:00 2001
From: Anton Pirker
Date: Fri, 27 Jan 2023 12:51:38 +0100
Subject: [PATCH 3/6] Added user's IP address separately, because legal says it is PII (because the law is ambiguous but customers mostly treat it as PII)

---
 text/0070-document-sensitive-data-collected.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/text/0070-document-sensitive-data-collected.md b/text/0070-document-sensitive-data-collected.md
index 897b7953..92207151 100644
--- a/text/0070-document-sensitive-data-collected.md
+++ b/text/0070-document-sensitive-data-collected.md
@@ -38,6 +38,7 @@ Here a list of all sensitive data that is collected:
 - HTTP Request Body (`event.request.data`)
 - Log Entry Params (`event.logentry.params`)
 - Logged in User (`event.user`)
+- Users IP address (`event.user`)
 - Breadcrumb Values (`event.breadcrumbs.values -> value.data`)
 - Local vars in Exceptions (`event.exception.values -> value.stacktrace.frames -> frame.vars`)
 - Span Data (`event.spans -> span.data`)

From 986fcfaab8ed2d433be27aa0d3c1df8e9f7a8d4b Mon Sep 17 00:00:00 2001
From: Anton Pirker
Date: Fri, 3 Feb 2023 13:47:14 +0100
Subject: [PATCH 4/6] Update for new two table design

---
 text/0070-document-sensitive-data-collected.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/text/0070-document-sensitive-data-collected.md b/text/0070-document-sensitive-data-collected.md
index 92207151..ecc1cf6a 100644
--- a/text/0070-document-sensitive-data-collected.md
+++ b/text/0070-document-sensitive-data-collected.md
@@ -26,8 +26,13 @@ After a data incident and a meeting with legal, we said that we need to take dat
 
 Have a hand written (and maintained) table in the description that shows people in an easy to grasp way what data is collected. It also shows how the data collection is changed when certain options (like `sendDefaultPII`) are changed.
 
-Here a example on how this could look like:
-https://sentry-docs-git-antonpirker-python-fastapi-sensitive-data.sentry.dev/platforms/python/guides/fastapi/#data-collected
+Here a example on how this could look like. (After talking with our designer Jesse, having two tables makes it way easier to ingest the information)
+
+For issues:
+https://sentry-docs-git-antonpirker-python-fastapi-sensitive-data.sentry.dev/platforms/python/guides/fastapi/#data-collected-by-issue-reporting
+
+And for performance:
+https://sentry-docs-git-antonpirker-python-fastapi-sensitive-data.sentry.dev/platforms/python/guides/fastapi/#data-collected-measuring-performance
 
 The elements in the table can be different for different kinds (frontend, backend, mobile) of SDKS.
 
@@ -42,6 +47,7 @@ Here a list of all sensitive data that is collected:
 - Breadcrumb Values (`event.breadcrumbs.values -> value.data`)
 - Local vars in Exceptions (`event.exception.values -> value.stacktrace.frames -> frame.vars`)
 - Span Data (`event.spans -> span.data`)
+- ... more to be defined ...
 
 Pros:
 

From 056910bd9c6a0c68640c5c632da5083190ec553c Mon Sep 17 00:00:00 2001
From: Anton Pirker
Date: Fri, 10 Feb 2023 14:19:02 +0100
Subject: [PATCH 5/6] Finalized RFC

---
 text/0070-document-sensitive-data-collected.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/text/0070-document-sensitive-data-collected.md b/text/0070-document-sensitive-data-collected.md
index ecc1cf6a..0006e7dc 100644
--- a/text/0070-document-sensitive-data-collected.md
+++ b/text/0070-document-sensitive-data-collected.md
@@ -1,7 +1,7 @@
 - Start Date: 2023-01-27
 - RFC Type: decision
 - RFC PR: https://github.com/getsentry/rfcs/pull/70
-- RFC Status: draft
+- RFC Status: approved
 
 # Summary
 
@@ -20,6 +20,10 @@ This will probably be a big selling point for larger customers.
 
 After a data incident and a meeting with legal, we said that we need to take data issues to the next level.
 
+# Decision
+
+We will start with implementing **Option A)**.
+
 # Options Considered
 
 ## A) Table in docs of each integration
@@ -72,10 +76,6 @@ Cons:
 - Doc strings in code need to be kept up to date.
 - Need to write tooling for exporting doc string from all SDKs to be able to include the generated documentation into docs.sentry.io
 
-## C) \*\*please suggest\*\*
-
-If you have another idea on how to document this, please and an option here.
-
 # Drawbacks
 
 People tend to forget about documentation and then we end up with outdated documentation, which is kind of worse than having no documentation at all.

From 9ce8302aaafa803965a36d66bf02e04bc703223c Mon Sep 17 00:00:00 2001
From: Anton Pirker
Date: Fri, 10 Feb 2023 14:20:58 +0100
Subject: [PATCH 6/6] Added RFC to readme

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index e6b56fd3..bc586f37 100644
--- a/README.md
+++ b/README.md
@@ -27,9 +27,10 @@ This repository contains RFCs and DACIs. Lost?
 - [0042-gocd-succeeds-freight-as-our-cd-solution](text/0042-gocd-succeeds-freight-as-our-cd-solution.md): Plan to replace freight with GoCD
 - [0043-instruction-addr-adjustment](text/0043-instruction-addr-adjustment.md): new StackTrace Protocol field that controls adjustment of the `instruction_addr` for symbolication
 - [0044-heartbeat](text/0044-heartbeat.md): Heartbeat monitoring
-- [0046-ttfd-automatic-transaction-span](text/0046-ttfd-automatic-transaction-span.md): Provide a new `time-to-full-display` span to the automatic UI transactions
+- [0046-ttfd-automatic-transaction-span](text/0046-ttfd-automatic-transaction-span.md): Provide a new `time-to-full-display` span to the automatic UI transactions
 - [0047-introduce-profile-context](text/0047-introduce-profile-context.md): Add Profile Context
 - [0048-move-replayid-out-of-tags](text/0048-move-replayid-out-of-tags.md): Plan to replace freight with GoCD
 - [0062-controlling-pii-and-credentials-in-sd-ks](text/0062-controlling-pii-and-credentials-in-sd-ks.md): Controlling PII and Credentials in SDKs
 - [0063-sdk-crash-monitoring](text/0063-sdk-crash-monitoring.md): SDK Crash Monitoring
+- [0070-document-sensitive-data-collected](text/0070-document-sensitive-data-collected.md): Document sensitive data collected
 - [0071-continue-trace-over-process-boundaries](text/0071-continue-trace-over-process-boundaries.md): Continue trace over process boundaries
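
Note on RFC 0070: the field list under Option A and the open question about CI checks could be sketched roughly as below. This is only an illustration under stated assumptions, not the tooling the RFC proposes. `CollectedField`, `DOCUMENTED_FIELDS`, `render_markdown_table`, and `undocumented_paths` are hypothetical names. The entries are copied from the RFC's list, which the RFC itself marks as incomplete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectedField:
    """One row of the hand-maintained table (hypothetical structure)."""

    label: str       # human-readable name, as written in the RFC
    event_path: str  # where the value lives in the event payload


# Entries copied from the RFC's list; the RFC notes that more are to be defined.
DOCUMENTED_FIELDS = [
    CollectedField("HTTP Headers", "event.request.headers"),
    CollectedField("HTTP Cookies", "event.request.cookies"),
    CollectedField("HTTP Request Body", "event.request.data"),
    CollectedField("Log Entry Params", "event.logentry.params"),
    CollectedField("Logged in User", "event.user"),
    CollectedField("Users IP address", "event.user"),
    CollectedField("Breadcrumb Values", "event.breadcrumbs.values -> value.data"),
    CollectedField(
        "Local vars in Exceptions",
        "event.exception.values -> value.stacktrace.frames -> frame.vars",
    ),
    CollectedField("Span Data", "event.spans -> span.data"),
]


def render_markdown_table(fields):
    """Render the fields as a two-column Markdown table."""
    lines = ["| Data | Event path |", "| --- | --- |"]
    for field in fields:
        lines.append(f"| {field.label} | `{field.event_path}` |")
    return "\n".join(lines)


def undocumented_paths(collected_paths, fields):
    """Return, sorted, the paths an SDK collects that the table does not list."""
    documented = {field.event_path for field in fields}
    return sorted(set(collected_paths) - documented)
```

A CI job could, for instance, fail when `undocumented_paths(...)` returns a non-empty list. How an SDK would report the paths it actually collects is left open here, as it is in the RFC.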