[Feature Request]: obfuscation of values that is PII or PCI data #2116

32bit · 2022-04-22T20:00:22Z

🔎 Search Terms

obfuscating log data for PII or PCI

The vision

Provide a feature to obfuscate or replace the PII/PCI data so it won't get logged.

Use case

To avoid writing custom code that maintains a list of fields to strip out, would it be possible to add a "filter" that removes or obfuscates fields, perhaps in a JSON payload?

Additional information

NA

wbt · 2022-04-28T16:58:35Z

How do you identify what data is PII/PCI? I would suggest just not passing that data to your logging functions, as what constitutes PII/PCI is pretty application-specific. The custom formatting extensibility might also be a good place to tap into: you could put a function there which scans for patterns you consider PII/PCI and replaces them with whatever else you want.
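To make the custom-format suggestion concrete, here is a minimal sketch of a function that scans a log message for patterns and replaces them. The pattern list, the `LogInfo` shape, and the names `redact` and `redactTransform` are assumptions for illustration, not part of winston; the two regexes are deliberately simplified and would need tuning for a real application.

```typescript
// Hypothetical pattern list: what counts as PII/PCI is application-specific.
const PATTERNS: { name: string; regex: RegExp }[] = [
  // 13 to 16 digits, optionally separated by spaces or dashes (card-like numbers)
  { name: "card", regex: /\b\d(?:[ -]?\d){12,15}\b/g },
  // Simplified email matcher, not RFC-complete
  { name: "email", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
];

// Replace every match of every pattern with a fixed placeholder.
function redact(message: string, placeholder: string = "[REDACTED]"): string {
  return PATTERNS.reduce(
    (text, { regex }) => text.replace(regex, placeholder),
    message
  );
}

// Assumed shape of a log entry: a level, a message, and arbitrary metadata.
interface LogInfo {
  level: string;
  message: string;
  [key: string]: unknown;
}

// Returns a copy of the entry with the message scrubbed; metadata is untouched.
function redactTransform(info: LogInfo): LogInfo {
  return { ...info, message: redact(info.message) };
}

console.log(
  redactTransform({
    level: "info",
    message: "card 4111 1111 1111 1111 for jane@example.com",
  }).message
);
```

In winston, a transform like this could presumably be wrapped with `winston.format(...)` and added to the logger's format chain, so that scrubbing happens before any transport sees the entry.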

escodel · 2023-10-28T01:52:03Z

Came across this feature request and had a working idea for a solution. Custom formatting makes sense, but perhaps packaged almost like a plugin for this type of data, with some options?

If this issue is still relevant, and contributions are welcome, I could try to turn the proof of concept into a PR.

DABH · 2023-10-28T02:14:23Z

Contributions are always welcome! The question would be whether something belongs in Winston itself or e.g. as a separate transport. We have a lot of transports in the ecosystem but there isn’t really a standard for or plugin repository of formatters. Perhaps something like this could live under examples? Or we should have an examples-like folder of useful formatters people have written? Open to ideas on how we’d best capture that kind of community knowledge somewhere people could find it.

escodel · 2023-10-28T02:43:43Z

Makes sense, thanks! I could at least provide the example formatting and it could be grouped with other useful formats.

Thinking about it from the API payload/response scenario, one approach could be to add a flag similar to `private: true`, but for masking, and to pass options along with it.

I'll take a closer look at the transports and see how to integrate it as part of a formatting example.
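A rough sketch of what field-level masking with options could look like, for payloads such as API requests and responses. Everything here is hypothetical: the `MaskOptions` shape and the `maskFields` and `maskValue` names are made up for illustration and are not taken from winston or logform.

```typescript
// Hypothetical option shape; none of these names come from winston or logform.
interface MaskOptions {
  fields: string[];   // field names to mask, matched case-insensitively
  maskChar?: string;  // character used for masking, default "*"
  keepLast?: number;  // trailing characters left visible, default 0
}

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function maskValue(value: Json, maskChar: string, keepLast: number): string {
  // Nested objects and arrays get a fixed-width mask so their size is not leaked.
  if (value !== null && typeof value === "object") return maskChar.repeat(8);
  const text = String(value);
  // Values no longer than keepLast are masked completely instead of revealed.
  const shown = keepLast < text.length ? keepLast : 0;
  const visible = shown > 0 ? text.slice(-shown) : "";
  return maskChar.repeat(text.length - shown) + visible;
}

// Returns a masked deep copy; the input object is not mutated.
function maskFields(input: Json, options: MaskOptions): Json {
  const { fields, maskChar = "*", keepLast = 0 } = options;
  const targets = new Set(fields.map((f) => f.toLowerCase()));

  const walk = (node: Json): Json => {
    if (Array.isArray(node)) return node.map(walk);
    if (node !== null && typeof node === "object") {
      const out: { [key: string]: Json } = {};
      for (const [key, value] of Object.entries(node)) {
        out[key] = targets.has(key.toLowerCase())
          ? maskValue(value, maskChar, keepLast)
          : walk(value);
      }
      return out;
    }
    return node;
  };
  return walk(input);
}

const payload: Json = {
  user: "jane",
  card: { number: "4111111111111111", cvv: 123 },
};
console.log(JSON.stringify(maskFields(payload, { fields: ["number", "cvv"], keepLast: 4 })));
```

Matching on field names is cheap and predictable, but it only helps when sensitive values live under known keys; free-text messages would still need pattern-based scanning.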

wbt · 2023-11-08T15:46:43Z

The example PR above is merged. I would still be supportive of a PR that turns on, at first optionally and by default in the next breaking-change version, some reasonable secret obfuscator, ideally drawing on something widely used elsewhere (by GitHub itself?) instead of reinventing something to be maintained separately.

escodel · 2023-11-08T16:16:27Z

@wbt right on, that would be super useful. I'll look at some examples and take a shot at a feature.

escodel · 2023-11-13T03:22:36Z

> The example PR above is merged. I would still be supportive of a PR that turns on, at first optionally and by default in the next breaking-change version, some reasonable secret obfuscator, ideally drawing on something widely used elsewhere (by GitHub itself?) instead of reinventing something to be maintained separately.

So I've been thinking about this feature but want to stay on the right track.

I'm looking at the problem of doing this with near-instant logging, whereas the secret-detection services (such as GitHub's) seem to be aimed at scanning git repos and history for hard-coded tokens, running pre-commit hooks, sending matches to an API for verification, and so on.

My thinking is to use common regex patterns internally for performance, and maybe to target common field names in dynamic input, as in the formatting example from the PR.

If I follow what you're saying, should the source for those patterns be an existing outside library/service? I'm also trying to work out config options for including or excluding certain patterns.

I might not be seeing the full picture yet so any guidance would be awesome. Thanks!
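One way the include/exclude idea could be sketched is a registry of named patterns that are filtered and compiled once, so the per-log-call cost is only the regex replacements. The pattern set, the `ScrubOptions` shape, and `buildScrubber` are assumptions for illustration; the regexes are simplified and are not drawn from any existing secret-scanning library.

```typescript
// Hypothetical built-in patterns, keyed by name so callers can opt in or out.
const BUILT_IN: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
  bearer: /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g,
};

interface ScrubOptions {
  include?: string[];             // if set, only these pattern names run
  exclude?: string[];             // pattern names to skip
  extra?: Record<string, RegExp>; // caller-supplied patterns
}

// Resolve the active patterns once, then return a cheap per-message function.
function buildScrubber(options: ScrubOptions = {}): (text: string) => string {
  const { include, exclude = [], extra = {} } = options;
  const compiled = Object.entries({ ...BUILT_IN, ...extra })
    .filter(([name]) => (include ? include.includes(name) : true))
    .filter(([name]) => !exclude.includes(name))
    .map(([name, re]): [string, RegExp] => {
      // Force the global flag so every occurrence is replaced, not just the first.
      const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
      return [name, new RegExp(re.source, flags)];
    });

  // Each match is replaced by its pattern name in brackets, e.g. "[ssn]".
  return (text) =>
    compiled.reduce((acc, [name, re]) => acc.replace(re, `[${name}]`), text);
}

const scrub = buildScrubber({ exclude: ["email"] });
console.log(scrub("ssn 123-45-6789 mail a@b.co"));
```

If the patterns were sourced from an outside library instead, only the contents of `BUILT_IN` would need to change; the include/exclude filtering would stay the same.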

escodel · 2023-11-18T21:46:05Z

Bumping for @wbt and any maintainers for input on the above, thanks 🙏

wbt · 2023-12-06T18:09:36Z

It seems like a reasonable approach. My main point is that we should try to avoid reinventing the wheel (and having to maintain the reinvention) to whatever extent things have already been done.

32bit added the Feature Request and Needs Investigation labels on Apr 22, 2022

escodel mentioned this issue Oct 29, 2023

Adding mask formatting example winstonjs/logform#287

Merged
