Idea from pgantomizer - load dump into local dev db, anonymize it on the fly without parsing #7

Open
zealot128 opened this issue Sep 18, 2024 · 1 comment

Comments

@zealot128

zealot128 commented Sep 18, 2024

Thanks for your talk yesterday!

I just found this Python tool, which has a similar goal to your tool:
https://github.com/asgeirrr/pgantomizer/tree/master/pgantomizer

But their approach is more like your first approach, "Read + Update", I guess?

  • They can load a dump.sql into a (local) PG database, connect to it, and then run queries to extract table information, etc., and generate an anonymized version from it, instead of working on the dump itself.
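
A minimal sketch of that "Read + Update" flow, for illustration only. It uses Python's built-in `sqlite3` as a stand-in for the local PG database so it runs without a server; the `DUMP_SQL` contents, the `users` table, and the `load_and_anonymize` helper are hypothetical and are not taken from pgantomizer.

```python
import sqlite3

# Hypothetical dump contents. A real dump.sql would come from pg_dump;
# SQLite is only a stand-in for the local PostgreSQL database here.
DUMP_SQL = """
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT);
INSERT INTO users (id, email, name) VALUES (1, 'alice@corp.test', 'Alice');
INSERT INTO users (id, email, name) VALUES (2, 'bob@corp.test', 'Bob');
"""


def load_and_anonymize(dump_sql):
    """Load a dump unchanged, inspect the schema via queries, update in place."""
    conn = sqlite3.connect(":memory:")

    # Step 1: load the dump as-is, without parsing or rewriting it.
    conn.executescript(dump_sql)

    # Step 2: ask the database for table and column information.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    columns = {
        table: [info[1] for info in conn.execute(f"PRAGMA table_info({table})")]
        for table in tables
    }

    # Step 3: anonymize with plain UPDATE statements on the loaded data.
    if "email" in columns.get("users", []):
        conn.execute("UPDATE users SET email = 'user' || id || '@example.com'")
    conn.commit()
    return conn, columns


conn, columns = load_and_anonymize(DUMP_SQL)
for row in conn.execute("SELECT id, email, name FROM users ORDER BY id"):
    print(row)
```

The point of the sketch is that the dump file is never parsed: the database does the loading, and anonymization happens through ordinary queries afterwards.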

Their config format also has some more options, which might be relevant to what we discussed for your main Masking Gem yesterday:

  • each column has several options:
    • aggregate_length - replaces content of the column with its length (can be used on any type that supports length function)
    • clear - simply nulls out the value (whatever DB constraints still apply)
    • example_email - replaces the value with an @example.com address based on the primary key value
    • md5 - alternative to default TEXT handling, useful for creating variance aside default handling while also guaranteeing value uniqueness
    • x_out - converts a string's alphanumeric characters to X's, retaining length
  • truncate table - Great idea! For example, analytics/usage data might not be relevant in some dev scenarios
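
To make the column options listed above concrete, here is a rough sketch of what each one could do to a single value. This is not pgantomizer's code (which applies its rules inside the database); the function names mirror the option names, and the `anonymize_row` helper, the `rules` mapping, and the exact `example_email` format are assumptions for illustration.

```python
import hashlib


def aggregate_length(value):
    # Replace the content with its length.
    return len(value)


def clear(value):
    # Null out the value (DB constraints would still apply on write).
    return None


def example_email(primary_key):
    # Assumed format: "<primary key>@example.com".
    return f"{primary_key}@example.com"


def md5(value):
    # Hash the value; equal inputs map to equal outputs.
    return hashlib.md5(str(value).encode("utf-8")).hexdigest()


def x_out(value):
    # Replace alphanumeric characters with "X", keeping length and punctuation.
    return "".join("X" if ch.isalnum() else ch for ch in value)


STRATEGIES = {
    "aggregate_length": aggregate_length,
    "clear": clear,
    "md5": md5,
    "x_out": x_out,
}


def anonymize_row(row, rules, pk="id"):
    """Apply a hypothetical {column: option} mapping to one row dict."""
    result = dict(row)
    for column, option in rules.items():
        if option == "example_email":
            result[column] = example_email(row[pk])
        else:
            result[column] = STRATEGIES[option](row[column])
    return result


row = {"id": 7, "email": "alice@corp.test", "name": "Alice", "phone": "555-0100"}
rules = {"email": "example_email", "name": "md5", "phone": "x_out"}
print(anonymize_row(row, rules))
```

A table-level `truncate` option would sit outside this per-column mapping, since it drops all rows rather than transforming values.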
@zealot128 zealot128 changed the title Idea from pgantomizer Idea from pgantomizer - load dump into local dev db, anonymize it on the fly without parsing Sep 18, 2024
@kibitan
Owner

kibitan commented Sep 18, 2024

@zealot128 Thank you for letting me know! And I'm happy to hear that you liked the talk :)
Interesting project. Yes, I also guess it is the "Read + Update" approach. And as you suggest, the configuration options look interesting; I'll take a closer look later. Thanks!!
