
Seed from Production Anonymization

Jon P Smith edited this page Apr 22, 2019 · 3 revisions

Overview

You can optionally anonymise names, emails, addresses, etc. that were in your production database, so that no personal data ends up in your unit tests. This can be done during the extract stage, so all personal data is removed before it is stored as a JSON file.

You specify which class and property you want anonymised via the DataResetterConfig's AddToAnonymiseList method. The example lines below replace the var resetter = new DataResetter(context); line in the non-anonymise extract version:

//... first part same as non-anonymise extract stage
var config = new DataResetterConfig();
config.AddToAnonymiseList<Author>(x => x.Name, "FullName");
var resetter = new DataResetter(context, config);
//... last part same as non-anonymise extract stage

This will anonymise the property called Name in every instance of the Author class in the data provided. NOTE: this process uses Reflection, so it can anonymise properties which have private setters.
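To illustrate the Reflection point, here is a minimal, self-contained sketch of how a property with a private setter can be overwritten. This is not the library's actual code: the Author class and the ReflectionSketch helper are hypothetical names made up for this example, and it only uses standard System.Reflection calls.

```csharp
using System;
using System.Reflection;

// Hypothetical entity for illustration only; not the Author class used in the library's tests.
public class Author
{
    public Author(string name) { Name = name; }

    // The setter is private, so normal code outside the class cannot assign to Name.
    public string Name { get; private set; }
}

// Hypothetical helper showing the general Reflection technique.
public static class ReflectionSketch
{
    // Overwrites a string property by name, even when its setter is private.
    public static void SetStringProperty(object instance, string propertyName, string newValue)
    {
        PropertyInfo property = instance.GetType().GetProperty(
            propertyName,
            BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
        if (property == null)
            throw new ArgumentException($"No property called {propertyName}.");

        // nonPublic: true returns the set accessor even if it is private.
        MethodInfo setter = property.GetSetMethod(nonPublic: true);
        if (setter == null)
            throw new InvalidOperationException($"{propertyName} has no setter.");

        setter.Invoke(instance, new object[] { newValue });
    }
}
```

For example, calling ReflectionSketch.SetStringProperty(author, "Name", "anon") replaces the name even though author.Name = "anon" would not compile outside the Author class.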

The DataResetterConfig class contains a simple anonymisation function that uses GUID strings, but you can replace it with your own anonymisation function by setting the DataResetterConfig's AnonymiserFunc property to your function. Here is how you would do this:

var myAnonymiser = new MyAnonymiser(42);
var config = new DataResetterConfig
{
    AnonymiserFunc = myAnonymiser.AnonymiseThis
};

For an example of this, look at the TestDataResetter.ExampleSetupAndSeed unit tests.

Writing your own AnonymiserFunc

In the [TestDataResetter.ExampleSetupAndSeed](https://github.com/JonPSmith/EfCore.TestSupport/blob/master/Test/UnitTests/TestDataResetter/ExampleSetupAndSeed.cs) unit test class I create a replacement AnonymiserFunc that uses the DotNetRandomNameGenerator NuGet package. This allows you to use real, but random names (and places). I suggest this code is a good place to start.

Note that the AnonymiserFunc has two parameters:

  1. The AnonymiserData class, which provides certain data. The most important is the ReplacementType property, which holds what type of string you would like. This can be any string not containing a colon (:). You can use anything, but here are the ones I use: "FullName", "FirstName", "LastName", "Email", "Address1" etc.
  2. object objectInstance. This is provided in case you need to align data in certain classes, e.g. you want the "FullName" to match the first part of the "Email". You would need to cast the object to the class where you want to do this and manually synchronize the properties.
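The two parameters above could be used along the following lines. This is only a sketch under several assumptions: the AnonymiserData class here is a stand-in that models just the ReplacementType property, the Author entity is hypothetical, the string return type is assumed, and small hard-coded name lists are used instead of the DotNetRandomNameGenerator package so that the example stays self-contained. It also assumes that Name has already been anonymised by the time Email is processed, which you would need to check for your own classes.

```csharp
using System;

// Stand-in for the library's AnonymiserData; only ReplacementType is modelled here.
public class AnonymiserData
{
    public string ReplacementType { get; set; }
}

// Hypothetical entity used to show the objectInstance cast.
public class Author
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public class MyAnonymiser
{
    private static readonly string[] FirstNames = { "Alex", "Sam", "Kim", "Jo" };
    private static readonly string[] LastNames = { "Smith", "Jones", "Brown", "Taylor" };
    private readonly Random _random;

    // The same seed produces the same sequence of picks.
    public MyAnonymiser(int seed) { _random = new Random(seed); }

    // Assumed shape: takes the two parameters described above and returns the replacement string.
    public string AnonymiseThis(AnonymiserData data, object objectInstance)
    {
        switch (data.ReplacementType)
        {
            case "FirstName": return Pick(FirstNames);
            case "LastName": return Pick(LastNames);
            case "FullName": return $"{Pick(FirstNames)} {Pick(LastNames)}";
            case "Email":
                // Align the email with the name held by this instance.
                // Assumption: Name was anonymised before Email is processed.
                if (objectInstance is Author author && !string.IsNullOrEmpty(author.Name))
                    return author.Name.Replace(' ', '.').ToLowerInvariant() + "@example.com";
                return Guid.NewGuid().ToString("N") + "@example.com";
            default:
                // Fall back to a GUID string for any other replacement type.
                return Guid.NewGuid().ToString("N");
        }
    }

    private string Pick(string[] values) => values[_random.Next(values.Length)];
}
```

An instance created with new MyAnonymiser(42) exposes an AnonymiseThis method that matches the AnonymiserFunc assignment shown earlier on this page.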