The Masking Package is a Python library designed to provide various masking operations on data columns. It supports hashing, fake data generation, and other masking techniques to ensure data privacy and security.
- Development:
To install the Masking Package, use the following command:
pip install masking@git+"https://github.com/d-one/masking.git"
Here is a basic example of how to use the package
import pandas as pd
from masking.mask.pipeline import MaskColumnPipeline
# Sample DataFrame
data = {'name': ['Alice', 'Bob', 'Charlie'], 'ssn': ['123-45-6789', '987-65-4321', '555-55-5555']}
df = pd.DataFrame(data)
# Configuration for masking
config = {
'masking': 'hash',
'config': { "col_name": "ssn", "secret": "my_secret" }
}
# Create and apply the masking pipeline
pipeline = MaskColumnPipeline(column='ssn', config=config)
masked_df = pipeline(df)
print(masked_df)