Examples of PII and PHI redaction in Databricks using the private-ai.com API

mikeyoungyoung/privateai_databricks

privateai_databricks

Examples of PII and PHI redaction in Databricks using the private-ai.com API

Limitations and Background

PrivateAI PII/PHI NER models work best when entities have surrounding context. For example, "My SIN is 991 834 988" returns "My SIN is [SIN_1]", but passing in just "991 834 988" may return "[PHONENUMBER_1]". To use the PrivateAI APIs on short-form text, it is better to pass the column name followed by the value, for example "SIN 901 934 092".
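The column-name prefix described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: with_column_context and redact_rows are hypothetical helper names, and the redact argument is a stand-in for whatever function actually calls the redaction API.

```python
from typing import Callable, Dict, List


def with_column_context(column_name: str, value: str) -> str:
    """Prefix a short value with its column name, e.g. 'SIN 901 934 092'."""
    return f"{column_name} {value}".strip()


def redact_rows(
    rows: List[Dict[str, str]],
    columns: List[str],
    redact: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Return copies of the rows with the named columns passed through `redact`.

    `redact` is a hypothetical stand-in for the function that sends text to
    the redaction API. It receives the column name and value as one string.
    """
    redacted = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        for col in columns:
            if row.get(col):  # skip missing or empty values
                new_row[col] = redact(with_column_context(col, row[col]))
        redacted.append(new_row)
    return redacted


# Example with a dummy redactor that only records what it was sent.
sent = []
rows = [{"name": "row 1", "SIN": "901 934 092"}]
result = redact_rows(rows, ["SIN"], lambda text: sent.append(text) or "[REDACTED]")
```

In a Databricks notebook the same idea would more likely be expressed as a column expression that concatenates a literal column name with the value before the API call. The pure-Python version is used here so it runs anywhere.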

However, there are obviously easier ways to handle columns dedicated to storing individual PII/PHI tokens: dropping or tokenizing the column may be the most appropriate approach.
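As a rough sketch of those two alternatives, the snippet below drops a dedicated PII column or replaces its values with keyed hashes. The function names, the use of HMAC-SHA256 as the tokenization scheme, and the example key are all assumptions made for illustration. The source does not specify how tokenization should be done.

```python
import hashlib
import hmac
from typing import Dict, Iterable


def drop_columns(row: Dict[str, str], columns: Iterable[str]) -> Dict[str, str]:
    """Return a copy of the row without the named columns."""
    excluded = set(columns)
    return {key: value for key, value in row.items() if key not in excluded}


def tokenize_value(value: str, secret_key: bytes) -> str:
    """Replace a value with a deterministic keyed hash (HMAC-SHA256, hex).

    The same input and key always give the same token, so joins on the
    tokenized column still work. The key must be kept secret.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for the example only; load a real key from a secret store.
example_key = b"example-only-key"
row = {"name": "row 1", "SIN": "901 934 092"}

dropped = drop_columns(row, ["SIN"])
tokenized = dict(row, SIN=tokenize_value(row["SIN"], example_key))
```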

TO-DO

  • Add details on how to fetch a file from a URI/URL stored in a column
