Documentation for Google Cloud Data Loss Prevention #9651
Still some checks to fix :(
I fixed the static checks; one of the integration checks is still failing.
Templates
^^^^^^^^^

Templates can be used to create and persist
configuration information to use with Cloud Data Loss Prevention.
There are two types of templates supported by Cloud DLP:

* `Inspection Template <https://cloud.google.com/dlp/docs/creating-templates-inspect>`__
* `De-Identification Template <https://cloud.google.com/dlp/docs/creating-templates-deid>`__

Here we will be using an identification template for our example.
This information may become outdated in the future, so I would suggest skipping this section and pointing users to the Google docs instead.
I have the impression that there are more DLP operators, but they are not covered in the example DAG. I think we should add the documentation. I'm OK with not extending the DAG itself.
I have added documentation for all the DLP operators. Should I add other DAGs to extend the example?
@OmairK Example tasks for the remaining operators would be helpful.
Hey @OmairK - I re-ran the checks on the latest commit and there are a few errors, but not a mypy one :).
I hope it's helpful. J.
b3b737d to 8c2fde5
Thanks a lot @potiuk, but this wasn't the PR with the mypy errors. I have mentioned you there; sorry for the trouble.
create_info_type = CloudDLPCreateStoredInfoTypeOperator(
    project_id=GCP_PROJECT,
    config=custom_info_types,
    stored_info_type_id=custom_info_type_id,
    dag=dag,
    task_id="create_info_type",
)
This task fails with:
File "/airflow/airflow/providers/google/cloud/hooks/dlp.py", line 432, in create_stored_info_type
metadata=metadata,
File ".virtualenvs/airflow/lib/python3.6/site-packages/google/cloud/dlp_v2/gapic/dlp_service_client.py", line 2625, in create_stored_info_type
parent=parent, config=config, stored_info_type_id=stored_info_type_id
TypeError: Parameter to MergeFrom() must be instance of same class: expected google.privacy.dlp.v2.StoredInfoTypeConfig got list.
I agree
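The `TypeError` above reports that `config` arrived as a list where a `StoredInfoTypeConfig` was expected. A minimal sketch of one possible fix, assuming the client also accepts a dict with the same fields as `StoredInfoTypeConfig`: pass a single mapping rather than a list. The constant `CUSTOM_INFO_TYPE_CONFIG`, its values, and the helper `ensure_stored_info_type_config` are hypothetical and are not part of this PR.

```python
from collections.abc import Mapping

# Hypothetical example values; the word list is illustrative only.
# The keys mirror fields of google.privacy.dlp.v2.StoredInfoTypeConfig.
CUSTOM_INFO_TYPE_CONFIG = {
    "display_name": "Example custom info type",
    "description": "Matches a small, fixed word list",
    "dictionary": {"word_list": {"words": ["example-secret", "sample-token"]}},
}


def ensure_stored_info_type_config(config):
    """Fail early if a dict-style config is not a mapping (hypothetical helper).

    This only checks the dict form of the config; it does not call the DLP API
    and would also reject an actual StoredInfoTypeConfig message object.
    """
    if not isinstance(config, Mapping):
        raise TypeError(
            "config must be a dict-like StoredInfoTypeConfig, got "
            + type(config).__name__
        )
    return config
```

With a check like this in the example DAG, a list-valued `custom_info_types` would be rejected before the task reaches the DLP client.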
create_trigger = CloudDLPCreateJobTriggerOperator(
    project_id=GCP_PROJECT,
    job_trigger=JOB_TRIGGER,
    trigger_id=TRIGGER_ID,
    dag=dag,
    task_id="create_trigger",
)
This task fails with this error:
File "/airflow/airflow/providers/google/cloud/operators/dlp.py", line 433, in execute
metadata=self.metadata,
File "/airflow/airflow/providers/google/common/hooks/base_google.py", line 376, in inner_wrapper
return func(self, *args, **kwargs)
File "/airflow/airflow/providers/google/cloud/hooks/dlp.py", line 375, in create_job_trigger
metadata=metadata,
File "/.virtualenvs/airflow/lib/python3.6/site-packages/google/cloud/dlp_v2/gapic/dlp_service_client.py", line 2557, in create_job_trigger
request, retry=retry, timeout=timeout, metadata=metadata
File "/.virtualenvs/airflow/lib/python3.6/site-packages/google/api_core/gapic_v1/method.py", line 143, in __call__
return wrapped_func(*args, **kwargs)
File "/.virtualenvs/airflow/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
google.api_core.exceptions.InvalidArgument: 400 `StorageConfig` must be set.
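The `400` above says the job trigger was sent without a `StorageConfig`. As a rough sketch, assuming `JOB_TRIGGER` is passed as a dict shaped like the `JobTrigger` message, the `inspect_job` entry would need a `storage_config` key. The bucket URL, the info type, and the one-day schedule are placeholder values, and `has_storage_config` is a hypothetical helper. Other fields may also be required by the API.

```python
# Hypothetical values: the bucket URL and the one-day schedule are placeholders.
JOB_TRIGGER = {
    "inspect_job": {
        # The field the API reported as missing in the traceback above.
        "storage_config": {
            "cloud_storage_options": {"file_set": {"url": "gs://example-bucket/*"}}
        },
        "inspect_config": {"info_types": [{"name": "EMAIL_ADDRESS"}]},
    },
    "triggers": [{"schedule": {"recurrence_period_duration": {"seconds": 86400}}}],
}


def has_storage_config(job_trigger):
    """Return True if the dict-style trigger has a non-empty storage_config."""
    return bool(job_trigger.get("inspect_job", {}).get("storage_config"))
```

A check such as `has_storage_config(JOB_TRIGGER)` in the example DAG would catch the missing field locally instead of at request time.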
@OmairK would you mind rebasing? I'm OK with merging this PR with non-working system tests. We need them to debug the DLP operators that are reported not to work properly.
627998f to 79627d9
@OmairK Can you do a rebase and leave a comment so we get notifications? Without it, we don't know that you've found the time to respond to our request.
@mik-laj I did the rebase. I informed @turbaszek via Slack; I decided against sending two notifications for the same thing.
@OmairK will you be able to fix the static checks?
@turbaszek Done 😅