This tutorial will walk you through the execution of the Data Catalog Fileset Exporter.
This script is a Python CLI. If you want to look at the code, open: .
Otherwise, go to the next step.
Go to the file and find the section "5. Export Filesets to CSV file". This section explains the CSV columns created when the Python CLI is executed.
First, let's set up the Service Account. (You may skip this if you already have a Service Account.)
Start by setting your project ID. Replace the placeholder with your own project ID.
gcloud config set project MY_PROJECT_PLACEHOLDER
Next, load it into an environment variable.
export PROJECT_ID=$(gcloud config get-value project)
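If no project is configured, PROJECT_ID may end up empty, and the later commands would then fail with less obvious errors. A minimal sketch of a guard, assuming a POSIX shell; `require_var` is a hypothetical helper written for this tutorial, not part of gcloud:

```shell
# Hypothetical helper (not part of gcloud): fail early if a variable is empty.
require_var() {
  # $1 is the NAME of the variable to check, for example PROJECT_ID.
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "ERROR: $1 is empty; set your project ID first" >&2
    return 1
  fi
  echo "$1=$value"
}

# Usage: require_var PROJECT_ID
```

Running `require_var PROJECT_ID` before creating the Service Account prints the value when it is set and returns a non-zero status otherwise.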
Then create a Service Account.
gcloud iam service-accounts create datacatalog-fs-exporter-sa \
--display-name "Service Account for Fileset Exporter" \
--project $PROJECT_ID
Next, create a credentials folder where the Service Account key will be saved.
mkdir -p ~/credentials
Next, create and download the Service Account key.
gcloud iam service-accounts keys create "datacatalog-fs-exporter-sa.json" \
--iam-account "datacatalog-fs-exporter-sa@$PROJECT_ID.iam.gserviceaccount.com" \
&& mv datacatalog-fs-exporter-sa.json ~/credentials/datacatalog-fs-exporter-sa.json
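The key file grants access to your project, so it can be worth confirming that it landed in the credentials folder and is readable only by you. A minimal sketch, assuming a POSIX shell; `secure_key_file` is a hypothetical helper, and the `chmod` is only a precaution in case the permissions are wider than expected:

```shell
# Hypothetical helper: confirm the key file exists and make it owner-only.
secure_key_file() {
  key_path="$1"
  if [ ! -s "$key_path" ]; then
    echo "ERROR: key file missing or empty: $key_path" >&2
    return 1
  fi
  chmod 600 "$key_path"   # read/write for the owner only
  echo "OK: $key_path"
}

# Usage: secure_key_file ~/credentials/datacatalog-fs-exporter-sa.json
```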
Next, grant the Data Catalog Admin role to the Service Account.
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member "serviceAccount:datacatalog-fs-exporter-sa@$PROJECT_ID.iam.gserviceaccount.com" \
--quiet \
--project $PROJECT_ID \
--role "roles/datacatalog.admin"
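To confirm that the binding took effect, you can list the roles the Service Account holds on the project. A minimal sketch, assuming `gcloud` is installed and authenticated; `sa_has_role` is a hypothetical helper that wraps `gcloud projects get-iam-policy` and checks its output for an exact role name:

```shell
# Hypothetical helper: succeed if the member holds the given role on the project.
sa_has_role() {
  # $1 = project ID, $2 = Service Account email, $3 = role name
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:$2" \
    --format="value(bindings.role)" | grep -Fxq "$3"
}

# Usage:
#   sa_has_role "$PROJECT_ID" \
#     "datacatalog-fs-exporter-sa@$PROJECT_ID.iam.gserviceaccount.com" \
#     "roles/datacatalog.admin" && echo "role is bound"
```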
Next, point the credentials environment variable at the key file.
export GOOGLE_APPLICATION_CREDENTIALS=~/credentials/datacatalog-fs-exporter-sa.json
Install and configure the datacatalog-fileset-exporter CLI.
pip3 install datacatalog-fileset-exporter --user
Next, add the install location to your PATH.
export PATH=~/.local/bin:$PATH
Next, test it out.
datacatalog-fileset-exporter --help
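If the command above is not found, the install location is most likely missing from PATH. A minimal sketch of a check, assuming a POSIX shell; `check_cli` is a hypothetical helper built on the standard `command -v`:

```shell
# Hypothetical helper: report whether a command can be found on PATH.
check_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $(command -v "$1")"
  else
    echo "not found on PATH: $1" >&2
    return 1
  fi
}

# Usage: check_cli datacatalog-fileset-exporter
```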
Run the Python CLI:
Create an output folder:
mkdir -p ~/output
Export the filesets of your project to a CSV file:
datacatalog-fileset-exporter filesets export --project-ids $PROJECT_ID --file-path ~/output/filesets.csv
Let's see the output:
cat ~/output/filesets.csv
Use the Cloud Editor to see the file, or upload it to Google Sheets to better visualize it.
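For a quick look without leaving the terminal, the header and the number of data rows are often enough. A minimal sketch, assuming a POSIX shell and a CSV with one record per line that ends with a newline; `csv_summary` is a hypothetical helper and makes no assumption about the column names:

```shell
# Hypothetical helper: print the header line and a naive data-row count.
csv_summary() {
  csv_path="$1"
  if [ ! -s "$csv_path" ]; then
    echo "ERROR: missing or empty: $csv_path" >&2
    return 1
  fi
  echo "header: $(head -n 1 "$csv_path")"
  # Naive count: assumes no newlines inside quoted fields.
  echo "data rows: $(($(wc -l < "$csv_path") - 1))"
}

# Usage: csv_summary ~/output/filesets.csv
```

The row count is only approximate if any field contains an embedded newline; a spreadsheet or a real CSV parser is the safer choice in that case.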
You've successfully finished the Data Catalog Fileset Exporter Tutorial.