Registering Data Stored in Relational Tables
Organizations with data in RDS can make that data discoverable and add descriptive information by registering metadata in the Herd repository. Data residing in any RDS instance can be registered. Registration is a self-service activity that teams perform via the Herd REST APIs. This document describes in detail how to register Relational Tables in Herd.
Initial Herd use cases focused on managing data in S3. Herd now also supports registering data stored in RDS, and many of the core principles are the same:
- Capture technical metadata so consumers can locate, confirm availability, and access the data with the tools of their choice. Technical metadata includes formats, lineage, and physical location.
- Capture business metadata so business and technical users can locate and understand the data. Business metadata includes object descriptions, column descriptions, tagging for categorization, and subject-matter experts (SMEs).
- Manage metadata: capture audit information, provide authorization controls for metadata access, and satisfy corporate records requirements.
Registering Relational Tables in Herd involves the following steps:
- Register Herd Storage that references the RDS Host
- For each Table, call Herd Relational Table Registration
Detailed information about each step follows.
Herd Relational Table Registration uses a Herd Storage to represent an RDS host. Use Storage Post to create a new Storage. Set the following Storage Attribute values with information that helps Herd access schema information from the RDS host.
Attribute Key | Description | Example Value |
---|---|---|
jdbc.url | Used to connect to the RDS host. Provide entire jdbc url including host, port, database name | jdbc:postgresql://somehostname.us-east-1.rds.amazonaws.com:5432/somedbname |
jdbc.username | Username to connect to the RDS host | someuserName |
jdbc.user.credential.name | A key used to retrieve a password from JCredstash. Store your password in credstash and enter the credstash key as this attribute value | somekey |
Herd uses the connection information and credentials specified in the Storage to connect to the RDS host and read schema information. The user specified in jdbc.username must have permission to read schema information from the RDS host.
Register a Relational Table in Herd by calling the Relational Table Registration Post endpoint.
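The Storage attributes in the table above could be assembled into a Storage Post body along the following lines. This is a minimal sketch, not Herd's own tooling: the function names are hypothetical, the element names (`storageCreateRequest`, `storagePlatformName`, `attributes`) are assumptions to verify against the herd API documentation, and the storage platform value is left as a caller-supplied placeholder because this page does not state it. The URL check only fits `jdbc:<driver>://host:port/database` style URLs such as the PostgreSQL example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def check_jdbc_url(jdbc_url: str) -> None:
    """Raise ValueError unless the URL names a host, a port, and a database."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Drop the 'jdbc:' prefix so the rest parses like an ordinary URL.
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    if not parts.hostname or parts.port is None or not parts.path.strip("/"):
        raise ValueError("jdbc.url must include host, port, and database name")


def build_storage_request(name, platform, jdbc_url, username, credential_key):
    """Return an XML string for a Storage Post body (element names assumed)."""
    check_jdbc_url(jdbc_url)
    root = ET.Element("storageCreateRequest")
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "storagePlatformName").text = platform
    attributes = ET.SubElement(root, "attributes")
    # The three attribute keys come from the table in this document.
    for key, value in (
        ("jdbc.url", jdbc_url),
        ("jdbc.username", username),
        ("jdbc.user.credential.name", credential_key),
    ):
        attribute = ET.SubElement(attributes, "attribute")
        ET.SubElement(attribute, "name").text = key
        ET.SubElement(attribute, "value").text = value
    return ET.tostring(root, encoding="unicode")
```

Only the credstash key is placed in the request; the password itself stays in credstash.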
This endpoint performs the following:
- Creates BDef (or optionally references existing BDef)
- Creates Format with FileType = RELATIONAL_TABLE
- Reads schema from RDS and populates Herd Format with columns, datatypes
- Creates a single BData that represents the data stored in RDS
Example request for Relational Table Registration Post:
<relationalTableRegistrationCreateRequest>
<namespace>Namespace_for_Data</namespace>
<businessObjectDefinitionName>Some_BDef</businessObjectDefinitionName>
<businessObjectFormatUsage>Source</businessObjectFormatUsage>
<dataProviderName>External</dataProviderName>
<relationalSchemaName>db_schema_name</relationalSchemaName>
<relationalTableName>db_table_name</relationalTableName>
<storageName>Herd_Storage_for_RDS</storageName>
</relationalTableRegistrationCreateRequest>
Important Note - The relationalSchemaName and relationalTableName values are both case-sensitive.
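When several tables in the same schema need registering, the request body above can be generated once per table. The sketch below is illustrative only: the helper names are hypothetical, and treating all seven elements from the example as required is an assumption, not something this page states. Schema and table names are passed through unchanged, because both values are case-sensitive.

```python
import xml.etree.ElementTree as ET

# Element names and order taken from the example request in this document.
FIELDS = (
    "namespace",
    "businessObjectDefinitionName",
    "businessObjectFormatUsage",
    "dataProviderName",
    "relationalSchemaName",
    "relationalTableName",
    "storageName",
)


def build_registration_request(**values: str) -> str:
    """Return the XML body for one Relational Table Registration Post."""
    missing = [field for field in FIELDS if not values.get(field)]
    if missing:
        raise ValueError("missing values for: " + ", ".join(missing))
    root = ET.Element("relationalTableRegistrationCreateRequest")
    for field in FIELDS:
        # No case folding: the text is written exactly as supplied.
        ET.SubElement(root, field).text = values[field]
    return ET.tostring(root, encoding="unicode")


def build_requests_for_tables(common: dict, schema: str, tables: list) -> list:
    """Build one request body per table, sharing the common values."""
    return [
        build_registration_request(
            **common, relationalSchemaName=schema, relationalTableName=table
        )
        for table in tables
    ]
```

Each returned string is intended as the body of a separate Relational Table Registration Post call. In this sketch `common` holds the values that do not vary by table; `businessObjectDefinitionName` is among them, so supply it per table instead if each table should map to its own BDef.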
Once your Relational Table is registered, you can:
- Locate the Data Entity by searching Herd-UI
- Add business metadata using the Herd-UI or REST endpoints as described in Populating Business Metadata