
Registering Data Stored in Relational Tables

Nate Weisz edited this page Feb 25, 2019 · 1 revision

Purpose

Organizations with data in RDS can make their data discoverable and add descriptive information by registering metadata in the Herd repository. Organizations can register the data that resides in any RDS instance. Registering with Herd is a self-service activity that teams can perform themselves via Herd REST APIs. This document contains detailed information on how to register Relational Tables in Herd.

Principles

Initial Herd use cases focused on managing data in S3. Herd now also supports registering data stored in RDS, and many of the core principles are the same:

  • Capture technical metadata so consumers can locate, confirm availability, and access the data with the tools of their choice. Technical metadata includes formats, lineage, and physical location.
  • Capture business metadata so business and technical users can locate and understand the data. Business metadata includes object descriptions, column descriptions, tagging for categorization, and subject matter experts (SMEs).
  • Manage metadata: capture audit information, provide authorization controls for metadata access, and satisfy corporate records requirements.

Steps to register a Relational Table

Registering Relational Tables in Herd involves the following steps:

  • Register Herd Storage that references the RDS Host
  • For each Table, call Herd Relational Table Registration

Detailed information about each step follows.

Register Herd Storage

Herd Relational Table Registration uses a Herd Storage to represent an RDS host. Use Storage Post to create a new Storage. Set the following Storage Attribute values with information that helps Herd access schema information from the RDS host.

| Attribute Key | Description | Example Value |
| --- | --- | --- |
| jdbc.url | Used to connect to the RDS host. Provide entire jdbc url including host, port, database name | jdbc:postgresql://somehostname.us-east-1.rds.amazonaws.com:5432/somedbname |
| jdbc.username | Username to connect to the RDS host | someuserName |
| jdbc.user.credential.name | A key used to retrieve a password from JCredstash. Store your password in credstash and enter the credstash key as this attribute value | somekey |
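
Before creating the Storage, it can help to sanity-check the three attribute values locally. The following is a minimal Python sketch, not part of Herd: the function name `check_storage_attributes` is hypothetical, and the URL check assumes a PostgreSQL-style JDBC URL like the example above (other drivers use different URL shapes and would need different checks).

```python
from urllib.parse import urlparse

# Attribute keys taken from the table above.
REQUIRED_KEYS = ("jdbc.url", "jdbc.username", "jdbc.user.credential.name")


def check_storage_attributes(attributes):
    """Return a list of problems found in a Storage attribute mapping.

    Hypothetical helper for illustration only; Herd does not provide it.
    """
    problems = [f"missing attribute: {key}" for key in REQUIRED_KEYS
                if not attributes.get(key)]
    url = attributes.get("jdbc.url", "")
    if url:
        if not url.startswith("jdbc:"):
            problems.append("jdbc.url must start with 'jdbc:'")
        else:
            # Strip the 'jdbc:' prefix so the rest parses like an ordinary URL.
            parsed = urlparse(url[len("jdbc:"):])
            try:
                port = parsed.port
            except ValueError:
                # urlparse raises ValueError for a non-numeric port.
                port = None
            if not parsed.hostname:
                problems.append("jdbc.url has no host")
            if port is None:
                problems.append("jdbc.url has no port")
            if not parsed.path.strip("/"):
                problems.append("jdbc.url has no database name")
    return problems


example = {
    "jdbc.url": "jdbc:postgresql://somehostname.us-east-1.rds.amazonaws.com:5432/somedbname",
    "jdbc.username": "someuserName",
    "jdbc.user.credential.name": "somekey",
}
print(check_storage_attributes(example))
```

An empty list means the values at least have the expected shape. It does not confirm that Herd can reach the host or that the credstash key resolves to a valid password.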

Ensure Herd has access to the RDS host

Herd uses the connection information and credentials specified in the Storage to connect to the RDS host and read schema information. The specified user must have permission to read schema information from the RDS host.

Register Relational Table

Register a Relational Table in Herd by calling the Relational Table Registration Post endpoint.

This endpoint performs the following:

  1. Creates a Business Object Definition (BDef), or optionally references an existing BDef
  2. Creates a Format with FileType = RELATIONAL_TABLE
  3. Reads the schema from RDS and populates the Herd Format with columns and data types
  4. Creates a single BData that represents the data stored in RDS

Example request for Relational Table Registration Post

<relationalTableRegistrationCreateRequest>
     <namespace>Namespace_for_Data</namespace>
     <businessObjectDefinitionName>Some_BDef</businessObjectDefinitionName>
     <businessObjectFormatUsage>Source</businessObjectFormatUsage>
     <dataProviderName>External</dataProviderName>
     <relationalSchemaName>db_schema_name</relationalSchemaName>
     <relationalTableName>db_table_name</relationalTableName>
     <storageName>Herd_Storage_for_RDS</storageName>
</relationalTableRegistrationCreateRequest>

Important Note - The relationalSchemaName and relationalTableName values are both case-sensitive.
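
When registering many tables, building the request body programmatically avoids hand-editing mistakes such as an unclosed tag. The following is a minimal Python sketch using only the standard library. The function name `build_registration_request` is hypothetical, and treating every element from the example request as required is an assumption of this sketch rather than a documented Herd rule.

```python
import xml.etree.ElementTree as ET

# Element names, in the order used by the example request above.
FIELDS = (
    "namespace",
    "businessObjectDefinitionName",
    "businessObjectFormatUsage",
    "dataProviderName",
    "relationalSchemaName",
    "relationalTableName",
    "storageName",
)


def build_registration_request(**values):
    """Build the XML body for a Relational Table Registration request.

    Hypothetical helper for illustration only; it assumes every field
    in FIELDS must be supplied.
    """
    missing = [name for name in FIELDS if not values.get(name)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    root = ET.Element("relationalTableRegistrationCreateRequest")
    for name in FIELDS:
        # Values are copied verbatim, with no case folding, because the
        # schema and table names are case-sensitive.
        ET.SubElement(root, name).text = values[name]
    return ET.tostring(root, encoding="unicode")


body = build_registration_request(
    namespace="Namespace_for_Data",
    businessObjectDefinitionName="Some_BDef",
    businessObjectFormatUsage="Source",
    dataProviderName="External",
    relationalSchemaName="db_schema_name",
    relationalTableName="db_table_name",
    storageName="Herd_Storage_for_RDS",
)
print(body)
```

The resulting string can be sent as the body of the Relational Table Registration Post call with whichever HTTP client your team already uses.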

After registering Relational Tables

Once your Relational Table is registered, you can:

  • Locate the Data Entity by searching Herd-UI
  • Add business metadata using the Herd-UI or REST endpoints as described in Populating Business Metadata