Commit
Federico Zambelli committed Dec 6, 2024
1 parent 7a0ef73, commit 60fb208
Showing 7 changed files with 440 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
{ | ||
"_variables": { | ||
"lastUpdateCheck": 1732194217563 | ||
"lastUpdateCheck": 1733326685860 | ||
} | ||
} |
Binary file added
BIN
+251 KB
docs/src/content/docs/components/lake-formation/imgs/grants_diagram.png
Binary file added
BIN
+48.4 KB
docs/src/content/docs/components/lake-formation/imgs/tag_inheritance.png
190 changes: 190 additions & 0 deletions
docs/src/content/docs/components/lake-formation/introduction.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,190 @@ | ||
--- | ||
title: Introduction | ||
description: Lake Formation - Introduction | ||
--- | ||
import DualCode from '../../../../components/DualCode.astro'; | ||
|
||
DLZ comes with the option to enable [Lake Formation](https://docs.aws.amazon.com/lake-formation/latest/dg/what-is-lake-formation.html) (LF) in any of your accounts. | ||
|
||
It's a service designed to help with data governance, data sharing, and permission management for secure data lakes. It can complement or replace IAM permissions in these contexts, providing an alternative layer of security and management capabilities. | |
|
||
If you're unfamiliar with Lake Formation, it is recommended you read [this brief intro guide](https://aws.github.io/aws-lakeformation-best-practices/). | ||
|
||
## Getting Started | ||
|
||
Having an effective and lean tag ontology is the most important thing when working with Lake Formation. | ||
|
||
The following is a minimal, non-production-ready example that only shows how to get started. Later sections cover best practices, important considerations, and potential pitfalls. | |
|
||
<DualCode> | ||
<Fragment slot="ts"> | ||
```ts | ||
... | ||
accounts: [ | ||
{ | ||
..., | ||
lakeFormation: [{ | ||
region: Region.EU_WEST_1, | ||
admins: ['arn:aws:iam::123456789012:role/Admin'], | ||
tags: [ | ||
{ tagKey: 'common', tagValues: ['true', 'false'] }, | |
{ tagKey: 'team:Data', tagValues: ['true'] } | ||
], | ||
permissions: [ | ||
{ | ||
principals: ['arn:aws:iam::123456789012:role/TeamData'], | ||
tags: [{ tagKey: 'team:Data', tagValues: ['true'] }], | ||
databaseActions: [DatabaseAction.DESCRIBE], | ||
tableActions: [TableAction.DESCRIBE, TableAction.SELECT] | ||
} | ||
] | ||
}] | ||
} | ||
] | ||
``` | ||
</Fragment> | ||
<Fragment slot="python"> | ||
```python | ||
... | ||
accounts = [ | ||
dlz.DlzAccount( | |
..., | ||
lake_formation=[dlz.DlzLakeFormationProps( | ||
region=dlz.Region.EU_WEST_1, | ||
admins=['arn:aws:iam::123456789012:role/Admin'], | ||
tags=[ | ||
{'tag_key': 'common', 'tag_values': ['true', 'false']}, | |
{'tag_key': 'team:Data', 'tag_values': ['true']} | ||
], | ||
permissions=[ | ||
dlz.LakePermission( | ||
principals=['arn:aws:iam::123456789012:role/TeamData'], | ||
tags=[{'tag_key': 'team:Data', 'tag_values': ['true']}], | ||
database_actions=[DatabaseAction.DESCRIBE], | ||
table_actions=[TableAction.DESCRIBE, TableAction.SELECT], | ||
) | ||
] | ||
)] | ||
) | ||
] | ||
``` | ||
</Fragment> | ||
</DualCode> | ||
|
||
## Settings | ||
|
||
The `lakeFormation | lake_formation` attribute of the `DlzAccount` object is an array, where each element corresponds to the Lake Formation settings for a specific AWS region. This allows users the flexibility to configure Lake Formation across multiple regions, or to use a single element for configurations pertaining to just one region. | ||
|
||
At the very least, each Lake Formation setting must include the following: | ||
|
||
### `region` | ||
Self explanatory. Specify the AWS region where Lake Formation should be activated. | ||
|
||
### `admins` | ||
It is essential to include an admin for Lake Formation. While DLZ automatically assigns the CDK execution role as an LF admin, you need to ensure at least one role that can be assumed by a human is also included. Failing to do so will leave you unable to utilize Lake Formation beyond the initial configuration. TODO: include that the admins become also "owners" of any tag created and whatnot. | ||
|
||
### `tags` | ||
The list of all tag keys and corresponding values to be created for use in Lake Formation. | |
> _✨ Note - When sharing data across accounts, additional steps are needed. Check the [cross-account sharing section](introduction#cross-account-sharing) for guidance._ | ||
### `permissions` | ||
Here you define "grants". Each element of the array represents a set of allowed actions granted to one or more principals, upon a combination of tags and values. | ||
> _✨ Note - When sharing data across accounts, additional steps are needed. Check the [cross-account sharing section](introduction#cross-account-sharing) for guidance._ | ||
:::caution | ||
Defining permissions requires knowledge of how Lake Formation Tag Based Access Control (TBAC) works. Do not make assumptions: you will make mistakes and waste time. Do yourself a favor and read [this brief explanation](https://aws.github.io/aws-lakeformation-best-practices/lf-tags/basics/) or [the included guide]({/*TODO: ADD LINK HERE*/}). | |
::: | ||
|
||
## Cross-Account Sharing | ||
|
||
Lake Formation enables users to share Data Catalog resources with external entities (IAM principals, AWS accounts, Organizations and organizational units). | |
|
||
This guide shows how to set up cross-account sharing in DLZ. You can apply the same steps to other entity types as well. | ||
|
||
For a deep dive on how cross-account sharing works, refer to the [official AWS docs](https://docs.aws.amazon.com/lake-formation/latest/dg/cross-account-permissions.html). | ||
|
||
--- | ||
|
||
In this example, we'll illustrate how to share resources with an external account within the same Organization, as would be the normal case in a DLZ setup. | ||
|
||
Sharing with an external account is, in practice, no different than sharing with an internal entity: you grant it a set of allowed actions upon a combination of tags and values. | ||
|
||
The only difference is that the external account must have **at least** `DESCRIBE` permission on any key:value pair used in the aforementioned grants. | ||
|
||
In other words, let's assume that you created a tag `shared` with possible values `true` and `false`, and you want to give an external account read access on any Data Catalog resource that you tag with `shared: true`. | ||
|
||
This operation consists of two steps. In pseudocode it looks like this: | ||
|
||
```sql | ||
-- Make account '678901234567' able to read tag `shared=true` | ||
GRANT 'DESCRIBE' ON TAG shared=true TO '678901234567' | |
|
||
-- Make account '678901234567' able to read any data catalog resource | ||
-- tagged with LF Tag: `shared=true` | ||
GRANT 'DESCRIBE', 'SELECT' ON TAG EXPRESSION (shared=true) TO '678901234567' | |
``` | ||
|
||
In simple terms, to allow an account to view and access data from external resources based on a specific tag expression, the account must first have read permissions for the tags and their associated values. | ||
|
||
This translates to the following DLZ configs: | ||
|
||
<DualCode> | ||
<Fragment slot="ts"> | ||
```ts | ||
... | ||
tags: [ | ||
{ | ||
tagKey: 'shared', tagValues: ['true', 'false'], | ||
share: { | ||
withExternalAccount: [{ | ||
principals: ['678901234567'], | ||
tagActions: [TagAction.DESCRIBE], | ||
specificValues: ['true'] | ||
}] | ||
} | ||
} | ||
], | ||
permissions: [ | ||
{ | ||
principals: ['678901234567'], | ||
tags: [{ tagKey: 'shared', tagValues: ['true'] }], | ||
databaseActions: [DatabaseAction.DESCRIBE], | ||
tableActions: [TableAction.DESCRIBE, TableAction.SELECT] | ||
} | ||
] | ||
``` | ||
</Fragment> | ||
<Fragment slot="python"> | ||
```python | ||
... | ||
tags=[ | ||
{ | ||
'tag_key': 'shared', 'tag_values': ['true', 'false'], | |
'share': { | ||
'with_external_account': [{ | ||
'principals': ['678901234567'], | ||
'tag_actions': [TagAction.DESCRIBE], | ||
'specific_values': ['true'] | ||
}] | ||
} | ||
} | ||
], | ||
permissions=[ | ||
dlz.LakePermission( | ||
principals=['678901234567'], | ||
tags=[{'tag_key': 'shared', 'tag_values': ['true']}], | ||
database_actions=[DatabaseAction.DESCRIBE], | ||
table_actions=[TableAction.DESCRIBE, TableAction.SELECT], | ||
) | ||
] | ||
``` | ||
</Fragment> | ||
</DualCode> | ||
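
The two-step rule can also be checked mechanically before deploying. The following is a minimal sketch, not part of DLZ: the helper `missing_tag_shares` is hypothetical, the config is modelled as plain dictionaries that mirror the Python snippet above, tag actions are plain strings instead of the `TagAction` enum, and a share without `specific_values` is assumed to cover all values of the tag.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (principal, tag key, tag value)


def missing_tag_shares(
    tags: List[dict], permissions: List[dict], external: Set[str]
) -> List[Triple]:
    """Return the tag values used in grants to external principals
    that those principals are not allowed to DESCRIBE."""
    shared: Set[Triple] = set()
    for tag in tags:
        for share in tag.get("share", {}).get("with_external_account", []):
            if "DESCRIBE" not in share["tag_actions"]:
                continue
            # Assumption: no `specific_values` means every value is shared.
            values = share.get("specific_values", tag["tag_values"])
            for principal in share["principals"]:
                for value in values:
                    shared.add((principal, tag["tag_key"], value))

    missing: List[Triple] = []
    for permission in permissions:
        for principal in permission["principals"]:
            if principal not in external:
                continue  # internal principals need no tag share
            for tag in permission["tags"]:
                for value in tag["tag_values"]:
                    triple = (principal, tag["tag_key"], value)
                    if triple not in shared:
                        missing.append(triple)
    return missing


tags = [{
    "tag_key": "shared", "tag_values": ["true", "false"],
    "share": {"with_external_account": [{
        "principals": ["678901234567"],
        "tag_actions": ["DESCRIBE"],
        "specific_values": ["true"],
    }]},
}]
permissions = [{
    "principals": ["678901234567"],
    "tags": [{"tag_key": "shared", "tag_values": ["true"]}],
}]

complete = missing_tag_shares(tags, permissions, {"678901234567"})

# Same permissions, but the tag is created without the `share` block.
unshared_tags = [{"tag_key": "shared", "tag_values": ["true", "false"]}]
incomplete = missing_tag_shares(unshared_tags, permissions, {"678901234567"})
```

With the `share` block in place, `complete` is empty. Without it, `incomplete` lists the `shared=true` pair that the external account would be unable to read.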
|
||
## Next Steps | ||
|
||
With this, you should be equipped with the necessary knowledge to get started using Lake Formation with DLZ. | ||
|
||
If you're not familiar with LF at all, we strongly recommend reading at least the following two guides, especially if you skipped the one recommended in the intro. The first illustrates how the tag system works, and the second proposes a basic strategy to implement: | |
|
||
- [How Lake Formation Tag Based Access Control Works](./lf-tbac-guide) | ||
- [Lake Formation TBAC recommended strategy](./lf-tbac-strategy) |
109 changes: 109 additions & 0 deletions
docs/src/content/docs/components/lake-formation/lf-tbac-guide.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
--- | ||
title: Tag Based Access Control Guide | ||
description: Lake Formation - Tag Based Access Control Guide | ||
--- | ||
import tagInheritance from './imgs/tag_inheritance.png' | ||
import grantsDiagram from './imgs/grants_diagram.png' | ||
|
||
:::note | ||
The source materials for this guide can be found at the following links: | ||
- https://aws.github.io/aws-lakeformation-best-practices/ | ||
- https://docs.aws.amazon.com/lake-formation/latest/dg/lf-tag-considerations.html | ||
::: | ||
|
||
Lake Formation access control is based on a tagging system. | ||
|
||
In short, instead of assigning permissions to individual resources, you tag the resources and grant permissions upon those tags. | ||
|
||
This approach simplifies data governance. By implementing a strong tagging strategy (a recommended one is provided in the next section), you can significantly reduce the effort required to set up and manage permissions. | ||
|
||
:::note | ||
All examples that follow are in pseudocode to make them simple to understand. | ||
::: | ||
|
||
## Rules | ||
|
||
LF tags follow a small but fundamental set of rules: | ||
|
||
- 📜 Tags are assigned to data catalog resources (databases, tables, columns), and a resource can have multiple tags attached, to a maximum of **50 tags**, and **no duplicate keys**. | ||
- 📜 Grants are made **TO** principals **ON** tags. | ||
|
||
```sql | ||
GRANT ACCESS ON TAGS foo=bar TO user | ||
``` | ||
- 📜 In grant expressions, all tag **keys** are evaluated in an `AND` fashion, whereas tag **values** are evaluated in an `OR` fashion. | ||
|
||
```sql | ||
GRANT ACCESS ON TAGS foo=bar AND spam=eggs TO user | ||
-- `user` has access on resources tagged both `foo=bar` AND `spam=eggs` | ||
|
||
GRANT ACCESS ON TAGS foo=['bar', 'baz'] TO user | ||
-- `user` has access on resources tagged `foo=bar` OR `foo=baz` | |
``` | ||
- 📜 Tags assigned to resources are **inherited**, unless specifically overridden. | ||
<p align='center'> | ||
<img src={tagInheritance.src} alt="Tag Inheritance Diagram" width='75%'/> | ||
</p> | ||
- 📜 Grants allow access to resources where **ALL** conditions are true. | ||
<p align='center'> | ||
<img src={grantsDiagram.src} alt="Grants Diagram" width='75%'/> | ||
</p> | ||
|
||
Always keep the above rules in mind when designing a tagging strategy for Lake Formation. | |
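
The rules above can be sketched in a few lines of Python. This is only an illustrative model, not how Lake Formation or DLZ is implemented: the helpers `effective_tags` and `grant_matches` are hypothetical, resource tags are modelled as a dictionary with one value per key, and a grant expression as a dictionary of allowed values per key.

```python
from typing import Dict, List

Tags = Dict[str, str]              # tags on a resource: one value per key
Expression = Dict[str, List[str]]  # grant expression: allowed values per key


def effective_tags(*levels: Tags) -> Tags:
    """Merge tags from parent to child (database, table, column).

    A child inherits every parent tag; setting the same key on the
    child overrides the inherited value.
    """
    merged: Tags = {}
    for level in levels:
        merged.update(level)
    return merged


def grant_matches(resource: Tags, expression: Expression) -> bool:
    """Keys are combined with AND, the values of each key with OR."""
    if not expression:
        return False  # this sketch treats an empty expression as matching nothing
    return all(resource.get(key) in values for key, values in expression.items())


database = {"foo": "bar", "spam": "eggs"}
table = {"foo": "baz"}  # overrides the inherited value of `foo`

table_tags = effective_tags(database, table)
# table_tags == {"foo": "baz", "spam": "eggs"}

matches_either_value = grant_matches(table_tags, {"foo": ["bar", "baz"]})
matches_both_keys = grant_matches(table_tags, {"foo": ["bar"], "spam": ["eggs"]})
```

Here `matches_either_value` is `True` because `baz` is one of the allowed values, while `matches_both_keys` is `False` because the table no longer carries `foo=bar`, even though it still inherits `spam=eggs`.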
|
||
## Limitations | ||
|
||
There are some limitations that may restrict the flexibility of creating tag systems. | ||
|
||
:::caution | ||
A resource cannot have the same LF-tag key more than once. | ||
|
||
For instance, you can't add both `team = sales` and `team = marketing` to the same table. | |
|
||
One workaround is to embed the name inside the tag key, and turn the tag into a toggle: | ||
`team:sales = true` and `team:marketing = true`. | ||
::: | ||
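
As a small illustration of the toggle workaround, the sketch below turns a multi-valued key into one toggle tag per value. The helper name `to_toggle_tags` is hypothetical and only mirrors the `key:value = true` naming used above.

```python
from typing import Dict, Iterable


def to_toggle_tags(key: str, values: Iterable[str]) -> Dict[str, str]:
    """Embed each value in the tag key so every key appears only once."""
    return {f"{key}:{value}": "true" for value in values}


table_tags = to_toggle_tags("team", ["sales", "marketing"])
# table_tags == {"team:sales": "true", "team:marketing": "true"}
```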
|
||
:::caution | ||
A `GRANT` cannot use `OR` for different tag keys: | ||
|
||
```sql | ||
GRANT ACCESS ON TAGS foo=bar OR spam=eggs TO user -- IMPOSSIBLE! | ||
``` | ||
|
||
The workaround is to have multiple grant statements: | ||
```sql | ||
GRANT ACCESS ON TAGS foo=bar TO user | ||
GRANT ACCESS ON TAGS spam=eggs TO user | ||
``` | ||
::: | ||
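
The effect of the multiple-grant workaround can be modelled as a union over grants. This is a minimal sketch under the same simplified model of tags as dictionaries; `grant_matches` and `has_access` are hypothetical helpers, not Lake Formation or DLZ APIs.

```python
from typing import Dict, List

Tags = Dict[str, str]
Expression = Dict[str, List[str]]


def grant_matches(resource: Tags, expression: Expression) -> bool:
    """Within one grant, every key must match one of its allowed values."""
    return all(resource.get(key) in values for key, values in expression.items())


def has_access(resource: Tags, grants: List[Expression]) -> bool:
    """Across grants, a single matching grant is enough (an OR)."""
    return any(grant_matches(resource, grant) for grant in grants)


# Two separate grants stand in for the impossible `foo=bar OR spam=eggs`.
grants = [{"foo": ["bar"]}, {"spam": ["eggs"]}]

only_foo = {"foo": "bar"}
only_spam = {"spam": "eggs"}
neither = {"foo": "qux"}
```

Both `only_foo` and `only_spam` pass `has_access`, while `neither` does not.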
|
||
## Suggestions | ||
|
||
Given the rules and limitations mentioned, use the following tips when designing your tagging system. | ||
|
||
- Since LF-tags are hierarchical, a `GRANT` on a tag key-value pair applied to a high-level resource (e.g. a database) inherently allows access to all of its children with the same value, regardless of any other tags that may exist on those resources. | |
|
||
- When writing a single `GRANT` statement, one should think **ONLY** rather than **ALSO**: | ||
|
||
```sql | ||
GRANT ACCESS ON TAGS team:marketing=true TO executives | ||
``` | ||
This allows the group `executives` to access everything that has the tag `team:marketing=true`. | |
|
||
If the statement is changed to be: | ||
```sql | ||
GRANT ACCESS ON TAGS team:marketing=true AND PII=true TO executives | ||
``` | ||
This does not imply that executives now have access to all resources tagged as `PII` in addition to those tagged with `team:marketing`. Instead, executives are granted access solely to resources that are tagged with both `team:marketing=true` and `PII=true`. | ||
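
To make the **ONLY** versus **ALSO** distinction concrete, here is a small sketch over a made-up catalog. The table names and the `accessible` helper are hypothetical and exist only for illustration.

```python
from typing import Dict, List

# Hypothetical catalog: table name -> LF-tags on that table.
catalog: Dict[str, Dict[str, str]] = {
    "campaigns": {"team:marketing": "true", "PII": "false"},
    "customers": {"team:marketing": "true", "PII": "true"},
    "payroll": {"team:finance": "true", "PII": "true"},
}


def accessible(expression: Dict[str, List[str]]) -> List[str]:
    """Tables whose tags satisfy every key of the grant expression."""
    return sorted(
        name
        for name, tags in catalog.items()
        if all(tags.get(key) in values for key, values in expression.items())
    )


broad = accessible({"team:marketing": ["true"]})
narrow = accessible({"team:marketing": ["true"], "PII": ["true"]})
```

`broad` contains `campaigns` and `customers`, while `narrow` contains only `customers`. Adding the `PII` key did not bring `payroll` into scope; it removed `campaigns`.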
|
||
--- | ||
|
||
:::tip | ||
One way to reason about LF tagging is as follows: | ||
|
||
- Adding tag _keys_ to a `GRANT` effectively **REMOVES** permissions. | |
- Adding tag _values_ to a `GRANT` effectively **ADDS** permissions. | |
- Adding `GRANT` statements to the same _principal_ effectively **ADDS** permissions. | |
- Adding tags to a _resource_ without a `GRANT` effectively **DOES NOTHING**. | |
::: |