Update doc
treff7es committed Aug 5, 2024
1 parent f51a3e9 commit 20c54e2
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions metadata-ingestion/docs/sources/s3/s3.md
@@ -75,6 +75,11 @@ Path specs config to ingest folders `orders` and `returns` as datasets:
path_specs:
- include: s3://test-bucket/{table}/{partition_key[0]}={partition[0]}/{partition_key[1]}={partition[1]}/*.parquet
```
or with partition auto-detection:
```
path_specs:
- include: s3://test-bucket/{table}/
```

One can also use `include: s3://test-bucket/{table}/*/*/*.parquet` here; however, the first format above is preferred because it declares the partitions explicitly.
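To illustrate what "partition auto-detection" means for a `key=value` folder layout, here is a minimal sketch in Python. It is not the ingestion source's actual implementation; the function name `detect_partitions` and the choice to skip folders that are not in `key=value` form are assumptions made for illustration only.

```python
from typing import List, Tuple


def detect_partitions(key: str, table_root: str) -> List[Tuple[str, str]]:
    """Return (partition_key, value) pairs found between table_root and the file name.

    Hypothetical helper: only folders shaped like key=value are reported;
    any other folder is skipped (an assumption, not documented behaviour).
    """
    # Strip the table prefix so only the partition folders and file name remain.
    relative = key[len(table_root):] if key.startswith(table_root) else key
    folders = relative.strip("/").split("/")[:-1]  # drop the file name

    partitions = []
    for folder in folders:
        name, sep, value = folder.partition("=")
        if sep and name and value:
            partitions.append((name, value))
    return partitions


example = detect_partitions(
    "s3://test-bucket/orders/year=2022/month=02/part1.parquet",
    "s3://test-bucket/orders/",
)
print(example)
```

With a layout such as `s3://test-bucket/orders/2022/02/part1.parquet`, the same helper finds no partitions, which mirrors the note that partitions can only be detected when folders use the `key=value` format.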

@@ -150,6 +155,7 @@ Above config has 3 path_specs and will ingest following datasets
s3://my-bucket/foo/tests/bar.avro # single file table
s3://my-bucket/foo/tests/*.* # multiple file-level tables
s3://my-bucket/foo/tests/{table}/*.avro # table without partitions
s3://my-bucket/foo/tests/{table}/ # table with partition auto-detection; partitions can only be detected if they are in key=value format
s3://my-bucket/foo/tests/{table}/*/*.avro # table where partitions are not specified
s3://my-bucket/foo/tests/{table}/*.* # table where neither partitions nor data type are specified
s3://my-bucket/{dept}/tests/{table}/*.avro # specifying keywords to be used in display name