Updates to low-code documentation #17121
Changes from all commits
8512457
67ae6e1
bf24bad
1b6c2d5
d8e8864
94ba5b4
298ba48
cdc9a00
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,22 +1,31 @@ | ||
:warning: This framework is in [alpha](https://docs.airbyte.com/project-overview/product-release-stages/#alpha). It is still in active development and may include backward-incompatible changes. Please share feedback and requests directly with us at [email protected] :warning: | ||
|
||
# Index | ||
|
||
## From scratch | ||
|
||
This section gives an overview of the low-code framework. | ||
|
||
- [Overview](overview.md) | ||
- [Yaml structure](yaml-structure.md) | ||
- [YAML structure](yaml-structure.md) | ||
- [Reference docs](https://airbyte-cdk.readthedocs.io/en/latest/api/airbyte_cdk.sources.declarative.html) | ||
|
||
## Concepts | ||
|
||
This section contains additional information on the different components that can be used to define a low-code connector. | ||
|
||
- [Authentication](authentication.md) | ||
- [Error handling](error-handling.md) | ||
- [Pagination](pagination.md) | ||
- [Record selection](record-selector.md) | ||
- [Request options](request-options.md) | ||
- [Stream slicers](stream-slicers.md) | ||
- [Substreams](substreams.md) | ||
|
||
## Tutorial | ||
|
||
This section is a tutorial that guides you through the end-to-end process of implementing a low-code connector. | ||
|
||
0. [Getting started](tutorial/0-getting-started.md) | ||
1. [Creating a source](tutorial/1-create-source.md) | ||
2. [Installing dependencies](tutorial/2-install-dependencies.md) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Config-based connectors overview | ||
|
||
:warning: This framework is in alpha stage. Support is not in production and is available only to select users. :warning: | ||
:warning: This framework is in [alpha](https://docs.airbyte.com/project-overview/product-release-stages/#alpha). It is still in active development and may include backward-incompatible changes. Please share feedback and requests directly with us at [email protected] :warning: | ||
|
||
The goal of this document is to give enough technical specifics to understand how config-based connectors work. | ||
When you're ready to start building a connector, you can start with [the tutorial](./tutorial/0-getting-started.md), or dive into [more detailed documentation](./index.md). | ||
|
@@ -53,11 +53,11 @@ The only pagination mechanisms supported are | |
|
||
### What is the authorization mechanism? | ||
|
||
Endpoints that require authenticating using a query param or a HTTP header, as is the case for the [Exchange Rates Data API](https://apilayer.com/marketplace/exchangerates_data-api#authentication), are supported. | ||
Endpoints that require [authenticating using a query param or a HTTP header](./authentication.md#apikeyauthenticator), as is the case for the [Exchange Rates Data API](https://apilayer.com/marketplace/exchangerates_data-api#authentication), are supported. | ||
|
||
Endpoints that require authenticating using Basic Auth over HTTPS, as is the case for [Greenhouse](https://developers.greenhouse.io/harvest.html#authentication), are supported. | ||
Endpoints that require [authenticating using Basic Auth over HTTPS](./authentication.md#basichttpauthenticator), as is the case for [Greenhouse](https://developers.greenhouse.io/harvest.html#authentication), are supported. | ||
|
||
Endpoints that require authenticating using OAuth 2.0, as is the case for [Strava](https://developers.strava.com/docs/authentication/#introduction), are supported. | ||
Endpoints that require [authenticating using OAuth 2.0](./authentication.md#oauth), as is the case for [Strava](https://developers.strava.com/docs/authentication/#introduction), are supported. | ||
|
||
Other authentication schemes such as JWT are not supported. | ||
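
To make the difference between the supported schemes concrete, here is a minimal sketch of what each one attaches to an outgoing request. It is plain Python, not the framework's authenticator classes; the function names are hypothetical, and the header name passed to `api_key_header` depends on the API being integrated.

```python
import base64


def api_key_header(header_name: str, api_key: str) -> dict:
    """API key auth: the key is sent as-is in a header chosen by the API."""
    return {header_name: api_key}


def basic_auth_header(username: str, password: str) -> dict:
    """Basic auth: base64-encode "username:password" and prefix it with "Basic"."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_header(access_token: str) -> dict:
    """OAuth 2.0: the access token is sent as a bearer token."""
    return {"Authorization": f"Bearer {access_token}"}
```

In the OAuth case the access token first has to be obtained from the provider's token endpoint; that exchange is left out of this sketch.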
|
||
|
@@ -78,11 +78,11 @@ Throttling is not supported, but the connector can use exponential backoff to av | |
| Transport protocol | HTTP | | ||
| HTTP methods | GET, POST | | ||
| Data format | JSON | | ||
| Resource type | Collections<br/>Sub-collection | | ||
| Resource type | Collections<br/>[Sub-collection](./substreams.md) | | ||
| [Pagination](./pagination.md) | [Page limit](./pagination.md#page-increment)<br/>[Offset](./pagination.md#offset-increment)<br/>[Cursor](./pagination.md#cursor) | | ||
| [Authentication](./authentication.md) | [Header based](./authentication.md#ApiKeyAuthenticator)<br/>[Bearer](./authentication.md#BearerAuthenticator)<br/>[Basic](./authentication.md#BasicHttpAuthenticator)<br/>[OAuth](./authentication.md#OAuth) | | ||
| Sync mode | Full refresh<br/>Incremental | | ||
| Schema discovery | Only static schemas | | ||
| Schema discovery | Static schemas | | ||
| [Stream slicing](./stream-slicers.md) | [Datetime](./stream-slicers.md#Datetime), [lists](./stream-slicers.md#list-stream-slicer), [parent-resource id](./stream-slicers.md#Substream-slicer) | | ||
| [Record transformation](./record-selector.md) | [Field selection](./record-selector.md#selecting-a-field)<br/>[Adding fields](./record-selector.md#adding-fields)<br/>[Removing fields](./record-selector.md#removing-fields)<br/>[Filtering records](./record-selector.md#filtering-records) | | ||
| [Error detection](./error-handling.md) | [From HTTP status code](./error-handling.md#from-status-code)<br/>[From error message](./error-handling.md#from-error-message) | | ||
|
@@ -122,9 +122,9 @@ The data retriever defines how to read the data for a Stream, and acts as an orc | |
There is currently only one implementation, the `SimpleRetriever`, which is defined by | ||
|
||
1. Requester: Describes how to submit requests to the API source | ||
2. Paginator: Describes how to navigate through the API's pages | ||
3. Record selector: Describes how to extract records from a HTTP response | ||
4. Stream Slicer: Describes how to partition the stream, enabling incremental syncs and checkpointing | ||
2. [Paginator](./pagination.md): Describes how to navigate through the API's pages | ||
Review comment: add links |
||
3. [Record selector](./record-selector.md): Describes how to extract records from a HTTP response | ||
4. [Stream Slicer](./stream-slicers.md): Describes how to partition the stream, enabling incremental syncs and checkpointing | ||
|
||
Each of those components (and its subcomponents) is defined by an explicit interface and one or more implementations. | ||
The developer can choose and configure the implementation they need depending on specifications of the integration they are building against. | ||
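
The way these four components cooperate can be sketched as a single loop. This is an illustrative simplification, not the `SimpleRetriever` implementation: `read_records` and its callable parameters are hypothetical stand-ins for the requester, paginator, and record selector.

```python
from typing import Callable, Iterable, Iterator, Optional


def read_records(
    stream_slices: Iterable[dict],
    send_request: Callable[[dict, Optional[str]], dict],
    next_page_token: Callable[[dict], Optional[str]],
    select_records: Callable[[dict], list],
) -> Iterator[dict]:
    """Read every page of every slice and yield the selected records."""
    for stream_slice in stream_slices:  # stream slicer: partitions the stream
        page_token = None
        while True:
            response = send_request(stream_slice, page_token)  # requester
            yield from select_records(response)  # record selector
            page_token = next_page_token(response)  # paginator
            if page_token is None:
                break  # no more pages for this slice
```

Because each slice is read to completion before the next one starts, a slice boundary is a natural place to checkpoint state.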
|
@@ -157,13 +157,9 @@ There is currently only one implementation, the `HttpRequester`, which is define | |
1. A base url: The root of the API source | ||
2. A path: The specific endpoint to fetch data from for a resource | ||
3. The HTTP method: the HTTP method to use (GET or POST) | ||
4. A request options provider: Defines the request parameters (query parameters), headers, and request body to set on outgoing HTTP requests | ||
5. An authenticator: Defines how to authenticate to the source | ||
6. An error handler: Defines how to handle errors | ||
|
||
More details on authentication can be found in the [authentication section](authentication.md). | ||
|
||
More details on error handling can be found in the [error handling section](error-handling.md). | ||
4. [A request options provider](./request-options.md): Defines the request parameters (query parameters), headers, and request body to set on outgoing HTTP requests | ||
5. [An authenticator](./authentication.md): Defines how to authenticate to the source | ||
6. [An error handler](./error-handling.md): Defines how to handle errors | ||
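
As a rough sketch of how the first four parts fit together, the function below combines them into a description of one HTTP request. It is plain Python rather than the `HttpRequester` itself; `build_request` and its parameter names are assumptions made for the example. Headers produced by an authenticator would be passed in through `headers`, and error handling is left out.

```python
from urllib.parse import urlencode


def build_request(base_url, path, method="GET", params=None, headers=None, body=None):
    """Combine the requester's parts into a description of one HTTP request."""
    if method not in ("GET", "POST"):
        raise ValueError(f"unsupported HTTP method: {method}")
    # Join base URL and path with exactly one slash between them.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    if params:
        url += "?" + urlencode(params)
    return {"method": method, "url": url, "headers": dict(headers or {}), "json": body}
```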
|
||
## Connection Checker | ||
|
||
|
@@ -209,5 +205,5 @@ pagination_strategy: | |
The following connectors can serve as examples of what production-ready config-based connectors look like | ||
|
||
- [Greenhouse](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-greenhouse) | ||
Review comment: no need to update the greenhouse link because the production connector is using low-code |
||
- [Sendgrid](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-sendgrid) | ||
- [Sentry](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-sentry) | ||
- [Sendgrid](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-sendgrid/source_sendgrid/sendgrid.yaml) | ||
Review comment: update links to point to the yaml file for sendgrid and sentry because the connector is not using low-code |
||
- [Sentry](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-sentry/source_sentry/sentry.yaml) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -118,54 +118,10 @@ the resulting stream slices are | |
] | ||
``` | ||
|
||
### Substream slicer | ||
Review comment: Move to |
||
|
||
`SubstreamSlicer` iterates over the parent's stream slices. | ||
This is useful for defining sub-resources. | ||
|
||
We might for instance want to read all the commits for a given repository (parent resource). | ||
|
||
For each stream, the slicer needs to know | ||
|
||
- what the parent stream is | ||
- what is the key of the records in the parent stream | ||
- what is the field defining the stream slice representing the parent record | ||
- how to specify that information on an outgoing HTTP request | ||
|
||
Assuming the commits for a given repository can be read by specifying the repository as a request_parameter, this could be defined as | ||
|
||
```yaml | ||
stream_slicer: | ||
  type: "SubstreamSlicer" | ||
  parent_streams_configs: | ||
    - stream: "*ref(repositories_stream)" | ||
      parent_key: "id" | ||
      stream_slice_field: "repository" | ||
      request_option: | ||
        field_name: "repository" | ||
        inject_into: "request_parameter" | ||
``` | ||
|
||
REST APIs often nest sub-resources in the URL path. | ||
If the URL to fetch commits was "/repositories/:id/commits", then the `Requester`'s path would need to refer to the stream slice's value and no `request_option` would be set: | ||
|
||
```yaml | ||
retriever: | ||
  <...> | ||
  requester: | ||
    <...> | ||
    path: "/repositories/{{ stream_slice.repository }}/commits" | ||
  stream_slicer: | ||
    type: "SubstreamSlicer" | ||
    parent_streams_configs: | ||
      - stream: "*ref(repositories_stream)" | ||
        parent_key: "id" | ||
        stream_slice_field: "repository" | ||
``` | ||
|
||
[^1] This is a slight oversimplification. See [update cursor section](#cursor-update) for more details on how the cursor is updated. | ||
|
||
## More readings | ||
|
||
- [Incremental streams](../cdk-python/incremental-stream.md) | ||
- [Stream slices](../cdk-python/stream-slices.md) | ||
- [Stream slices](../cdk-python/stream-slices.md) | ||
- [Substreams](./substreams.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
# Substreams | ||
Review comment: Moved from stream-slicers.md |
||
|
||
Substreams are streams that depend on the records of another stream. | ||
|
||
We might for instance want to read all the commits for a given repository (parent stream). | ||
|
||
## Substream slicer | ||
|
||
Substreams are implemented by defining their stream slicer as a `SubstreamSlicer`. | ||
|
||
For each stream, the slicer needs to know | ||
|
||
- what the parent stream is | ||
- what is the key of the records in the parent stream | ||
- what is the field defining the stream slice representing the parent record | ||
- how to specify that information on an outgoing HTTP request | ||
|
||
Assuming the commits for a given repository can be read by specifying the repository as a request_parameter, this could be defined as | ||
|
||
```yaml | ||
stream_slicer: | ||
  type: "SubstreamSlicer" | ||
  parent_streams_configs: | ||
    - stream: "*ref(repositories_stream)" | ||
      parent_key: "id" | ||
      stream_slice_field: "repository" | ||
      request_option: | ||
        field_name: "repository" | ||
        inject_into: "request_parameter" | ||
``` | ||
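
To illustrate what this configuration asks for, the sketch below turns parent records into stream slices and then into query parameters. It is a simplified model in plain Python, not the `SubstreamSlicer` code; the function names are hypothetical, and the real slicer may carry additional information in each slice.

```python
def substream_slices(parent_records, parent_key, stream_slice_field):
    """Yield one stream slice per parent record, keyed by stream_slice_field."""
    for record in parent_records:
        yield {stream_slice_field: record[parent_key]}


def request_params(stream_slice, stream_slice_field, field_name):
    """Inject the slice's value into the request as a query parameter."""
    return {field_name: stream_slice[stream_slice_field]}
```

With `parent_key: "id"` and `stream_slice_field: "repository"`, each repository record yields a slice holding its `id` under `repository`, and that value is sent as the `repository` query parameter.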
|
||
REST APIs often nest sub-resources in the URL path. | ||
If the URL to fetch commits was "/repositories/:id/commits", then the `Requester`'s path would need to refer to the stream slice's value and no `request_option` would be set: | ||
|
||
```yaml | ||
retriever: | ||
  <...> | ||
  requester: | ||
    <...> | ||
    path: "/repositories/{{ stream_slice.repository }}/commits" | ||
  stream_slicer: | ||
    type: "SubstreamSlicer" | ||
    parent_streams_configs: | ||
      - stream: "*ref(repositories_stream)" | ||
        parent_key: "id" | ||
        stream_slice_field: "repository" | ||
``` |
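
The effect of referring to the stream slice in the path can be sketched as a simple substitution. The regular expression below is a stand-in written for this example and handles only `{{ stream_slice.<field> }}` placeholders; it is not the framework's interpolation mechanism, and `render_path` is a hypothetical name.

```python
import re

# Matches placeholders of the form {{ stream_slice.<field> }}.
_PLACEHOLDER = re.compile(r"\{\{\s*stream_slice\.(\w+)\s*\}\}")


def render_path(template: str, stream_slice: dict) -> str:
    """Replace each stream_slice placeholder with the value from the slice."""
    return _PLACEHOLDER.sub(lambda m: str(stream_slice[m.group(1)]), template)
```

Each slice produced from a parent record therefore resolves to its own URL path, one per repository.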
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,12 @@ | ||
# Getting Started | ||
|
||
:warning: This framework is in alpha stage. Support is not in production and is available only to select users. :warning: | ||
:warning: This framework is in [alpha](https://docs.airbyte.com/project-overview/product-release-stages/#alpha). It is still in active development and may include backward-incompatible changes. Please share feedback and requests directly with us at [email protected] :warning: | ||
|
||
## Summary | ||
|
||
Throughout this tutorial, we'll walk you through the creation of an Airbyte source to read and extract data from an HTTP API. | ||
|
||
We'll build a connector reading data from the Exchange Rates API, but the steps will apply to other HTTP APIs you might be interested in integrating with. | ||
We'll build a connector reading data from the Exchange Rates API, but the steps apply to other HTTP APIs you might be interested in integrating with. | ||
|
||
The API documentation can be found [here](https://apilayer.com/marketplace/exchangerates_data-api). | ||
In this tutorial, we will read data from the following endpoints: | ||
|
@@ -30,7 +30,7 @@ The output schema of our stream will look like the following: | |
|
||
## Exchange Rates API Setup | ||
|
||
Before we can get started, you'll need to generate an API access key for the Exchange Rates API. | ||
Before we get started, you'll need to generate an API access key for the Exchange Rates API. | ||
This can be done by signing up for the Free tier plan on [Exchange Rates API](https://exchangeratesapi.io/): | ||
|
||
1. Visit https://exchangeratesapi.io and click "Get free API key" on the top right | ||
|
Review comment: remove superfluous word