[Improve][Doc] Improve the connector-v2 develop doc #8190

Merged
merged 2 commits into from
Dec 3, 2024
40 changes: 21 additions & 19 deletions seatunnel-connectors-v2/README.md
@@ -49,14 +49,19 @@ own connectors, you need to follow the steps below.

3.Create two packages corresponding to source and sink

    package org.apache.seatunnel.connectors.seatunnel.{connector name}.source
    package org.apache.seatunnel.connectors.seatunnel.{connector name}.sink

4.Add the connector info to the plugin-mapping.properties file in the SeaTunnel root path.

5.Add the connector dependency to seatunnel-dist/pom.xml, so the connector jar can be found in the binary package.

6.The source side must implement the classes {ConnectorName}Source, {ConnectorName}SourceFactory, and {ConnectorName}SourceReader; the sink side must implement the classes {ConnectorName}Sink, {ConnectorName}SinkFactory, and {ConnectorName}SinkWriter. Please refer to other connectors for details.

7.{ConnectorName}SourceFactory and {ConnectorName}SinkFactory need to be annotated with **@AutoService(Factory.class)** on the class. In addition to the required methods, the source side needs to override the **createSource** method and the sink side needs to override the **createSink** method.

8.{ConnectorName}Source needs to override the **getProducedCatalogTables** method; {ConnectorName}Sink needs to override the **getWriteCatalogTable** method
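
The source-side classes from steps 6 to 8 can be sketched roughly as follows. This is a minimal illustration, not the SeaTunnel API: `Factory`, `SourceStandIn`, and `TableSourceFactoryStandIn` are hypothetical stand-in interfaces declared here so the example compiles on its own, `Demo` is a made-up connector name, and `table` is a made-up option. The real interfaces declare more methods and use richer types (for example, catalog table objects instead of plain strings), and the required {ConnectorName}SourceReader is omitted for brevity.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins so the sketch compiles without SeaTunnel on the classpath.
// The real interfaces in seatunnel-api declare more methods and richer types.
interface Factory {
    String factoryIdentifier();
}

interface SourceStandIn {
    String getPluginName();

    // Plain table names stand in for catalog table objects in this sketch.
    List<String> getProducedCatalogTables();
}

interface TableSourceFactoryStandIn extends Factory {
    SourceStandIn createSource(Map<String, String> options);
}

// Step 6: {ConnectorName}Source. Step 8: it overrides getProducedCatalogTables.
class DemoSource implements SourceStandIn {
    private final String tableName;

    DemoSource(String tableName) {
        this.tableName = tableName;
    }

    @Override
    public String getPluginName() {
        return "Demo";
    }

    @Override
    public List<String> getProducedCatalogTables() {
        return List.of(tableName);
    }
}

// Steps 6 and 7: {ConnectorName}SourceFactory.
// A real connector would put @AutoService(Factory.class) on this class.
class DemoSourceFactory implements TableSourceFactoryStandIn {
    @Override
    public String factoryIdentifier() {
        return "Demo"; // same value as DemoSource.getPluginName()
    }

    @Override
    public SourceStandIn createSource(Map<String, String> options) {
        // "table" is a made-up option name used only in this sketch.
        return new DemoSource(options.getOrDefault("table", "default_table"));
    }
}
```

Calling `new DemoSourceFactory().createSource(Map.of("table", "orders"))` yields a source whose plugin name matches the factory identifier. The sink side would mirror this shape with {ConnectorName}Sink, {ConnectorName}SinkFactory, and {ConnectorName}SinkWriter.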

### **Startup Class**

Aside from the old startup class, we have created two new startup modules,
@@ -205,26 +210,25 @@ In order to automatically create the Source Connector and Sink Connector and Tra
supported by the current connector and the required parameters. To achieve this, we define TableSourceFactory and TableSinkFactory.
It is recommended to put them in the same directory as the implementation class of SeaTunnelSource or SeaTunnelSink so that they are easy to find.

- `factoryIdentifier` indicates the name of the current Factory. This value should be the same as the
  value returned by `getPluginName`, so that a seamless switch is possible if the Factory is later used to create the Source/Sink.
- `createSink` and `createSource` are the methods that create the Sink and the Source, respectively.
- `optionRule` returns the parameter logic: which parameters the connector supports,
  which are required, which are optional, which are exclusive, and which are bundledRequired.
  This method is used when the connector logic is created visually, and it is also used to generate a complete parameter
  object from the parameters configured by the user, so the connector developer does not need to check one by one whether each parameter
  exists in the Config and can use the object directly.
  You can refer to existing implementations, such as `org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSourceFactory`.
  Many Sources support configuring a Schema, so a common Option is used for it.
  If you need a schema, you can refer to `org.apache.seatunnel.api.table.catalog.CatalogTableUtil.SCHEMA`.

Don't forget to add `@AutoService(Factory.class)` to the class. This Factory is the parent class of TableSourceFactory and TableSinkFactory.
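
To make the rule categories returned by `optionRule` concrete, here is a small self-contained sketch of how such rules could be checked against a user configuration. `SimpleOptionRule` is a hypothetical class written for this illustration and is not SeaTunnel's `OptionRule`. It assumes that "exclusive" means exactly one option of the group must be set and that "bundledRequired" means the options of the group are set either all together or not at all; any option not listed is treated as optional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified stand-in; this is not SeaTunnel's OptionRule class.
final class SimpleOptionRule {
    private final Set<String> required;
    private final Set<String> exclusive;
    private final Set<String> bundled;

    SimpleOptionRule(Set<String> required, Set<String> exclusive, Set<String> bundled) {
        // Sorted copies keep the error messages in a stable order.
        this.required = new TreeSet<>(required);
        this.exclusive = new TreeSet<>(exclusive);
        this.bundled = new TreeSet<>(bundled);
    }

    /** Returns one message per violated rule; an empty list means the config passes. */
    List<String> validate(Map<String, String> config) {
        List<String> errors = new ArrayList<>();

        // Required: every listed key must be present.
        for (String key : required) {
            if (!config.containsKey(key)) {
                errors.add("missing required option: " + key);
            }
        }

        // Exclusive (assumed semantics): exactly one key of the group is present.
        long exclusiveCount = exclusive.stream().filter(config::containsKey).count();
        if (!exclusive.isEmpty() && exclusiveCount != 1) {
            errors.add("exactly one of " + exclusive + " must be set");
        }

        // Bundled (assumed semantics): all keys of the group are present, or none.
        long bundledCount = bundled.stream().filter(config::containsKey).count();
        if (bundledCount != 0 && bundledCount != bundled.size()) {
            errors.add("options " + bundled + " must be set together");
        }
        return errors;
    }
}
```

With a rule built as `new SimpleOptionRule(Set.of("hosts"), Set.of("index", "query"), Set.of("username", "password"))`, a config containing only `hosts` and `index` passes, while a config that sets both `index` and `query` or only `username` is rejected. Centralizing the checks this way is what lets connector code read options without testing for each one individually.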

### **Options**

When we implement TableSourceFactory and TableSinkFactory, the corresponding Options are created.
Each Option corresponds to one configuration item, and different configuration items have different types.
Common types can be created by directly calling the corresponding method.
If a parameter is an object, we can use a POJO to represent it,
and each field needs `org.apache.seatunnel.api.configuration.util.OptionMark` to indicate that it is a child Option.
@@ -237,6 +241,4 @@ please refer to `org.apache.seatunnel.connectors.seatunnel.assertion.sink.Assert
## **Result**

All Connector implementations should be placed under ``seatunnel-connectors-v2``, and the examples available for reference
at this stage are also under this module.
26 changes: 15 additions & 11 deletions seatunnel-connectors-v2/README.zh.md
@@ -10,7 +10,7 @@ To decouple SeaTunnel from the computing engines, a new connector API was designed, through
### **Project Structure**

- ../`seatunnel-connectors-v2` connector-v2 code implementation
- ../`seatunnel-translation` translation layer for connector-v2
- ../`seatunnel-transform-v2` transform-v2 code implementation
- ../seatunnel-e2e/`seatunnel-connector-v2-e2e` connector-v2 end-to-end tests
- ../seatunnel-examples/`seatunnel-flink-connector-v2-example` example of running seatunnel connector-v2 locally on Flink
@@ -39,14 +39,19 @@ To decouple SeaTunnel from the computing engines, a new connector API was designed, through

3.Create two packages corresponding to source and sink

    package org.apache.seatunnel.connectors.seatunnel.{connector name}.source
    package org.apache.seatunnel.connectors.seatunnel.{connector name}.sink

4.Add the connector information to the plugin-mapping.properties file in the project root directory.

5.Add the connector to seatunnel-dist/pom.xml, so the connector jar can be found in the binary package.

6.The source side must implement the classes {ConnectorName}Source, {ConnectorName}SourceFactory, and {ConnectorName}SourceReader; the sink side must implement the classes {ConnectorName}Sink, {ConnectorName}SinkFactory, and {ConnectorName}SinkWriter. Please refer to other connectors for details.

7.{ConnectorName}SourceFactory and {ConnectorName}SinkFactory need to be annotated with **@AutoService(Factory.class)** on the class. In addition to the required methods, the source side needs to override the **createSource** method and the sink side needs to override the **createSink** method.

8.{ConnectorName}Source needs to override the **getProducedCatalogTables** method; {ConnectorName}Sink needs to override the **getWriteCatalogTable** method.

### Startup Class

Separately from the old startup class, we have created two new startup modules, `seatunnel-core/seatunnel-flink-starter` and `seatunnel-core/seatunnel-spark-starter`.
@@ -154,13 +159,12 @@ Depending on the component's properties, a Sink can choose whether to implement only `SinkCommitter` or `
To create a Source or Sink automatically, a connector must be able to declare and return the list of parameters needed to create it and the validation rule for each parameter. To achieve this, we define TableSourceFactory and TableSinkFactory.
It is recommended to put them in the same directory as the implementation class of SeaTunnelSource or SeaTunnelSink so that they are easy to find.

- `factoryIdentifier` indicates the name of the current Factory. This value should be the same as the value returned by `getPluginName`, so that a seamless switch is possible if the Factory is later used to create the Source/Sink.
- `createSink` and `createSource` are the methods that create the Sink and the Source, respectively.
- `optionRule` returns the parameter logic: which parameters the connector supports, which are required, which are optional, which are exclusive, and which are bundledRequired.
  This method is used when the connector logic is created visually, and it is also used to generate a complete parameter object from the parameters configured by the user, so the connector developer does not need to check one by one whether each parameter exists in the Config and can use the object directly.
  You can refer to existing implementations, such as `org.apache.seatunnel.connectors.seatunnel.elasticsearch.source.ElasticsearchSourceFactory`. Many Sources support configuring a Schema, so a common Option is used for it.
  If you need a Schema, you can refer to `org.apache.seatunnel.api.table.catalog.CatalogTableUtil.SCHEMA`.

Don't forget to add `@AutoService(Factory.class)` to the class. This Factory is the parent class of TableSourceFactory and TableSinkFactory.

@@ -173,4 +177,4 @@ Depending on the component's properties, a Sink can choose whether to implement only `SinkCommitter` or `

## Implementation

At this stage, all connector implementations and reference examples are under seatunnel-connectors-v2; users can consult them as needed.