[ISSUE #51] Optimize the source task commit method #52

Merged
merged 22 commits into from
Jul 14, 2022

sunxiaojian
Contributor

No description provided.

@sunxiaojian changed the title from "[Issue #51] Optimize the source task commit method" to "[ISSUE #51] Optimize the source task commit method" on Jul 4, 2022
@2011shenlin
Contributor

What is the background for this optimization?

@sunxiaojian
Contributor Author

> What is the background for this optimization?

apache/rocketmq-connect#180

*/
public void commit(final List<ConnectRecord> connectRecords) throws InterruptedException {
Contributor

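For sources that keep their own offsets, committing a batch of records at once lets the source use the partition information to pick the latest offset of each partition and commit only that, which lowers the commit frequency. With single-record commits, this kind of scenario is hard to handle.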
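
To make the batch argument concrete, here is a minimal sketch of collapsing a batch to one offset per partition. It does not use the real `ConnectRecord` API: `SourceRecord`, its `partition` and `offset` fields, and `latestOffsets` are hypothetical names chosen for illustration, and offsets are assumed to be plain increasing `long` values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LatestOffsetPerPartition {

    /** Hypothetical stand-in for a record; not the real ConnectRecord type. */
    static final class SourceRecord {
        final String partition;
        final long offset;

        SourceRecord(String partition, long offset) {
            this.partition = partition;
            this.offset = offset;
        }
    }

    /** Collapses a batch to the highest offset seen for each partition. */
    static Map<String, Long> latestOffsets(List<SourceRecord> batch) {
        Map<String, Long> latest = new LinkedHashMap<>();
        for (SourceRecord r : batch) {
            // Keep the larger of the stored offset and the new one.
            latest.merge(r.partition, r.offset, Long::max);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<SourceRecord> batch = Arrays.asList(
            new SourceRecord("p0", 10L),
            new SourceRecord("p1", 7L),
            new SourceRecord("p0", 12L),
            new SourceRecord("p1", 5L));
        // Four records, but only one offset per partition needs committing.
        System.out.println(latestOffsets(batch));
    }
}
```

With the four records above, the source would issue two offset commits (one per partition) instead of four.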

Contributor Author

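> For sources that keep their own offsets, committing a batch of records at once lets the source use the partition information to pick the latest offset of each partition and commit only that, which lowers the commit frequency. With single-record commits, this kind of scenario is hard to handle.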

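The main considerations here are as follows:

  1. A record can be committed as soon as it has passed through transform, converter, and send successfully. With batch commits, the system would also have to cache record offset information internally, and if that data is lost, records would be pulled again from a historical offset, producing duplicates.
  2. If the goal is to reduce commit frequency through batching, users can cache a list inside the plugin themselves as needed, which solves the same problem.

So I think committing a single record is the cleaner design.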
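
The plugin-side caching mentioned in point 2 could look roughly like the sketch below. Everything in it is an assumption for illustration: `BufferingCommitSketch`, the `flushEvery` threshold, and the `commit(String, long)` signature are hypothetical and do not come from the connector API, and the `flushed` list merely stands in for whatever the plugin would actually write offsets to.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BufferingCommitSketch {
    private final int flushEvery;
    // Latest offset per partition that has not been flushed yet.
    private final Map<String, Long> pending = new HashMap<>();
    // Stand-in for the external system that receives offset commits.
    private final List<Map<String, Long>> flushed = new ArrayList<>();
    private int sinceFlush = 0;

    BufferingCommitSketch(int flushEvery) {
        this.flushEvery = flushEvery;
    }

    /** Called once per record, after that record was sent successfully. */
    void commit(String partition, long offset) {
        pending.merge(partition, offset, Long::max);
        sinceFlush++;
        if (sinceFlush >= flushEvery) {
            flush();
        }
    }

    /** Writes out the buffered offsets, one entry per partition. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flushed.add(new HashMap<>(pending));
        pending.clear();
        sinceFlush = 0;
    }

    List<Map<String, Long>> flushed() {
        return flushed;
    }

    public static void main(String[] args) {
        BufferingCommitSketch committer = new BufferingCommitSketch(3);
        committer.commit("p0", 1L);
        committer.commit("p1", 1L);
        committer.commit("p0", 2L); // third call triggers a flush
        committer.commit("p0", 3L); // stays buffered until the next flush
        committer.flush();
        System.out.println(committer.flushed().size() + " flushes");
    }
}
```

The framework still calls `commit` once per record, so it holds no offset cache of its own; how often offsets are actually written out is left to the plugin.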

Contributor

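As I understand it, you are describing the case where records are sent one at a time, but in some scenarios we need to support sending a batch of records.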

Contributor Author

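> As I understand it, you are describing the case where records are sent one at a time, but in some scenarios we need to support sending a batch of records.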

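Yes, I did consider this. However, the data pulled by a single source task is not necessarily written to one topic; it may go to several, depending on the plugin's logic. So batch sending may not be needed for now, and possibly never. It can be added if the need arises.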

@odbozhou odbozhou merged commit 383c564 into openmessaging:master Jul 14, 2022
3 participants