[Feature] LocalFile sink support multiple table #5931

Merged

ruanwenjun
Member

Purpose of this pull request

  • Use Factory API to create LocalFileSink
  • Make LocalFileSink support multiple tables

Does this PR introduce any user-facing change?

How was this patch tested?

Check list

@ruanwenjun ruanwenjun force-pushed the dev_wenjun_localFileSinkSupportMultipleTable branch 7 times, most recently from 7705063 to 1447249 Compare November 29, 2023 11:05
@ruanwenjun ruanwenjun added the feature New feature label Nov 29, 2023
path = "/tmp/hive/warehouse/${table_name}"
file_format_type = "parquet"
sink_columns = ["name","age"]
}
Member

The multi-table and single-table configurations are the same.

Member Author

The multi-table configuration is the same as the single-table one, but it can vary the path by injecting the table name from the catalogTable (although a single table can do this too).
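
To illustrate the path injection described here, the following is a minimal sketch of expanding a `${table_name}` placeholder once per table. `PathTemplateSketch`, `resolve`, and the table names `orders` and `users` are hypothetical names chosen for illustration; this is not the code from this PR, only the general idea under those assumptions.

```java
import java.util.Map;

/** Hypothetical helper, not SeaTunnel code: expands ${...} placeholders in a sink path. */
final class PathTemplateSketch {

    private PathTemplateSketch() {}

    /** Replaces every ${key} in the template with its value; unknown placeholders stay as-is. */
    static String resolve(String template, Map<String, String> variables) {
        String resolved = template;
        for (Map.Entry<String, String> entry : variables.entrySet()) {
            // String.replace works on literal text, so "${" is not treated as a regex.
            resolved = resolved.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return resolved;
    }

    public static void main(String[] args) {
        String template = "/tmp/hive/warehouse/${table_name}";
        // Assumed table names; in a real job they would come from each table's metadata.
        for (String table : new String[] {"orders", "users"}) {
            System.out.println(resolve(template, Map.of("table_name", table)));
        }
    }
}
```

With one shared template, each table ends up in its own directory, which is why the multi-table and single-table configurations can look identical.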

Member

Update "For multiple table" to "For extract source metadata".

Member Author

"For extract source metadata" LGTM; I changed the doc.

@ruanwenjun ruanwenjun force-pushed the dev_wenjun_localFileSinkSupportMultipleTable branch 3 times, most recently from ed4cd91 to 94c288d Compare December 1, 2023 02:21
@ruanwenjun ruanwenjun requested a review from hailin0 December 1, 2023 05:33

@ruanwenjun ruanwenjun force-pushed the dev_wenjun_localFileSinkSupportMultipleTable branch from 94c288d to 8004274 Compare December 1, 2023 06:39
Member

@hailin0 hailin0 left a comment

LGTM

}
validateSingleChoice(option);
}
}
Member

Member Author

This fixes the case where some options share the same key but have different expected values, which are set under different conditions,
e.g. compress_codec in the file connectors.
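
A rough sketch of what "same key, different expected values under different conditions" can look like, using the compression option mentioned above. `ConditionalOptionSketch`, `isAllowed`, and the per-format value lists are assumptions made for illustration; they are not SeaTunnel's validator or its real value lists.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not SeaTunnel's validator: one option key, condition-dependent values. */
final class ConditionalOptionSketch {

    // Illustrative values only; the real connector's allowed lists may differ.
    private static final Map<String, List<String>> CODECS_BY_FORMAT =
            Map.of(
                    "text", List.of("none", "lzo"),
                    "parquet", List.of("none", "snappy", "gzip"));

    private ConditionalOptionSketch() {}

    /** Returns true when the codec is in the allowed list for the given file format. */
    static boolean isAllowed(String fileFormat, String codec) {
        return CODECS_BY_FORMAT.getOrDefault(fileFormat, List.of()).contains(codec);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("parquet", "snappy"));
        System.out.println(isAllowed("text", "snappy"));
    }
}
```

A validator that assumes each key appears only once with a single value set would reject this shape, which appears to be the situation the change addresses.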

Member

Makes sense to me.

Comment on lines +46 to +49
public class LocalFileSink
implements SeaTunnelSink<
SeaTunnelRow, FileSinkState, FileCommitInfo, FileAggregatedCommitInfo>,
SupportMultiTableSink {
Member

@Hisoka-X Hisoka-X Dec 5, 2023

It feels like there is some fragmentation. LocalFileSink is not based on BaseFileSink, but the other file-related connectors are. Do we have plans to unify all file-related connectors? Can you create an issue? It would be best to mark BaseFileSink as deprecated in the code and link to the issue. cc @TyrantLucifer

Member Author

@ruanwenjun ruanwenjun Dec 6, 2023

I found that making all file connectors support MultipleTable in one PR would cause a lot of file changes, since the file connectors don't use the new Factory API yet. So this PR only modifies LocalFileSink. Once all file connectors support MultipleTable, we can consider adding a new common interface, BaseMultipleTableFileSink (I am not sure yet whether we need this class).

Created #5970 to describe this.
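
For reference, the suggestion to mark BaseFileSink as deprecated and link the issue could look roughly like the sketch below. `BaseFileSinkSketch` and `LegacyFileSinkSketch` are hypothetical stand-ins, and the wording of the deprecation note is an assumption; only the issue number #5970 comes from this thread.

```java
/**
 * Stand-in for a shared base sink; hypothetical, not the real BaseFileSink source.
 *
 * @deprecated Assumed wording: file sinks are being unified separately; see issue #5970.
 */
@Deprecated
abstract class BaseFileSinkSketch {
    abstract String pluginName();
}

/** Hypothetical connector that still extends the deprecated base class. */
final class LegacyFileSinkSketch extends BaseFileSinkSketch {
    @Override
    String pluginName() {
        return "LegacyFile";
    }
}

final class DeprecationSketch {

    private DeprecationSketch() {}

    public static void main(String[] args) {
        // @Deprecated has runtime retention, so reflection can confirm the marker is present.
        System.out.println(BaseFileSinkSketch.class.isAnnotationPresent(Deprecated.class));
        System.out.println(new LegacyFileSinkSketch().pluginName());
    }
}
```

Existing subclasses keep compiling (with a deprecation warning), so the marker signals intent without forcing every connector to migrate in the same PR.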

@Hisoka-X Hisoka-X merged commit 0fdf45f into apache:dev Dec 6, 2023
10 checks passed
@ruanwenjun ruanwenjun deleted the dev_wenjun_localFileSinkSupportMultipleTable branch December 6, 2023 05:45
Carl-Zhou-CN pushed a commit to Carl-Zhou-CN/incubator-seatunnel that referenced this pull request Dec 12, 2023
alextinng pushed a commit to alextinng/seatunnel that referenced this pull request Dec 19, 2023
chaorongzhi pushed a commit to chaorongzhi/seatunnel that referenced this pull request Aug 21, 2024