Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improve][HiveSink]Fix Partition loading failed. #3886

Merged
merged 8 commits into from
Jan 17, 2023

Conversation

lightzhao
Copy link
Contributor

1.If user sets 'partition_dir_expression = "${v0}"' in hive-sink,it means that the partition directory does not have a partition name. Then the partition loading will fail.
2.Like this : /home/test/warehouse/test.db/tmp_test_day/20230106
3.A new loading method is added for this case to ensure successful loading of partitions.

Purpose of this pull request

Check list

@lightzhao
Copy link
Contributor Author

@EricJoy2048 @Hisoka-X @TaoZex please review .thanks.

Copy link
Member

@TyrantLucifer TyrantLucifer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best solution is to limit the user's ability to set partition_name_expression not add a new config option.

cc @EricJoy2048

@lightzhao
Copy link
Contributor Author

@EricJoy2048 @Hisoka-X @TaoZex Please see if this pr is reasonable. In fact, there are such use scenarios.

@Hisoka-X
Copy link
Member

If does not have a partition name can't be loaded, why we should allow it? Maybe check partition_name_expression before execute job could be better way.

@lightzhao
Copy link
Contributor Author

If does not have a partition name can't be loaded, why we should allow it? Maybe check partition_name_expression before execute job could be better way.

ok, If it is unreasonable, I will close it.

@Hisoka-X
Copy link
Member

If does not have a partition name can't be loaded, why we should allow it? Maybe check partition_name_expression before execute job could be better way.

ok, If it is unreasonable, I will close it.

No need, can your add the code in this PR to check partition_name_expression to make sure the partition always load success? If so will be very helpful.

@lightzhao
Copy link
Contributor Author

If does not have a partition name can't be loaded, why we should allow it? Maybe check partition_name_expression before execute job could be better way.

ok, If it is unreasonable, I will close it.

No need, can your add the code in this PR to check partition_name_expression to make sure the partition always load success? If so will be very helpful.

@Hisoka-X @TyrantLucifer PTAL, partition_dir_expression parameter adds the validation of the partition name to prevent the partition from failing to load

@TyrantLucifer TyrantLucifer force-pushed the hive-partition-name-no-contaion branch from e178bf8 to f00aa1f Compare January 17, 2023 05:15
Copy link
Member

@TyrantLucifer TyrantLucifer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@TyrantLucifer
Copy link
Member

cc @Hisoka-X

Copy link
Contributor

@TaoZex TaoZex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@TyrantLucifer TyrantLucifer merged commit b4348f6 into apache:dev Jan 17, 2023
harveyyue pushed a commit to harveyyue/seatunnel that referenced this pull request Feb 7, 2023
* Whether the partition directory contains the partition name. If not, the partition loading method is different to ensure successful partition loading

* partition_dir_expression parameter adds the validation of the partition name to prevent the partition from failing to load

* update change log.

* Fix the problem of duplicate hive partition loading.

* partition_dir_expression parameter adds the validation of the partition name to prevent the partition from failing to load

* revert

* revert

* [Improve][Connector-V2][Hive] Improve config check logic, add the limit of partition expression

Co-authored-by: zhaoliang01 <[email protected]>
Co-authored-by: tyrantlucifer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants