Support CoW incremental query #9

Closed
xushiyan opened this issue May 4, 2024 · 4 comments · Fixed by #236
Labels: feature, p0, python (Related to Python codebase), rust (Related to Rust codebase)

Comments

@xushiyan
Member

xushiyan commented May 4, 2024

No description provided.

@xushiyan xushiyan added this to the release-0.1.0 milestone May 4, 2024
@xushiyan xushiyan modified the milestones: release-0.1.0, release-0.2.0 May 7, 2024
@xushiyan xushiyan added the p0 label Jul 19, 2024
@xushiyan xushiyan moved this to Todo in hudi-rs roadmap Jul 19, 2024
@xushiyan xushiyan modified the milestones: release-0.2.0, release-0.3.0 Aug 20, 2024
@gohalo
Contributor

gohalo commented Sep 2, 2024

From the official docs, there are two ways to implement incremental queries.

  1. Configuration passed via read options; for details, see Spark Incremental Query for Hudi 0.13.0
  2. The hudi_table_changes TVF; for details, see Spark Incremental Query for Hudi 0.14.1

Which method do you suggest using?
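
For reference, the two Spark-side approaches look roughly like the sketch below. The option keys and the `hudi_table_changes` argument order are taken from the Hudi Spark docs as I understand them, so please verify them against the version you use. The helper names `incremental_read_options` and `table_changes_sql` are made up for illustration and are not part of any Hudi API.

```python
def incremental_read_options(begin_instant, end_instant=None):
    """Build Spark datasource options for approach 1 (hypothetical helper)."""
    opts = {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }
    # The end instant is optional; without it the read runs up to the latest commit.
    if end_instant is not None:
        opts["hoodie.datasource.read.end.instanttime"] = end_instant
    return opts


def table_changes_sql(table, begin_instant, end_instant=None):
    """Build the SQL text for approach 2 (hypothetical helper)."""
    # 'latest_state' is the query type assumed here; arguments are not escaped.
    args = [table, "latest_state", begin_instant]
    if end_instant is not None:
        args.append(end_instant)
    quoted = ", ".join(f"'{a}'" for a in args)
    return f"SELECT * FROM hudi_table_changes({quoted})"


print(incremental_read_options("20240101000000"))
print(table_changes_sql("db.tbl", "earliest"))
```

With a Spark session available, the first would be passed as `spark.read.format("hudi").options(**opts).load(path)` and the second to `spark.sql(...)`.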

@jonathanc-n
Contributor

@xushiyan Hullo, I would like to work on this. The high-level implementation would be to:

  • Use timeline to retrieve latest commit or specific commit to use as a checkpoint
  • Get metadata for commits
  • Query the data changed since that checkpoint.
    If there are any more specifics, let me know!
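
The three steps above can be sketched in plain Python as follows. This is only a toy model under stated assumptions, not hudi-rs code: `Commit`, `commits_after`, `files_to_read`, and `incremental_rows` are hypothetical names, the timeline is an in-memory list, and "files" are just lists of dicts. The only Hudi-specific detail used is the `_hoodie_commit_time` meta field, which records the commit that last wrote each record.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    # Sortable instant time, e.g. "20240902101500" (hypothetical model).
    timestamp: str
    # File group id -> base file written by this commit.
    # In CoW, an update rewrites the whole base file of the file group.
    written_files: dict = field(default_factory=dict)


def commits_after(timeline, checkpoint, end=None):
    """Step 1: pick commits in the window (checkpoint, end], oldest first."""
    selected = [
        c for c in timeline
        if c.timestamp > checkpoint and (end is None or c.timestamp <= end)
    ]
    return sorted(selected, key=lambda c: c.timestamp)


def files_to_read(commits):
    """Step 2: from commit metadata, keep the newest base file per file group."""
    latest = {}
    for c in commits:  # ascending order, so later commits overwrite earlier ones
        latest.update(c.written_files)
    return latest


def incremental_rows(rows_by_file, commits):
    """Step 3: read those files, keeping only rows written inside the window."""
    wanted = {c.timestamp for c in commits}
    files = files_to_read(commits)
    return [
        row
        for path in files.values()
        for row in rows_by_file[path]
        if row["_hoodie_commit_time"] in wanted
    ]
```

The row-level filter matters because a rewritten CoW base file also carries over unchanged records from earlier commits, which an incremental query should not return.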

@xushiyan
Member Author

thank you both for the interest! we will do table api support first for incremental query, and then move on to sql support using datafusion. i'll lay out some groundwork first before splitting more follow up tasks.

@xushiyan xushiyan self-assigned this Nov 30, 2024
@xushiyan xushiyan added python Related to Python codebase rust Related to Rust codebase labels Nov 30, 2024
@xushiyan xushiyan moved this from Todo to In Progress in hudi-rs roadmap Nov 30, 2024
@jonathanc-n
Contributor

@xushiyan Is there anything I can help with for providing table api support?

3 participants