docs: rust usage documentation #3089

Merged
merged 1 commit into from
Jan 4, 2025

Conversation

Abdullahsab3
Contributor

@Abdullahsab3 Abdullahsab3 commented Dec 29, 2024

Description

This PR is intended to add high-level (mainly usage) documentation of delta-rs using the Rust API.

Related Issue(s)

closes #3088

Progress:

  • Installation (Already documented for Rust)
  • Creating a Delta Lake Table
  • Loading a Delta Table
    • Loading a table
    • Verify table existence
    • Time travel
    • custom storage options
    • custom storage backends
  • Appending to and overwriting a Delta Lake table
  • Adding a Constraint to a table (Updated the already present example)
  • Reading the Change Data Feed from a Delta Table (Updated the already present example)
  • Examining a Table
  • Querying Delta Tables
  • Merging a Table
  • Managing Delta Tables
  • Writing Delta Tables
  • Writing to S3 with a locking provider
  • Deleting rows from a Delta Lake table
  • Delta Lake small file compaction with optimize
  • Delta Lake Z Ordering


ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@Abdullahsab3
Contributor Author

I thought GitHub supported draft PRs 😅 Maybe I'm confusing it with another VCS

@Abdullahsab3 Abdullahsab3 changed the title [WIP] Rust API documentation docs: [WIP] Rust API documentation Dec 29, 2024
@Abdullahsab3 Abdullahsab3 marked this pull request as draft December 30, 2024 17:31
Comment on lines 13 to 37
-     arrow_cast::pretty::print_batches(&cdf)?;
+     let batches = collect_batches(
+         cdf.properties().output_partitioning().partition_count(),
+         &cdf,
+         ctx,
+     ).await?;
+     arrow_cast::pretty::print_batches(&batches)?;


      Ok(())
- }
+ }
+
+ async fn collect_batches(
+     num_partitions: usize,
+     stream: &impl ExecutionPlan,
+     ctx: SessionContext,
+ ) -> Result<Vec<RecordBatch>, Box<dyn std::error::Error>> {
+     let mut batches = vec![];
+     for p in 0..num_partitions {
+         let data: Vec<RecordBatch> =
+             collect_sendable_stream(stream.execute(p, ctx.task_ctx())?).await?;
+         batches.extend_from_slice(&data);
+     }
+     Ok(batches)
+ }
Contributor Author

@Abdullahsab3 Abdullahsab3 Jan 1, 2025

I think the CDF interface changed at some point. I was able to work around it by using the DataFusion ExecutionPlan interface and a helper function I copied from the delta-rs tests. Perhaps a more user-friendly interface for it is needed here?
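
For readers who want to see the shape of that helper without the DataFusion dependencies, here is a minimal standard-library sketch of the same pattern: ask the source how many partitions it has, read each one in order, and concatenate the results. The `PartitionedSource` trait, `collect_all`, and `InMemory` are hypothetical names invented for this illustration; they stand in for `ExecutionPlan`, `collect_batches`, and a real plan, and are not part of delta-rs or DataFusion.

```rust
use std::error::Error;

/// Hypothetical stand-in for a partitioned plan (e.g. an `ExecutionPlan`).
/// This trait is an assumption for illustration, not a delta-rs API.
trait PartitionedSource {
    type Item;
    /// Number of partitions the source exposes.
    fn partition_count(&self) -> usize;
    /// Reads all items of one partition, or fails.
    fn read_partition(&self, partition: usize) -> Result<Vec<Self::Item>, Box<dyn Error>>;
}

/// Reads every partition in index order and concatenates the results,
/// stopping at the first error. Callers no longer pass the count themselves.
fn collect_all<S: PartitionedSource>(source: &S) -> Result<Vec<S::Item>, Box<dyn Error>> {
    let mut out = Vec::new();
    for p in 0..source.partition_count() {
        out.extend(source.read_partition(p)?);
    }
    Ok(out)
}

/// Toy in-memory source used only to exercise the helper.
struct InMemory(Vec<Vec<i32>>);

impl PartitionedSource for InMemory {
    type Item = i32;

    fn partition_count(&self) -> usize {
        self.0.len()
    }

    fn read_partition(&self, partition: usize) -> Result<Vec<i32>, Box<dyn Error>> {
        self.0
            .get(partition)
            .cloned()
            .ok_or_else(|| Box::<dyn Error>::from(format!("no partition {}", partition)))
    }
}
```

Unlike the helper in the diff, this sketch takes the partition count from the source itself, which is one way a friendlier interface could hide that detail from the caller.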

@Abdullahsab3 Abdullahsab3 marked this pull request as ready for review January 4, 2025 18:02
@Abdullahsab3 Abdullahsab3 changed the title docs: [WIP] Rust API documentation docs:Rust API documentation Jan 4, 2025
@Abdullahsab3 Abdullahsab3 changed the title docs:Rust API documentation docs: Rust API documentation Jan 4, 2025
@github-actions github-actions bot added and removed the binding/python and binding/rust labels Jan 4, 2025
@Abdullahsab3 Abdullahsab3 changed the title docs: Rust API documentation docs: rust usage documentation Jan 4, 2025

codecov bot commented Jan 4, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.53%. Comparing base (6430151) to head (d851959).
Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3089   +/-   ##
=======================================
  Coverage   72.52%   72.53%           
=======================================
  Files         128      128           
  Lines       41201    41201           
  Branches    41201    41201           
=======================================
+ Hits        29882    29886    +4     
- Misses       9408     9410    +2     
+ Partials     1911     1905    -6     


let builder = deltalake::DeltaTableBuilder::from_uri(bucket_table_path).with_storage_options(storage_options);
builder.build()?.verify_deltatable_existence().await?;
// true
```


## Custom Storage Backends
Contributor Author

If I understand correctly, this is specific to the Python variant of the library, since it's using PyArrow, right?

@rtyler
Member

rtyler commented Jan 4, 2025

There are some formatting fixes that need to happen before merging here, but I will squash the history and tidy things up before merging, thanks!

Signed-off-by: Abdullahsab3 <[email protected]>
rtyler approved these changes Jan 4, 2025
@rtyler rtyler added this pull request to the merge queue Jan 4, 2025
Merged via the queue into delta-io:main with commit 5131850 Jan 4, 2025
24 checks passed
Successfully merging this pull request may close these issues.

High level Rust API documentation (Usage)
2 participants