Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FFI should support using sync client instead of default client #199

Closed
scovich opened this issue May 6, 2024 · 15 comments
Closed

FFI should support using sync client instead of default client #199

scovich opened this issue May 6, 2024 · 15 comments
Assignees
Labels
good first issue Good for newcomers

Comments

@scovich
Copy link
Collaborator

scovich commented May 6, 2024

Follow-up to #195 (comment)

tl;dr:

[If you enable sync-engine] you'll be able to see the types for this, but not actually get one (because get_engine_builder is only #[cfg(feature = "default-engine")]).

We'd have to probably make an enum with the types, and then either have that as an arg to get_engine_builder, or we could have get_default_builder or get_sync_builder.

@abrassel
Copy link
Contributor

can I take this?

@roeap
Copy link
Collaborator

roeap commented May 12, 2024

absolutely!

@abrassel
Copy link
Contributor

@roeap do you have a preference between adding the enum approach vs the explicit builder methods? Also, how many different kinds of engines are there? What is the difference between a sync and default engine?

Finally, what's the criteria for success? Is it just cargo build succeeds with sync-engine feature?

@abrassel
Copy link
Contributor

Btw, running cargo build --features sync-engine works without any changes.

@scovich
Copy link
Collaborator Author

scovich commented May 23, 2024

running cargo build --features sync-engine works without any changes

Yes -- but it doesn't provide any "engine builder" entry points for the sync client, so the engine couldn't actually do anything. Probably cleanest to create some new builder methods, so the two can coexist peacefully. Not because anyone would actually use both, but if the API don't overlap then there's no risk of confusion about which engine they're working with.

We may be able to keep using the same get_engine_builder and set_builder_option functions, and just split the current builder_build function into build_default_engine and build_sync_engine? (also add a get_sync_engine to match existing get_default_engine, to handle the no-builder case)?

@abrassel
Copy link
Contributor

That makes sense! For the success criteria, @scovich , what tests do you expect to pass? Should I write new ones? For context, this is my first time really looking at delta-kernel-rs, especially the ffi

@scovich
Copy link
Collaborator Author

scovich commented May 23, 2024

what tests do you expect to pass? Should I write new ones?

In theory, extending the cffi-test to make a second read of the table using sync engine... but I believe @nicklan is working on an improved FFI testing mechanism so I'll defer to him on this.

@abrassel
Copy link
Contributor

Got it. One more question (sorry, noob here): what is the sync engine and what does it do?

@nicklan
Copy link
Collaborator

nicklan commented May 23, 2024

what is the sync engine and what does it do?

the sync engine is so called because it doesn't contain any async code :) In practice it's a little simpler than the default engine, and also doesn't support reading anything other than the local filesystem.

It's useful to have around if we suspect odd things are happening in the default client due to the complex async stuff there, as it lets us verify that that's where the issue is.

@nicklan
Copy link
Collaborator

nicklan commented May 23, 2024

In particular you can have a look at https://github.com/delta-incubator/delta-kernel-rs/tree/main/kernel/src/engine and see the default and sync subdirs for the implementations of each engine

@abrassel
Copy link
Contributor

Thanks @nicklan ! Better question, what does an "engine" do in the context of delta-kernel-rs. Is there a non-FFI engine or is it specific to the c interface?

@nicklan
Copy link
Collaborator

nicklan commented May 23, 2024

An excellent question, that we should probably answer better. The docs have this to say:

/// The `Engine` trait encapsulates all the functionality and engine or connector needs to provide
/// to the Delta Kernel in order to read the Delta table.
///
/// Engines/Connectors are expected to pass an implementation of this trait when reading a Delta
/// table.

Basically it's "here's stuff we think an engine should implement". Things like reading data, evaluating expressions, etc. Then we express all the delta logic in terms of those actions, so we can integrate with any real connector/engine that provides us with their implementation of that trait.

We also provide two implementations of it ourselves, in the sync and default engines. This is 1) so we can write unit tests that actually do something, but also 2) because we expect many connectors/engines to just use the default engine and get back arrow data that they can work with.

@nicklan
Copy link
Collaborator

nicklan commented May 23, 2024

but I believe @nicklan is working on an improved FFI testing mechanism so I'll defer to him on this.

I think just having the existing cffi-test.c create both kinds of engines and do the same operations with both for now is great (and shouldn't be much extra work). The pr for the more complete program (#203) is pretty big and you shouldn't block on having that merge

@abrassel
Copy link
Contributor

Thanks for the in depth description! Let's see if I can get the FFI tests running on Mac. I was having some difficulties earlier.

nicklan pushed a commit that referenced this issue May 25, 2024
## Change list
Add sync engine cfg option to the ffi. This will allow us to build with
local file paths only (mostly useful for testing).

## How is this tested?
Changed `ffi-test.c` to build both a sync and default engine. It should
pass both.

*Note* This is a draft because I can't get the test to build right. For
some reason, the codegenned c code isn't defining the `SYNC-ENGINE`
flag. Any ideas @nicklan

## Issue
This addresses #199
@nicklan
Copy link
Collaborator

nicklan commented Aug 6, 2024

Done by #223

@nicklan nicklan closed this as completed Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants