Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[discuss] next steps for interface definition language? #4293

Closed
data-sync-user opened this issue Jun 23, 2021 · 11 comments
Closed

[discuss] next steps for interface definition language? #4293

data-sync-user opened this issue Jun 23, 2021 · 11 comments
Labels

Comments

@data-sync-user
Copy link
Collaborator

data-sync-user commented Jun 23, 2021

When we started building this tool, the choice of WebIDL for interface definitions was deliberately short-term (ref ADR-0001 and we knew we'd need to revisit it someday. Now feels like a good time to start discussing what the next steps for interface definition might look like.

I'll try to do some scene-setting below, but let's consider this an open discussion thread for anyone interested in this topic to share whatever partially-baked thoughts they may have.


To begin, I want to note that I think WebIDL has worked out surprisingly well for our purposes, given that it was designed for a language ecosystem and set of constraints that are different to what we're doing here. It certainly helped with the goal of getting up and running quickly! But we have observed a few pain-points in practice:

  • Some of the syntax is awkward for our purposes, such as having to declare an empty namespace to name the component itself, like namespace example {};.
  • The syntax is quite constraining; we noticed this most recently when trying to implement tagged enums, where I think the fact that we found a decent syntax is a really lucky coincidence.
  • Error reporting is quite poor. If you make a typo in your .udl file then uniffi will tell you the equivalent of "there was an error somewhere in your .udl file; good luck!".
    • There are probably things we could do in the tool itself to make this error reporting better, but I also have the impression that the weedle crate is (quite justifiably!) designed more for working with known-good .webidl files than for helping its users author new ones.

There are probably more pain-points, so please feel free to list additional ones below!

So what should the next steps be for the "defining the component interface" part of this tool? We've discussed a few main options in the past:

  • Continue using WebIDL, investing more in the developer experience by e.g. providing better error reporting.
  • Making a custom IDL that maps more closely to the underlying concepts of a UniFFI component interface.
  • Getting rid of .udl files in favour of something with Rust macros, in the style of wasm-bindgen.

Love or hate any of those suggestions? Please share below! There may also be other approaches that we haven't noticed yet, and I'd love to hear suggestions for those as well.

┆Issue is synchronized with this Jira Task

@data-sync-user
Copy link
Collaborator Author

➤ Ryan Kelly commented:

Getting rid of .udl files in favour of something with Rust macros, in the style of wasm-bindgen.

Macros are personally very interesting to me, so I've been lightly messing around with what a macro-based syntax for uniffi might look like. Some early thoughts are visible in mozilla/uniffi-rs#386 but I hope to have more considered comments on this option sometime soon.

@data-sync-user
Copy link
Collaborator Author

➤ Dan Mosedale commented:

An interesting (possibly minor) advantage of the keeping WebIDL for UDL is that it forces us to align reasonably well with real WebIDL, which is required for the current geckoJS backend implementation to work.

@data-sync-user
Copy link
Collaborator Author

➤ Tarik Eshaq commented:

I'mma hop on this since I saw the invitation on Element 😛 I haven't been following too closely how the tool has grown over the past few months, but..

There are a few quick naive thoughts hovering in my brain regarding taking the Marcos approach:

  • I think it's a great idea for the scaffolding to fit naturally with wasm_bindgen and other tools, will save us a lot of awkwardness. (and also keeps us within reach to WASI in case that ever becomes viable)
  • The bindings might be a pain, the idea in Do not return over-scoped OAuth Access Tokens #386 of generating a .udl file seems like it might help, but we might need to be a little careful in the long run to make sure files are up to date (still better consistency than manually writing the idl though!)
  • I can't really get to my brain how it'll work, but I wonder if there's a scenario where we get to have no idl at all, the bindgen tool would have to somehow use the rust code directly, yuk... uhh anywho I'm just thinking out loud

I don't have too much recent context to add much more (especially on how the gecko bindings went), but I'm glad we're revisiting this, I remember defining error handling was a little forced onto the idl.
Alrighty, enough procrastination, back to school 🙃

@data-sync-user
Copy link
Collaborator Author

➤ Ryan Kelly commented:

tarikeshaq oh hi! Thanks for hopping in! 😁

I can't really get to my brain how it'll work, but I wonder if there's a scenario where we get to have no idl at all,
the bindgen tool would have to somehow use the rust code directly, yuk... uhh anywho I'm just thinking out loud

Well, without wanting to give away any spoilers, I actually have this working in a branch as an experiment and am surprisingly happy with the consumer experience so far; ref the fxa-client-features branch ( https://github.com/mozilla/uniffi-rs/compare/fxa-client-features ) if you're curious about the details.

I hope to spend a few down cycles on polishing that up and presenting a more thorough proposal within the next week.

@data-sync-user
Copy link
Collaborator Author

➤ Tarik Eshaq commented:

Holy cow... syn is so powerful. I'm super curious how that will grow, I'll definitely be lurking 🙄

@data-sync-user
Copy link
Collaborator Author

➤ Ryan Kelly commented:

As hinted above, I've been messing around with procedural macros and have the start of an idea that I quite like. I've pushed it here for early visibility:

mozilla/uniffi-rs#416

I'll write up some thoughts in the present issue for consideration, but I want to stress, I'm not wedded to any of the ideas here. I've been messing around with it because it's fun, and I think it seems promising, but there may be other better ideas to pursue here as well.

So, the idea in the above draft PR is to lean very heavily into defining the interface through a procedural macro. Here's a strawfox of what you might write in a uniffi-ed crate with this new approach:

#[uniffi_macros::declare_interface]
mod math {
pub fn add(a: u32, b: u32) -> u32 {
a + b
}
};This is a procedural macro applied to an inline sub-module, and the contents of the inline sub-module directly implement the desired component interface. There's no separate declaration of the interface, you just implement it directly in Rust code inside the decorated sub-module. When you compile this as a Rust crate, it produces the same .so artifact that you'd get out of UniFFI today, with a pub extern "C" function for add that can be linked to by the foreign language bindings.

Thinking specifically about writing the Rust code here, this approach as several advantages:

  • You only specify the interface once, right there as part of actually implementing it.
  • You don't have to have a mental model of how the interface maps onto your Rust code, it's just the Rust code.
  • If you make an error, it can be reported inline in the Rust code via Rust's excellent error mechanisms.
  • It's kind of a bit weird to have to put it in a submodule, but not too bad...

The putting-it-in-a-submodule thing is actually quite important, because it means that UniFFI can see the entire interface at once, just like it can today when parsing from a separate file. This lets us perform important consistency checks, and also lets us insert runtime footgun protection such as the "checksum" that we put into all the FFI symbol names.

When it comes time to generate the foreign-language bindings, there are a couple of options. The simplest would be to run uniffi-bindgen ./src/lib.rs rather than uniffi-bindgen ./src/module.udl

A different approach, which I think feels better would be to operate on the crate as a whole rather than on a specific Rust file. I'm imagining a command like cargo uniffi generate -l kotlin that would inspect the current crate, parse the Rust code to determine the component interface, then generate the corresponding kotlin bindings. Again, I can't quite put my finder on why this feels better, but it does.

So, a summary of what a possible future could look like:

$> # 1) write the code for your crate using the uniffi macros
$> cat ./src/lib.rs

#[uniffi_macros::declare_interface]
mod math {
pub fn add(a: u32, b: u32) -> u32 {
a + b
}
};

$>
$> # 2) build the crate to produce a .so containing the FFI
$> cargo build
$> ls ./target/debug | grep ".so$"
libexample.so
$>
$> # 3) generate foreign-language bindings from the crate
$> cargo uniffi generate -l kotlin
$> find ./target/debug -name "*.kt"
./target/debug/uniffi/example/example.kt
$>One downside of this approach, of course, is that it's a lot of work! We'd need to build and maintain a proc macro, we'd need to figure out how to map all the bits of UniFFI interface onto corresponding Rust syntax, we'd need to figure out how to give a good developer experience when the user tries to use a bit of Rust syntax that we don't support, and so on.

My experiments in mozilla/uniffi-rs#416 have so far been pretty re-assuring in this regard. Rust's macro ecosystem and syn in particular are pretty powerful, and a first version of the above actually wouldn't feel too different to build than the thing we've been building already - basically, we'd just be mapping syn parse nodes onto bits of a ComponentInterface definition instead of mapping weedle parse nodes.

I plan to keep messing around with this approach in some background cycles. It's not urgent, but it's interesting. Any and all feedback is most welcome!

@data-sync-user
Copy link
Collaborator Author

➤ jhugman commented:

I have been holding off on this thread. I'm not sure I'm convinced by the following argument as a deal breaker for a proc_macro approach, but I wanted to put it down anyway.

I, for one, will mourn the passing of the canonical IDL file.

The IDL file is a concise description of the API that is implemented in Rust, but used by higher level language programmers: iOS, android or desktop developers. It is, or could be a lingua franca for each of the engineering stakeholders in the system.

At its best, it is the common document with which these devs with disparate skill sets can gesture at, discuss and request changes: it is a communication medium for development teams.

A mobile engineer implementing a new feature can succinctly request a new method, new attribute, new parameter, without having to understand Rust.

A mobile engineer and a desktop engineer can design an API together then find a Rust engineer to implement it.

Without it, the Rust engineer gets to mediate between the needs Desktop and iOS and Android and Rust, design the API that they won't use, and fielding feature requests written for a variety of languages.

I'll add more thoughts and reasonings in another comment.

@data-sync-user
Copy link
Collaborator Author

➤ Mark Hammond commented:

I really like this, and will happily dance on the grave of the UDL file 😉 I do see James's point, but the risk with the .udl that I see is that for the developers of the component it's not really canonical, but is for the consumers, which means the consumer tends to lose. Over time, ISTM that the rust developers are far more likely to add interesting notes, comments or warnings into the .rs file and not the udl file. We can already see this in nimbus - reset_telemetry_identifiers has subtly different docstrings in the udl vs the rust and it's not clear they have different audiences.

That, combined with good tooling, should mean the mobile consumer never needs to read the .rs file, just the generated documentation. Maybe that's wishful thinking though and assumes projects have great CI and tooling (ie, if mobile consumers still end up reading the .rs file instead of the generated code, we've screwed up - but that doesn't help those poor consumers)

If Jame's felt really strong about this, I guess it would be an argument for keeping the UDL generator in that PR! I'm inclined to lean against that though.

When it comes time to generate the foreign-language bindings, there are a couple of options.

That struck me as very odd too - I was going to comment in the PR but then realized it's really just the status-quo. Specifically, using gradle to generate stuff when everything is so tightly bound here seemed somehow like an opportunity for stuff to get out of sync, but I've no helpful suggestions.

@data-sync-user
Copy link
Collaborator Author

➤ Ryan Kelly commented:

At its best, it is the common document with which these devs with disparate skill sets can gesture at,
discuss and request changes: it is a communication medium for development teams.

I like this framing at a high level, FWIW.

As a concrete example, one of the things I don't totally love about the way wasm-bindgen works is that you end up with a bunch of Rust code with some annotations sprinkled in amongst it, and it's hard to see what the actual resulting API surface will be. I do agree that there's value in being able to reason about the API surface as a concrete artifact, somewhat independent of the implementation details.

(Without expressing it in quite those concrete terms, I think this is partly what motivated my use of an internal crate to hide as many implementation details as possible over in #3876 - to make it as easy as possible to look at lib.rs and see just the exposed API surface, nothing else).

I'll chew on this a bit, as I feel like there must be a way for us to have our cake and eat it too here somewhere...

@data-sync-user
Copy link
Collaborator Author

➤ jhugman commented:

If Jame's felt really strong about this, I guess it would be an argument for keeping the UDL generator i

TBC: I'm not arguing for any particular IDL, just that the work flow include a hand-written/hand-writable IDL, and that a change in the IDL file should be reflected in the rust code (as all the bindings).

This could involve, say, a proc_macro consuming the a IDL file to generate structs, traits (instead of impls) and enums. A change in the IDL of an object (in UDL, an interface) would cause a trait to change, and a compile error in Rust to help the Rust dev find what needs doing.

How strongly do I feel about this? The upper bound is no stronger than: "I'll feel sad if we don't do this"; I'm also mindful of Rust devs who want to publish their crates on Carthage or Maven to not find this disgusting.

@data-sync-user
Copy link
Collaborator Author

➤ Mark Hammond commented:

While I see your point, I remain unconvinced. IIUC, we are discussing 2 main options:

  • We can come up with sane annotations to the actual rust implementation, from which we can generate high-quality documentation.
  • We can force the maintainer of the rust component to make the same, redundant change in 2 places, and that one of those places is entirely artificial and targeted at non-rust devs, because generated docs are too simplistic, but the canonical rust code is too complex.

To put it another way - I think artificially forcing a UDL is doing a dis-service to both the rust maintainer ("even though we could generate this, it's for your own good that you must hand-write this twice") and the non-rust consumer ("we've designed this simplistic, artificial language just for you, because you wouldn't understand the alternative")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants