-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add macros for locale/langid #75
Comments
Pending #43 |
Can you remind me of the details of the benefits? (Was it compile-time static constants with dead code elimination for Rust users?) Also, what are the tradeoffs? Proc macros sound similar to Scala/Haskell macros, although I haven't written them. Are there Rust style guides or rules of thumb about them? I think wisdom from Lisp macros should transfer over. Basically, the gist is don't write them unless you absolutely have to. (A concise and more nuanced guide is Ch 8 of On Lisp.) Along the spectrum of data<->fns<->macros, you tend trade off simplicity for power. Ex: macros can cause "macro-contagion" b/c they're not composable b/c they're not 1st-class data. Most common successful uses of macros seem to be either building up low level language constructs (for / while / println) or creating a streamlined sugary top layer for users of a library. For ICU4X, could the result of the proc macros for locale cause the corresponding WASM binary to be larger, for example? |
the proc macro here would compile down to a single constant, so, no.
They're basically transforms written in rust code that transform rust token trees. I don't think the macro contagion problem will happen here, this macro is essentially being written like a |
I don't think so.
Sure. They basically are an elegant and complete replacement for the current ICU's "Static Public Member Functions". Basically, if you need to encode in your code let x = Locale {
language: Language::from_str("en").unwrap(),
script: None,
region: Region::from_str("US").unwrap(),
variants: vec![]
}; but then you still pay for parsing of the That's what proc_macro is doing for you. It encapsulates the code that takes a string let x = locale!("en-US");
let lang = language!("en");
let region = region!("us"); // it'll canonicalize to US at build time
let broken = locale!("213d1"); // it'll fail at compile time
Today there are two - you need to expose some constructor that takes a precomputed internal value, and you need to play around to make the proc macros work on stable Rust. The latter is getting resolved now and in 12 weeks will be in stable, which means that all we need is a pair of methods that can be used by the macro to "get the raw data" and "create from raw data" without any computations. Here's the code I use today: https://github.com/zbraniecki/unic-locale/blob/master/unic-langid-macros-impl/src/lib.rs#L65-L108 I have a patch that adds the necessary In result you can skip them when building bindings to other languages and they don't need any additional binary size in case you don't use them. |
Ah, ok, thanks for the explanation. That makes sense -- since macro expansion happens pre-compile, you can use that to insert code to efficiently parse known valid locale string literals (without cluttering the source), and that makes it into the compiled code so that runtime is reduced. I guess that means that there is a small tradeoff between extra code size and performance, but maybe it is negligible. I wonder how often in i18n algorithms do we need to refer specifically to particular locales. |
Since I released It also allows you to do very performant comparisons like this: if langid.language == language!("en") {
}
Where do you see the extra code size coming? |
Nevermind, doesn't look like extra code, I didn't properly understand that the quoted values were being saved up for the single return expression in the above link. |
I have the PR in works, but we're waiting for rust-lang/rust#49146 which will be released on August 27 with rust 1.46 |
One of the features of
unic-locale
is that it allows for "free" encoding of language identifiers, locales and subtags thanks to proc macros.In the current model it's quite quirky, and I didn't want to include it in the initial landing, but for that code to work, we only need to add two methods per subtag, and at least one of them doesn't have to be public - they just need to make it possible to create a subtag from a pre-computed u64/u32 (which is what the proc macro will do).
The good news is that Rust is about to stabilize proc macros (targeting Rust 1.45) which will allow us to get this feature without multi-crate hacks required before.
The text was updated successfully, but these errors were encountered: