Duplicate effort? #3

Closed
jan-hudec opened this issue Jun 14, 2016 · 9 comments

@jan-hudec

There is also https://github.com/endoli/message-format.rs with apparently similar goals.

@vitiral
Owner

vitiral commented Jun 14, 2016

That looks like a much more complicated format than the python/rust format function/macro. It looks to contain logic within the text itself (not just formatting logic, but actual logic).

The below is quite different from rust format!

{count, plural,
  =0 {Your search had no results.}
  =1 {Your search had one result.}
  other {Your search had # results.}}

@vitiral vitiral closed this as completed Jun 14, 2016
@vitiral
Owner

vitiral commented Jun 14, 2016

I like their intelligent use of macros and the precompiling step -- I will be sure to add these things to improve speed for some use cases.

@jan-hudec
Author

That looks like a much more complicated format than the python/rust format function/macro. It looks to contain logic within the text itself (not just formatting logic, but actual logic).

It is still a strict superset of the python/rust format function/macro. Nobody forces anybody to use the select and plural formats. They come in handy sometimes though.

I like their intelligent use of macros

Macros are icing on the top. Most important is to get rid of the HashMap, which is unwieldy and not particularly efficient either. Made worse by the to_string()s all over the place — all of this could use &'a str, it just needs adding <'a> in a couple of places.

I would personally prefer the underlying syntax like:

strfmt("hi, my name is {name} and I am a {job}!").arg("name", "Bob").arg("job", "python developer").to_string()

(or .write(out) at the end). They didn't go exactly with that, putting the chain of arg calls in a parameter, but that's not a big difference.
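A minimal sketch of what such a chained API could look like, assuming a hypothetical `Strfmt` builder that borrows the pattern and every argument as `&'a` references. None of these names exist in strfmt or message-format; only plain `{name}` placeholders are handled, and a `Vec` of borrowed pairs stands in for the `HashMap`:

```rust
use std::fmt::{self, Display, Write};

/// Hypothetical builder: borrows the pattern and all arguments,
/// so no argument is converted to a `String` up front.
pub struct Strfmt<'a> {
    pattern: &'a str,
    args: Vec<(&'a str, &'a dyn Display)>,
}

/// Hypothetical entry point mirroring the call shape proposed above.
pub fn strfmt(pattern: &str) -> Strfmt<'_> {
    Strfmt { pattern, args: Vec::new() }
}

impl<'a> Strfmt<'a> {
    /// Adds one named argument and returns the builder for chaining.
    pub fn arg(mut self, name: &'a str, value: &'a dyn Display) -> Self {
        self.args.push((name, value));
        self
    }

    /// Writes the formatted text to any `fmt::Write` sink.
    /// Unclosed braces and unknown names are reported as `fmt::Error`.
    pub fn write<W: Write>(&self, out: &mut W) -> fmt::Result {
        let mut rest = self.pattern;
        while let Some(open) = rest.find('{') {
            out.write_str(&rest[..open])?;
            let after = &rest[open + 1..];
            let close = after.find('}').ok_or(fmt::Error)?;
            let name = &after[..close];
            let (_, value) = self
                .args
                .iter()
                .find(|(n, _)| *n == name)
                .ok_or(fmt::Error)?;
            write!(out, "{}", value)?;
            rest = &after[close + 1..];
        }
        out.write_str(rest)
    }
}

/// Implementing `Display` gives `.to_string()` for free.
impl<'a> Display for Strfmt<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.write(f)
    }
}
```

With this sketch, `strfmt("hi, my name is {name}!").arg("name", &"Bob").to_string()` yields the formatted string, and `.write(&mut out)` writes into an existing buffer instead.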

I proposed how to build a list on stack without allocation in endoli/message-format.rs#2, which still uses dynamic dispatch. There is a more refined version with all static dispatch at http://src.codes/typed-linked-lists.html, but I suspect the code increase due to having to generate the formatting function for each combination of parameters would be worse than the dynamic dispatch.
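For illustration only, here is one way a list on the stack with dynamic dispatch might be shaped. This is not the code from the linked proposal; the `Args` type and its methods are hypothetical. Each node borrows the previous one, so the list lives entirely in the caller's stack frame:

```rust
use std::fmt::Display;

/// One node of a borrowed argument list (hypothetical type).
/// Nodes point at the previous node, so no heap allocation is needed.
pub struct Args<'a> {
    name: &'a str,
    value: &'a dyn Display,
    prev: Option<&'a Args<'a>>,
}

impl<'a> Args<'a> {
    /// Starts a list with a single argument.
    pub fn new(name: &'a str, value: &'a dyn Display) -> Self {
        Args { name, value, prev: None }
    }

    /// Creates a new node that links back to `self`.
    pub fn arg(&'a self, name: &'a str, value: &'a dyn Display) -> Args<'a> {
        Args { name, value, prev: Some(self) }
    }

    /// Walks the list from the newest node backwards, looking for `name`.
    pub fn get(&self, name: &str) -> Option<&'a dyn Display> {
        let mut node = Some(self);
        while let Some(n) = node {
            if n.name == name {
                return Some(n.value);
            }
            node = n.prev;
        }
        None
    }
}
```

In this sketch every node has to be bound to a local (`let a = Args::new(..); let b = a.arg(..);`) or be built inside a single call expression, because a later node borrows the earlier one.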

precompiling step

While it is likely beneficial for the complex formats like plural and select, I wouldn't expect it to help with the simple ones all that much. Perhaps if you could precompile the argument formats, but you can't, because you don't know what types you'll actually get.

… except maybe you could. If you used the typed list idea, and made the parsing function generic in its complete type, you could. But it would also limit you to positional parameters, because Rust does not have a reasonable way to encode a name in the type (not yet, hopefully).

@vitiral
Owner

vitiral commented Jun 16, 2016

I'm not seeing how that language can handle format args like: {x:F<10.4E}. I've looked around a fair amount but I can't seem to find it in the spec. Can you link to that?

The format arg syntax would be nice for putting it in a macro, maybe it could be something like the constructor:

"hi, my name is {name} and I am a {job}! I have {cats} cats!"
    .fmt_arg_str("name", "Bob")
    .fmt_arg_str("job", "python developer")
    .fmt_arg_i64("cats", 42)
    .fmt_args()

This would allow you to build up all the arguments and format in one go, or create your formatters and format one at a time.

Perhaps if you could precompile the argument formats, but you can't, because you don't know what types you'll actually get.

I can easily precompile the argument formats; the onus is on the user of strfmt_map to handle the types correctly. I could easily accept a list of something like this as well:

enum Types {
    String(String),
    I64(i64),
    F64(f64),
}

I'm not sure what is the best option (I'm tempted to leave it to the user of strfmt_map to handle everything in multi-type situations).

Of course, the fmt_arg_* constructors could be used with the macro and types could be automatically inferred (I think...)
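A rough sketch of how the enum route could work, assuming `From` conversions are what lets the types be inferred at the call site. The `From` and `Display` impls are additions for illustration, not part of strfmt:

```rust
use std::fmt;

/// The closed set of argument types from the comment above.
#[derive(Debug, Clone, PartialEq)]
pub enum Types {
    String(String),
    I64(i64),
    F64(f64),
}

// Conversions so a generic `fmt_arg`-style function could accept
// `impl Into<Types>` and pick the variant automatically.
impl<'a> From<&'a str> for Types {
    fn from(s: &'a str) -> Self {
        Types::String(s.to_string())
    }
}

impl From<i64> for Types {
    fn from(n: i64) -> Self {
        Types::I64(n)
    }
}

impl From<f64> for Types {
    fn from(x: f64) -> Self {
        Types::F64(x)
    }
}

// Forwarding to the inner value's `Display` keeps width, alignment
// and precision flags from the format string working.
impl fmt::Display for Types {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Types::String(s) => fmt::Display::fmt(s, f),
            Types::I64(n) => fmt::Display::fmt(n, f),
            Types::F64(x) => fmt::Display::fmt(x, f),
        }
    }
}
```

The trade-off is the one raised later in the thread: the set of supported types is fixed by the library, so users cannot add their own.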

@jan-hudec
Author

I'm not seeing how that language can handle format args like: {x:F<10.4E}. I've looked around a fair amount but I can't seem to find it in the spec. Can you link to that?

Basically in ICU documentation the placeholder for argument is described to be, generally, of the form

simpleArg = '{' argNameOrNumber ',' argType [',' argStyle] '}'

The Rust version should combine the two, because dates, times and money amounts should all be represented by their dedicated types, not plain numbers, so there is no need for argType. But argStyle will be the pattern for the value. So the only difference is it won't be {x:F<10.4E} but {x,F<10.4E} (ok, I am not sure what the F is supposed to be here; I can see {x,<10.4E}).
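To make the grammar above concrete, here is a small sketch that splits a placeholder of the ICU `simpleArg` shape into its parts. The `SimpleArg` struct and `parse_simple_arg` function are hypothetical names, and the sketch ignores nested braces and ICU quoting rules:

```rust
/// The three parts of `'{' argNameOrNumber ',' argType [',' argStyle] '}'`.
#[derive(Debug, PartialEq)]
pub struct SimpleArg<'a> {
    pub name: &'a str,
    pub arg_type: &'a str,
    pub style: Option<&'a str>,
}

/// Splits one placeholder; returns `None` if it does not match the grammar.
pub fn parse_simple_arg(placeholder: &str) -> Option<SimpleArg<'_>> {
    let body = placeholder.strip_prefix('{')?.strip_suffix('}')?;
    // At most three parts, so the style itself may contain commas.
    let mut parts = body.splitn(3, ',');
    let name = parts.next()?.trim();
    let arg_type = parts.next()?.trim();
    // The style is kept verbatim; whitespace may be significant there.
    let style = parts.next();
    if name.is_empty() || arg_type.is_empty() {
        return None;
    }
    Some(SimpleArg { name, arg_type, style })
}
```

Dropping `argType`, as suggested above, would reduce this to a two-way split into name and style.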

"hi, my name is {name} and I am a {job}! I have {cats} cats!"
    .fmt_arg_str("name", "Bob")
    .fmt_arg_str("job", "python developer")
    .fmt_arg_i64("cats", 42)
    .fmt_args()

No, no, no. Just

"hi, my name is {name} and I am a {job}! I have {cats} cats!"
    .fmt_arg("name", "Bob")
    .fmt_arg("job", "python developer")
    .fmt_arg("cats", 42)
    .fmt_args()

There should be a trait that will work similarly to Display and related traits of std::fmt, except it needs to be just one and do complete interpretation of the pattern itself. Something like

trait Formattable {
    fn fmt(&self, fmt: &str, out: &mut std::fmt::Formatter) -> Result<(), SomeError>;
}

I am just writing such a trait in https://github.com/rust-locale/rust-locale/tree/next (it's not there yet; I am still working on it). However, that one will only support things relevant for localization, so only decimal format, and in the future possibly rule-based word format (so 42 would be formatted as "forty-two").
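A hedged sketch of how such a trait might be used, with two deviations from the signature above that are assumptions of this example: the output is a `&mut dyn std::fmt::Write` (a `Formatter` cannot be constructed directly by user code), and `SomeError` is replaced by a hypothetical `FormatError`. The `TimeOfDay` type and the pattern names are invented for illustration:

```rust
use std::fmt::{self, Write};

/// Hypothetical error type standing in for `SomeError`.
#[derive(Debug, PartialEq)]
pub enum FormatError {
    UnknownPattern(String),
    Write,
}

impl From<fmt::Error> for FormatError {
    fn from(_: fmt::Error) -> Self {
        FormatError::Write
    }
}

/// Each type interprets the whole pattern string itself.
pub trait Formattable {
    fn fmt(&self, pattern: &str, out: &mut dyn Write) -> Result<(), FormatError>;
}

impl Formattable for i64 {
    fn fmt(&self, pattern: &str, out: &mut dyn Write) -> Result<(), FormatError> {
        match pattern {
            "" => write!(out, "{}", self)?,
            "x" => write!(out, "{:x}", self)?,
            other => return Err(FormatError::UnknownPattern(other.to_string())),
        }
        Ok(())
    }
}

/// Hypothetical value type with patterns that make no sense for numbers.
pub struct TimeOfDay {
    pub hour: u8,
    pub minute: u8,
}

impl Formattable for TimeOfDay {
    fn fmt(&self, pattern: &str, out: &mut dyn Write) -> Result<(), FormatError> {
        match pattern {
            "short" => write!(out, "{}:{:02}", self.hour, self.minute)?,
            other => return Err(FormatError::UnknownPattern(other.to_string())),
        }
        Ok(())
    }
}
```

The point of the design is visible in the two `match` blocks: the same pattern text can be valid for one type and an error for another, so only the value's own impl can judge it.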

I'm not sure what is the best option (I'm tempted to leave it to the user of strfmt_map to handle everything in multi-type situations).

IMO the best option is to accept a trait.

In fact, I am planning to propose, for message-format, to add yet another level of genericity so it can be adapted to different traits.

@jan-hudec
Author

I can easily precompile the argument formats

How do you precompile them? A number format might look like {x:<10.4E} or just {x:05}. A time format might look like {x:jm} (that's a skeleton), {x:long} or {x:=HH:MM:ss+ZZZZ}. A monetary format might look like {x:intl} (but no precision; precision is given by the currency!) or maybe one day {x:short} (formats Money("USD", 2100000) as $2.1M; also note the input format). And hell knows what units might come up for, say, distance (so that the user can specify that they want 800*m printed as ½ mi(!), but 100*m as 330 ft).

So how do you want to interpret the pattern if you don't know the type yet?

Of course all this assumes that you make the type extensible. If you make an enum of supported types, you will know all possible formats. But I would consider such library basically useless.

@vitiral
Owner

vitiral commented Jun 16, 2016

simpleArg = '{' argNameOrNumber ',' argType [',' argStyle] '}'

According to this, I would think the format would be something like:

{x,E,F<10.4}

That is assuming that message-format could handle the exact same formatting options (which you say it can).

It would be nice if it could handle either. For instance, if it could handle python/rust fmt if there was a : after the identifier. I would think this is trivial to implement, although I don't know enough about message-format (is the : character used elsewhere in the spec?)

ok, I am not sure what the F is supposed to be here

It is the "fill character"

IMO the best option is to accept a trait.

If it accepted a trait, I would want it to default to being able to accept the Display trait when the custom trait isn't defined. Is that possible?

Overall I really like the idea of using the trait. I don't see why you are passing the full fmt str in though -- the creation of the Formatter should handle finding all the options and storing them in a struct-like format. It should not be the user's responsibility to deserialize and error check the format string!

@vitiral vitiral reopened this Jun 16, 2016
@jan-hudec
Author

That is assuming that message-format could handle the exact same formatting options (which you say it can).

Well, currently, they don't. However I am working on that for locale and will be integrating some message formatting with it. And while I feared the plural and choice formats may be too taxing on the translators, I was assured they come in handy. I suppose it depends on how good the translators you can afford are; you don't have to use them if you have less technical translators.

If it accepted a trait, I would want it to default to being able to accept the Display trait when the custom trait isn't defined. Is that possible?

I believe it requires rust-lang/rfcs#1210, which is tracked by rust-lang/rust#31844. That is unfortunately still open, but the main part implemented in rust-lang/rust#30652 already landed and, if I understand the annotation on github correctly, made it into 1.9.0.

With that, it is possible to impl<T> for T where T: Display and override it for the types that should have more advanced logic.
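Until an overridable blanket impl is available, one workaround is a newtype adapter that opts a `Display` value in explicitly. This is a different technique from the specialization approach described above, shown only as a sketch; `Formattable` here is a stand-in for the trait under discussion, and `ViaDisplay` and `Money` are hypothetical names:

```rust
use std::fmt::{self, Display, Write};

/// Stand-in for the pattern-aware trait discussed in this thread.
pub trait Formattable {
    fn format(&self, pattern: &str, out: &mut dyn Write) -> fmt::Result;
}

/// Newtype adapter: wraps any `Display` value and ignores the pattern.
pub struct ViaDisplay<T>(pub T);

impl<T: Display> Formattable for ViaDisplay<T> {
    fn format(&self, _pattern: &str, out: &mut dyn Write) -> fmt::Result {
        write!(out, "{}", self.0)
    }
}

/// Hypothetical type with its own pattern logic. The amount is stored in
/// minor units and assumed non-negative; two decimals are hard-coded here
/// for brevity rather than derived from the currency.
pub struct Money {
    pub currency: &'static str,
    pub cents: i64,
}

impl Formattable for Money {
    fn format(&self, pattern: &str, out: &mut dyn Write) -> fmt::Result {
        let (units, frac) = (self.cents / 100, self.cents % 100);
        match pattern {
            "intl" => write!(out, "{} {}.{:02}", self.currency, units, frac),
            _ => write!(out, "{}.{:02}", units, frac),
        }
    }
}
```

Because the adapter is a distinct type, its impl never overlaps with the custom one for `Money`, at the cost of the caller having to write `ViaDisplay(value)`.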

I don't see why you are passing the full fmt str in though -- the creation of the Formatter should handle finding all the options and storing them in a struct-like format. It should not be the user's responsibility to deserialize and error check the format string!

The reason is that different types need to recognize different format specifications. The simple [[ fill ] align ] [ width ] [ "." precision ] style works well for numbers (and money amounts and possibly quantities), but doesn't cut it for things like dates and times and possibly other more complex value types.

There could be some common handling for the fill and width, but at least there needs to be an option for more than one letter for the style.
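One possible shape for that common handling, sketched under the assumption that the library parses the shared `[[fill]align][width]["." precision]` prefix and hands whatever remains to the value's type as a multi-letter style. `CommonSpec` and `parse_common` are hypothetical names, and sign, `#` and `0` flags are not handled:

```rust
/// Shared layout options plus the type-specific remainder.
#[derive(Debug, PartialEq, Default)]
pub struct CommonSpec<'a> {
    pub fill: Option<char>,
    pub align: Option<char>, // '<', '^' or '>'
    pub width: Option<usize>,
    pub precision: Option<usize>,
    pub style: &'a str, // left for the value's type to interpret
}

/// Parses the common prefix of a format spec; never fails, because
/// anything it does not recognize ends up in `style`.
pub fn parse_common(spec: &str) -> CommonSpec<'_> {
    let is_align = |c: char| matches!(c, '<' | '^' | '>');
    let mut out = CommonSpec::default();
    let mut rest = spec;

    // A fill character is only present if an alignment follows it.
    let mut chars = rest.chars();
    match (chars.next(), chars.next()) {
        (Some(f), Some(a)) if is_align(a) => {
            out.fill = Some(f);
            out.align = Some(a);
            rest = &rest[f.len_utf8() + a.len_utf8()..];
        }
        (Some(a), _) if is_align(a) => {
            out.align = Some(a);
            rest = &rest[a.len_utf8()..];
        }
        _ => {}
    }

    let (width, after) = take_number(rest);
    out.width = width;
    rest = after;

    if let Some(after_dot) = rest.strip_prefix('.') {
        let (precision, after) = take_number(after_dot);
        if precision.is_some() {
            out.precision = precision;
            rest = after;
        }
    }

    out.style = rest;
    out
}

/// Splits leading ASCII digits off `s` and parses them.
fn take_number(s: &str) -> (Option<usize>, &str) {
    let end = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    (s[..end].parse().ok(), &s[end..])
}
```

With this split, `F<10.4E` yields fill `F`, left alignment, width 10, precision 4 and style `E`, while a skeleton such as `jm` passes through untouched as the style.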

Well, I do see another option for dates and times. I can imagine format like "{d:H}:{d:M}:{d:s}", where the value is repeated and each format extracts specific part of it. But it does look a bit unwieldy with the braces and repetition of the parameter name.

@vitiral
Owner

vitiral commented Jun 16, 2016

With that, it is possible to impl for T where T: Display and override it for the types that should have more advanced logic.

Ah, yes I think you might be right. I followed that issue but I didn't think it could be used for this.

I think the main problem is that Display.fmt uses the fmt::Formatter struct, which is not implemented as a Trait (as far as I can tell). This will make it very difficult to use either I would think.

Possibly if we could replace Formatter with a trait in the stdlib many of our desires would be possible.

The reason is that different types need to recognize different format specifications. (...)

From my experience, not properly deserializing data is the root of all sorts of bugs and problems. I think that not deserializing the format string would be a grave mistake.

I could discuss with you some possibilities of how message-format could be structured that allows you to do this (things like having embedded structs come to mind).

Issue point

Outside of that, I think the goals of message-format and strfmt are very different. In the places that they converge, I think they should -- and we should work together to create a unified Formatter trait that can be used conveniently by authors who already implement Display. However, message-format covers a much broader scope and does so in slightly different syntax (at least currently). I think its goals are valid, I just don't think they are the same as strfmt's goals.

We should open separate issues for implementing the various functionality that we have discussed in this thread. However, I do not think the topic of this discussion is correct: while message-format and strfmt may partially overlap, strfmt aims to be a wholly simpler format that is fully compliant with the python format string. If it ever comes to be that strfmt's functionality can be completely handled by message-format, then we should merge them, and strfmt can just be a layer on top of message-format. Until then, we should only merge them in the areas where they can work together (for example, we could make a formatlib for storing this Trait).

Thanks. I will be opening some issues to reference these, and you are welcome to as well!
