Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Implementation of fmt::prepare. Implements #618 (and probably #613) #949

Closed
wants to merge 3 commits into from

Conversation

stryku
Copy link
Contributor

@stryku stryku commented Nov 27, 2018

Foreword

Hello (: First of, this thing is huge. Please, take your time while reviewing it. I'm open for a discussion and aware that this overview doesn't explain everything, so don't hesitate to ask questions, I'll try to explain why I did this in such way, that in other way etc.

Abstract

This pull request's purpose is to implement a fmt::prepare<>() function and all functionality around it.

Problems

  1. Format should be prepared once, at creation. Later on, it should just format user arguments.
  2. If format string can be handled at compile time, format should be prepared at compile time.
  3. User should be able to declare prepared format as a e.g. class member.
  4. User should be able to do with prepared format whatever he wants (copy, move etc.).
  5. Because of above, prepared format can't work with raw string views, thus unlike current fmt features, it is statefull.
  6. Prepared format is statefull, because of that it needs to be allocator-aware.

TL;DR solutions

  1. Indeed, format is prepared at creation - in ctor.
  2. If FMT_USE_CONSTEXPR is enabled and fmt::prepare detects that passed format is_compile_string, then format will be prepared at compile time.
  3. User can declare format as a member, using fmt::prepared_format<> typedef/alias.
  4. Prepared format is copyable and movable.
  5. Prepared format keeps a copy of passed format string as a std::basic_string<>.
  6. Prepared format has two members that the user (who wants to provide his own allocators) needs to know about: a copy of the format string and something that I call prepared parts. Prepared parts are parts of the format, that format was able to prepare at a creation time (e.g. in "My name is {0}." format, there are three parts: text "My name is ", argument id 0, text "."). User is able to provide his own types for the format and prepared parts container via a fmt::basic_prepared_format<> (I'll talk about it later).

API

New API

format_part<> class

template <typename Char>
class format_part;

A class that represents a format prepared part. It's not internal, to give user possibility to pass his own parts container type (see below).

parts_container<> typedef

template <typename Char, typename Container = std::vector<format_part<Char>>
class parts_container;

A typedef of a class that is used as a parts container by prepared format. User can provide his own underlying container type (in, of course, the second parameter).

basic_prepared_format<> typedef

template <typename Format, typename PartsContainer, typename... Args>
using basic_prepared_format = /* ... */;

A typedef of a class that represents a fully customizable prepared format. User can provide his own Format and PartsContainer types.

prepared_format<> typedef

template <typename... Args>
using prepared_format = basic_prepared_format<std::string, parts_container<char>, Args...>;

A simple alias for basic_prepared_format with default Format and PartsContainer types.

wprepared_format<> typedef

template <typename... Args>
using wprepared_format = basic_prepared_format<std::wstring, parts_container<wchar_t>, Args...>;

Same as above, but for wide strings.

prepare<>()

template <typename... Args, typename Format>
prepare(const Format& format_str)

A function that returns a prepared format object. Args are the argument types that prepared format accepts in format functions.

Prepared formats created by all of the above ways, provide the same functionality (same as plain fmt::format*() functions):
-format()
-format_to()
-format_to_n()
-formatted_size()

Tests

There was a need to a little refactor them. I have had to change tests strictly related to formatting, to be able to test multiple ways of formatting. Those tests are moved to a new file: common-format-test.cc, and they test the old fmt::format() and two versions of fmt::prepare(): one with runtime format string passed and one with compiletime format string.

In common-format-test I've used gtest's TYPED_TESTs. There are three wrappers:
-FormatFunctionWrapper: simply call fmt::format*().
-RuntimePreparedFormatWrapper: explicitly convert passed format string to a runtime format (which is std::basic_string, with proper character type). Then, create a prepared format using the shiny new fmt::prepare(). The last step is to format string using prepared format's format() method.
-CompiletimePreparedFormatWrapper: this one is enabled only if FMT_USE_CONSTEXPR is enabled. It prepares format (using compiletime format string) and then, formats stuff using prepared format's format() method.

As you can see, in almost every test, format string is wrapped by the TEST_FMT_STRINGFMT. That's because I couldn't think of a better, generic way to pass compiletime format string to wrappers. I wanted to do this, to tests fmt::prepare() with compiletime format string.

[DONE](Now I realized, that I probably should create one more wrapper: RuntimeFormatFunctionWrapper (and rename FormatFunctionWrapper to CompiletimeFormatFunctionWrapper) that would explicitly convert a compiletime string to a runtime one and test fmt::format() against it. What do you think?)

Documentation

Didn't write it, simply because a lot can be changed during the review. I'll provide it as a last step of this pull request.

Hacks (or not)

basic_format_context::prepared_specs* member and dynamic_formatter.

basic_format_context has a new prepared_specs* member. It is set, only in internal::prepared_format::vformat_to() method. Then, it's used by dynamic_formatter to check whether it needs to parse specification, or it is already parsed and ready to use.
Actually, this workaround is needed for only one of the tests, the one with variant:

TYPED_TEST(RuntimeFormattersTest, DynamicFormatter) {
    auto num = variant(42);
    auto str = variant("foo");
    /* ... */
    EXPECT_EQ(" 42 foo ",
              TypeParam::format(TEST_FMT_STRING("{:{}} {:{}}"), num, 3, str, 4));
}

It fails with old approach because:
-when using fmt::format, automatic argument id is tracked by a parse context. Later on, dynamic formatter gets that id from the parsed context.
-when using fmt::prepare, all argument ids are tracked by the prepared format. Specification is parsed and prepared, long before actual formatting. Parse context doesn't track the id anymore, so when dynamic formatter was asking for it, parse context returned a corrupted id. Now, this prepared specification is handled by prepared format and set in basic_format_context::prepared_specs*. If dynamic_formatter detects that it can use an already prepared specs, it doesn't bother the parse context.

An alternative would be to create a new context e.g. prepared_format_context, that would have this member. Then in dynamic_formatter, we would overload format method to handle the new context. I decided to not do that (at least in scope of this pull request) because it would require a refactor, to pass format context template parameter in a couple (a lot?) of places. This pull request is already big, even without this change.

Unrestricted unions

These are the unions with e.g. non-POD types. I needed them in format_part and changed arg_ref.
The simplest solution when they are not supported was to simply use a structure instead. I agree that it is not the best solution but unrestricted unions are supported since GCC 4.6, Clang 3.3 and Visual Studio 15 (MSVS >= 1900), so we support a couple of really old compilers that doesn't support that.
An alternative would be to use PODs in all unions and implement separate class-wrappers, that would implement needed methods handling those PODs. This would introduce a fine amount of new code (e.g. I'd need to introduce a new POD type for parsed_format_specs + a wrapper with methods for it, and more..). That's why I went this way.

Other

basic_prepared_format<>

This typedef gives user a possibility to provide his own types for held stuff. Thanks to that, we're fully allocator-aware. User can provide:
-A format string type.
-A parts container type.

Parts container needs to fulfill the PartsContainer concept, which is:

class {
public:
    typedef /*held format part type */ format_part_type;
    void add(format_part_type part); // Adds part to the end of already held parts.
    void substitute_last(format_part_type part); // Substitutes part at the end of held parts with the provided one. fmt lib cares of not calling it on a empty container.
    format_part_type last(); // Returns part from end of held parts.
    /*format_part_type iterator*/ begin(); // Returns begin of the held parts.
    /*format_part_type iterator*/ end(); // Returns end of the held parts.
};

Concept is checked at a compile-time by the parts_container_concept_check struct.

parts_container_concept_check (prepare.h)

There is a lot of boilerplate. It could be done with a macro, but I decided that macros for such stuff are yuck.
[DONE] (I just realized that I probably should add checks for default, move and copy ctors there. What do you think?)

prepare.h

I decided to create a new file for features related to fmt::prepare(). Thanks to that, user can avoid compilation of heavy templates when he doesn't need the fmt::prepare().

@stryku
Copy link
Contributor Author

stryku commented Nov 28, 2018

I'll handle failed builds ASAP + will run benchmarks to see how my implementation looks among other ways of formatting.

@stryku stryku changed the title Implementation of fmt::prepare. Implements #618 (and probably #613) [WIP] Implementation of fmt::prepare. Implements #618 (and probably #613) Nov 28, 2018
@vitaut
Copy link
Contributor

vitaut commented Nov 28, 2018

Wow, that's an impressive PR, thank you for working on it!

I'll take a look in details but first a few quick questions regarding the design:

Prepared format is statefull, because of that it needs to be allocator-aware.

Could you elaborate how statefulness relates to allocator-awareness? Is it because you use a container (string) to store some of the state?

Prepared format keeps a copy of passed format string as a std::basic_string<>.

How does it work with constexpr?

Now it returns formatter_parse_result, which is basically a bool (whether it was parsed successfully) and the iterator (to where the formatter has stopped parsing).

Why and how does it work with existing parsers that report errors via exceptions?

@stryku
Copy link
Contributor Author

stryku commented Nov 30, 2018

Could you elaborate how statefulness relates to allocator-awareness? Is it because you use a container (string) to store some of the state?

Exactly. Prepared format has two members that can allocate memory - a format and a parts container. We need to provide a way to specify allocators that are used to allocate memory for these members.

How does it work with constexpr?

Sorry, I forgot to mention the constexpr case. When a format's type is the fmt's compile time string, the stored member is an instance of the struct that wraps the string literal (the struct, returned from the FMT_STRING's lambda), otherwise it's a std::basic_string<>.
I just realized, that there is probably an issue in my implementation. For runtime format strings, I always try to convert them to a std::basic_string<>. When user would pass his own format string type, this would mostly like fail during a compilation. I will take a look and write it properly.

Why and how does it work with existing parsers that report errors via exceptions?

Well, while writing an answer to this, I realized that I can take advantage of a parse_context to check whether a specification was successfully parsed by a custom formatter. Thanks to that, the change in the public API won't be needed. I'll do the changes and push a commit probably tomorrow.

@stryku
Copy link
Contributor Author

stryku commented Nov 30, 2018

I'll squash commits just before the merge. It'd be more convenient for me to keep them separated while working on this pr.

@vitaut
Copy link
Contributor

vitaut commented Dec 2, 2018

When user would pass his own format string type, this would mostly like fail during a compilation.

A user format string should be convertible to basic_string_view via to_string_view so you can use that to construct a basic_string.

the change in the public API won't be needed

Great! So the "Current API breaking changes" section is no longer relevant?

As you can see, in almost every test, format string is wrapped by the TEST_FMT_STRING.

That seems OK, but I it would be nice to have a shorter name to make tests more readable. Maybe just fmt(...)? We don't care about any name collisions in the tests.

(Now I realized, that I probably should create one more wrapper: RuntimeFormatFunctionWrapper (and rename FormatFunctionWrapper to CompiletimeFormatFunctionWrapper) that would explicitly convert a compiletime string to a runtime one and test fmt::format() against it. What do you think?)

Yes, please. Runtime format strings are the primary API and should be tested at least as much as the compile-time ones.

include/fmt/core.h Outdated Show resolved Hide resolved
include/fmt/color.h Show resolved Hide resolved
include/fmt/core.h Outdated Show resolved Hide resolved
include/fmt/format-inl.h Outdated Show resolved Hide resolved
include/fmt/format.h Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
@stryku
Copy link
Contributor Author

stryku commented Dec 2, 2018

A user format string should be convertible to basic_string_view via to_string_view so you can use that to construct a basic_string.

Didn't know that user format string needs to be a variation of a std::basic_string. Thanks for info (:
Anyway, I still think that we need to handle user's type of a format string, to handle cases when the string is not a simple std::basic_string<char/wchar_t> but e.g. a string with user provided allocator.
Actually I've already made changes to handle that. If passed format string's type is const Char* or const Char[], it is converted to a std::basic_string<Char>. In any other case, format is just std::moved all the way to the prepare_format's member. That handles above cases.

Great! So the "Current API breaking changes" section is no longer relevant?

Sure, done.

That seems OK, but I it would be nice to have a shorter name to make tests more readable. Maybe just fmt(...)?

Yes, please.

Will do.

include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
Copy link
Contributor

@vitaut vitaut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments.

include/fmt/core.h Outdated Show resolved Hide resolved
include/fmt/core.h Outdated Show resolved Hide resolved
include/fmt/format-inl.h Outdated Show resolved Hide resolved
include/fmt/format.h Show resolved Hide resolved
include/fmt/core.h Outdated Show resolved Hide resolved
include/fmt/format.h Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
Copy link
Contributor

@vitaut vitaut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have another pass over the changes after you address the current comments. In general, I think this is a good approach.

include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
@stryku
Copy link
Contributor Author

stryku commented Dec 9, 2018

I hope that I removed all whitespaces-only changes, sorry again for that. Also, sorry for taking so long to reply/make changes - I had tough week with a cold.

I know that I didn't reply for one comment, I won't be able to do it today because this topic needs a deeper discussion and I want to prepare my point of view. I'll try to do it tomorrow. Now I wanted to address as much as I could today.

include/fmt/core.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
Copy link
Contributor

@vitaut vitaut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for a slow reply, it's a busy time of year. I thought about this more and here are some things that IMHO need to be done:

  1. Remove prepared specs from basic_format_context. This will break dynamic formatter but it doesn't matter because custom formatting requires more work anyway. I think we should focus on core functionality here and add support for custom (including dynamic) formatters later.
  2. Merge prepared_format_specs into dynamic_specs (make dynamic_specs relocatable). This should simplify things a bit.

include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
include/fmt/format.h Outdated Show resolved Hide resolved
Implementation of fmt::prepare() function and features around it.
@stryku
Copy link
Contributor Author

stryku commented Jan 8, 2019

I squashed the commits and made changes as agreed:

  • fmt::prepare() doesn't support custom formatters (user types, arg joins etc). The prepared specs member was removed from the context (there is no static_assert implemented).
  • dynamic_format_specs uses string_view_metadata instead of string_value. Thanks to that, named argument reference is relocatable.
  • removed format_part_writer.

I'm aware that android build fails (not sure why gradle is not able to find Ninja). I'll handle this asap.

Copy link
Contributor

@vitaut vitaut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, I think we are getting close.

@@ -877,16 +881,18 @@ FMT_CONSTEXPR typename internal::result_of<Visitor(int)>::type
template <typename Char, typename ErrorHandler = internal::error_handler>
class basic_parse_context : private ErrorHandler {
private:
basic_string_view<Char> format_str_;
int next_arg_id_;
basic_string_view<Char> primary_format_str_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is primary_format_str_ needed? Shouldn't format_str_ be enough to resolve references?

Also the indent is a bit off.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIU, while parsing/formatting, basic_parse_context::advance_to() can be called, which calls format_str_.remove_prefix(), which 'invalidates' format_str_ and arg_ref::name (which is string_view_metadata) can not be created because it needs primary format string to calculate offset from the beginning of the format. Does that make sense, or I just messed up something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it is not a problem because we agreed to address custom formatting separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basic_parse_context::advance_to() is called only for custom formatters?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More or less. The whole context machinery is there to support extension mechanism (custom formatters).

@@ -1585,68 +1596,79 @@ class specs_setter {
basic_format_specs<Char> &specs_;
};

template <typename ErrorHandler> class numeric_specs_checker {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems no longer needed and can be integrated back into specs_checker (just revert specs_checker changes).

Copy link
Contributor Author

@stryku stryku Jan 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's still useful because it's used in two places: dynamic_specs_handler and prepared_format while formatting, to check already prepared specs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed the second use.

@vitaut
Copy link
Contributor

vitaut commented Jan 12, 2019

I merged the main part of this PR with some tweaks. The part that is not merged is test refactoring as I'm not sure that it's a good idea to type parameterize the tests. Will explain in more details a bit later. Thanks for working on this!

@vitaut vitaut closed this Jan 12, 2019
@vitaut
Copy link
Contributor

vitaut commented Jan 13, 2019

To elaborate on my comment about the tests, I think it would be nice to avoid type parametrization because

  1. It introduces another layer of calls which makes tests different from the usual way the API is used. This may potentially obscure bugs in the API and in general makes tests more complicated.

  2. We do not need to test the formatting logic multiple times through different APIs. This is why most of the tests are done through fmt::format. With prepare we only need to test the parts that behave differently, not all the possible format specifiers.

@stryku
Copy link
Contributor Author

stryku commented Jan 13, 2019

I am glad that the main part has been merged! Thanks for the time that you put into review.

Good points about the tests. I'll take a closer look on them, figure out what really needs to be tested and provide a pull request or create an issue to start a discussion.

@vitaut
Copy link
Contributor

vitaut commented May 12, 2019

@stryku, could you please review the updated CONTIBUTING document, particularly the part about licensing, and let me know if you agree with it being applied to your contributions to {fmt}? The library is likely to be relicensed (#1073) so I'm collecting approval from all earlier contributors. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants