Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib: cbprintf: add support for deferred formatting #30675

Closed
wants to merge 5 commits into from

Conversation

pabigot
Copy link
Collaborator

@pabigot pabigot commented Dec 12, 2020

In applications like logging the call site where arguments to formatting are available may not be suitable for performing the formatting, e.g. when the output operation can sleep. Add API that supports capturing transient formatting data into a buffer.
Also API that produces the output later using the packaged data.

These changes increase stack requirements on the vprintk path from 152 bytes to 208 bytes on ARM, in part because code that could formerly be inlined must now be a separate function as it can be invoked from multiple locations. Other architectures have different magnitude changes, particularly SPARC which incurs a 96 byte cost for every nested call.

Storing the data could be changed fairly easily to use a ring buffer if that's preferred.

Relates to #30353 (comment).

@github-actions github-actions bot added area: API Changes to public APIs area: Tests Issues related to a particular existing or missing test labels Dec 12, 2020
@pabigot pabigot force-pushed the nordic/20201211b branch 4 times, most recently from 8f1d04b to 13cb193 Compare December 13, 2020 15:38
@pabigot pabigot force-pushed the nordic/20201211b branch 3 times, most recently from e530096 to d7d9b71 Compare December 13, 2020 23:49
@pabigot pabigot changed the title Nordic/20201211b lib: cbprintf: add support for deferred formatting Dec 13, 2020
@pabigot pabigot requested a review from nordic-krch December 14, 2020 00:03
@nordic-krch
Copy link
Contributor

@pabigot thanks for posting this, looks very good. I think that it could be extended with 2 macros:

  • macro for deciding if static package generation is possible. It will basically check if there are any char * arguments. Additionally, it should have skip argument which will indicate how many char * are still OK. This will be applied in cases when log message contains known, fixed strings like function name. CBPRINTF_CAN_STATIC_PACK(skip, ...)
  • macro for statically building a package CBPRINTF_STATIC_PACK(buf, len, ...). Same as in function version, if buf is null only length is calculated.

I can look into static packing as i think that it will give significant performance boost. Currently it takes 22us to pack "test %d %d" on nrf52, expecting to be almost instant with static packing.

@pabigot
Copy link
Collaborator Author

pabigot commented Dec 14, 2020

I don't understand how static package generation could work since you have to know what the formatting is going to do to determine what value gets packed.

Also the format specifier itself is a const char* which could point to a user-provided format string, so checking for format arguments that are strings doesn't cover all potential legal uses.

But go ahead and put together what you have in mind so it's more clear.

@nordic-krch
Copy link
Contributor

@pabigot please take a look at https://github.com/nordic-krch/zephyr/commits/nordic/20201211b
I've added macro: nordic-krch@d32f351
and modified the test to validate configuration with static packaging: nordic-krch@932f8a2
with minor modifications to cbprintf_complete.c test is passing for all configurations.

@pabigot
Copy link
Collaborator Author

pabigot commented Dec 16, 2020

Thanks. I've fixed the swapped ldbl/dbl packing in this PR.

I've also adopted your fix for wint_t, which was indeed incorrectly handled, but modified it slightly to do the right thing for pull_va_args() and to treat wint_t as signed (because char is documented and treated as signed in the current implementation; for char it doesn't matter, and I can't convince myself it matters for wchar_t either).

Two requests for when you move to a PR for the log infrastructure:

  • Add test cases for packing float arguments. These are unnecessary currently because all arguments go through standard argument promotions, but since the _Generic() approach can tell the difference between 1.2F and 1.2 it needs to be tested.
  • Related, when static packing can be performed please add a test that cbprintf_package() actually does produce exactly the same buffer, since that's the documented behavior. It would be nice if this was automatically done when USE_PACKAGED is selected rather than adding a whole new test axis (i.e. just check whether static packaging would work, and if so verify the cbprintf_package() results against it with memcmp).

@pabigot
Copy link
Collaborator Author

pabigot commented Jan 5, 2021

Updated with some fixes from #31102 and rebased after #30665 merged.

@pabigot
Copy link
Collaborator Author

pabigot commented Jan 7, 2021

Rebased on #31133; handling wchar_t cross-platform was nastier than I'd thought.

@zephyrbot zephyrbot added the area: Base OS Base OS Library (lib/os) label Jan 8, 2021
@nordic-krch nordic-krch mentioned this pull request Jan 22, 2021
The extraction of values from varargs was inlined in cbvprintf()
because va_list arguments cannot be passed into subroutines by
reference, and there was only one place that needed this
functionality.  An upcoming enhancement requires extracting the
arguments but processing them later, so outline the extraction code.

Signed-off-by: Peter Bigot <[email protected]>
Format width and precision can be extracted from arguments.  A
negative value is interpreted by changing flag state and using the
non-negative value.

Refactor pull_va_args() so the raw value from the stack is preserved
in the state so it can be packed, deferring the interpretation to
point-of-use.

Signed-off-by: Peter Bigot <[email protected]>
An upcoming enhancement supports conversion with information provided
from sources other than a va_list.  Add a level of indirection so the
core conversion operation isn't tied to the stdarg API.

Signed-off-by: Peter Bigot <[email protected]>
Several tests require additional stack space to accommodate a deeper
call stack due to inability to inline functions in an upcoming
enhancement.

Signed-off-by: Peter Bigot <[email protected]>
In applications like logging the call site where arguments to
formatting are available may not be suitable for performing the
formatting, e.g. when the output operation can sleep.  Add API that
supports capturing data that may be transient into a buffer that can
be saved, and API that then produces the output later using the
packaged arguments.

Signed-off-by: Peter Bigot <[email protected]>
@pabigot pabigot requested a review from dcpleung as a code owner February 12, 2021 15:51
@pabigot pabigot removed the request for review from andrewboie February 12, 2021 15:55
@zephyrbot zephyrbot requested a review from ceolin February 15, 2021 17:12
@carlescufi carlescufi requested a review from npitre February 24, 2021 10:25
@npitre
Copy link
Collaborator

npitre commented Feb 24, 2021

I'm looking at this PR and something makes me feel uneasy. Let me explain.

I don't dispute the need for deferred printouts, either for log gathering or
because the context where the print is generated doesn't accommodate actual
output. But to my unacquainted view this seems to be the wrong approach.

Firstly is code complexity. This is using non negligible code footprint and
additional stack usage. And I'm not even talking about understanding and
maintaining the code itself.

Secondly I think the cbprintf_package() is doing too much parsing and
validation. If this is meant to be used where sleeping can't happen this
usually means IRQ context. Therefore it better do as little processing as
possible, and leave the actual parsing and validation to the consumer side.

Lastly this makes coupling between the log facility and the print facility
a bit too intimate. The fact that nano can't simply be substituted to complete
feels wrong in that regard.

Now here's what I think could be done. I might be completely in the weeds
about this and if so please tell me.

Each pieces should be optimized to do only one thing and do it well.
The printing facility should be left at that and only that. It shouldn't
get involved in the logging business.

The logging facility should do just that: gathering data, store it
efficiently and restore the original data for later consumption.

Coupling between those two should be as small as possible so any of those parts
can be exchanged for another (e.g. nano vs complete).

So what the logging facility should gather is actually the format string
(or possibly a pointer to it if it lives in ROM) and the entire va_list
content. So yeah, some parsing of the format string is necessary, but
not that much. A much simpler state machine is needed to find out how much
of the va_list needs to be copied over and stuffed in the log record,
likely smaller than the code footprint of what is proposed now. This has to
be quick and simple. There is only one case where this might be suboptimal
i.e. "%.*s" where the actual string length is limited, but that should be
quite rare for such strings to be in non ROM, and is probably be worth the
compromize of storing the entire string in the log otherwise.

On the log consumption side then we only need to reconstruct the va_list
context and call whatever regular printf variant we want with no compromize
on efficiency this time.

Am I missing something obvious?

@pabigot
Copy link
Collaborator Author

pabigot commented Feb 24, 2021

I'm looking at this PR and something makes me feel uneasy.

The whole thing makes me uneasy.

So yeah, some parsing of the format string is necessary, but
not that much. A much simpler state machine is needed to find out how much
of the va_list needs to be copied over and stuffed in the log record,
likely smaller than the code footprint of what is proposed now.

I'm not sure that's true, but I'm happy to withdraw this and have somebody else come up with an alternative.

@npitre
Copy link
Collaborator

npitre commented Feb 25, 2021 via email

@nordic-krch
Copy link
Contributor

@npitre please not that this PR is followed by #31102 which introduces "static" packaging (with the one here would be called "runtime"). Static packaging is attempting to form a package at compile time as simple assignments to an array. It is utilizing _Generic C11 keyword to handle promotions (e.g. store float as double). Additionally, static packaging extension adds macro for detecting at compile time if it can be applied. Basically, runtime packaging is required if there is a possibility that there is a string argument that needs to be stored by value in the package. Detection searches for any char pointers then but does not inspect format specifiers in the string so there might be some false positive like "%p", (char *)p". I consider a runtime method as a fallback to be used whenever static packaging cannot be used (presence of string arguments or lack of support for C11 compiler, see #30105).

Before @pabigot proposed that i was experimenting with just dumping va_list since it is on the stack but there are two main issues there:

  • platform dependency, in case of 86_64 it is not even continues array on the stack
  • storing string arguments by value cannot be done. It is very important from deferred logging perspective to have string arguments in self-contained package

@pabigot
Copy link
Collaborator Author

pabigot commented Feb 25, 2021

Also when pursuing alternative solutions keep in mind #32373 which identifies that va_arg must be used in some cases to pull off floating point values as floating point values, not arbitrary blobs of memory.

@npitre
Copy link
Collaborator

npitre commented Feb 26, 2021 via email

@npitre
Copy link
Collaborator

npitre commented Mar 1, 2021 via email

@pabigot
Copy link
Collaborator Author

pabigot commented Mar 4, 2021

Superseded by #32734

@pabigot pabigot closed this Mar 4, 2021
@pabigot pabigot deleted the nordic/20201211b branch March 15, 2021 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: API Changes to public APIs area: Base OS Base OS Library (lib/os) area: Kernel area: Tests Issues related to a particular existing or missing test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants