RFC: Logging subsystem overhaul #30353

nordic-krch · 2020-12-01T13:02:19Z

Introduction

Logging subsystem has been in zephyr for more than 2 years. During that time there were feature requests which could not be fulfilled with current design. As number of those requests has grown I've started to look into overhaul of logger internals. Overhaul does not plan to change logger application API but it will change backend API. Currently logger has following limitations:

64 bit values cannot be logged
floats cannot be logged logging: 32 bit float values don't work. #18351
logging transient strings is problematic and requires explicit log_strdup RFC: adjust policies and introduce new macros for logging strings #18147
log message is not self-contained as it arguments may point to transient strings RFC: adjust policies and introduce new macros for logging strings #18147
log hexdump macros does not accept arguments for meta string HEXDUMP log gives warning when logging function name. #26170
logging from user space is time consuming

Further weaknesses:

processing log message takes a lot of stack and flash as log message arguments need to be resolved to vla function:

zephyr/subsys/logging/log_output.c

Line 278 in a953ac8

static void std_print(struct log_msg *msg,
internal handling of log messages is troublesome because of fragmented nature.

There were suggestions (@andyross) to switch to use ring buffer instead of current fragmented messages.

Proposed change

Plan is to rework internals of the logger to store log message in a ring buffer. This will require that backends handle message synchronously (make a copy if needed).

There will be only one type of a log message (instead of current distinction between standard and hexdump). Message will consist of header, arguments, data (optional), transient strings (optional). Logger will attempt to store arguments in a way that it can be directly casted to va_list (validated on majority of platforms). String arguments will point to space within the message.

_Generic keyword (https://en.cppreference.com/w/c/language/generic) will be used to detect types of logger message arguments at compile time. Any float arguments will be promoted to double, any char * arguments will trigger further processing which will involve detecting %s arguments, calculating length needed to store strings, copying strings to the message body. log_strdup will be deprecated.

This approach shall solve all weaknesses and limitations mentioned above.

As already mentioned, internal architecture change will impact backend interface. Currently, log_message handling is asynchronous, message is alive until last backend indicates that message is not used. This approach may give better memory utilization in certain scenarios because backend does not need to copy the message but, in general, new approach will more memory efficient due to ring buffer. Current approach comes from nrf5 sdk where it was used in a bare metal applications where asynchronous handling is more welcomed. In RTOS environment value of this approach diminishes.

Ring buffer needed for the logger may be generic since it is a variable length messages buffer with following arguments:

multiple producers
3 modes: saturate, overwrite the oldest, block thread context until available
two consumers: standard and panic which may interrupt standard

There are further requests which does not imply architecture change and they will be gradually fixed:

Storing log filtering data in persistent memory Store logger filtering data in persistent memory #29147
LOG_RAW macro logging: add LOG_RAW macro #23019
User definable logger tag User definable logger tag #20430

@andyross please let me know if you have any comments.

Dependencies

Logger backends interface will change. It will be simplified as there will be only single message type, message will not be fragmented and arguments will form va_list which can be directly passed to the formatter.

Concerns and Unresolved Questions

_Generic keyword is C11 (#30105) and it is unclear which C standard is currently used by Zephyr. However, it currently compiles without enforcing C11. I didn't who enables C11.

The text was updated successfully, but these errors were encountered:

koffes · 2020-12-03T11:18:19Z

Great! In relation to this work, I would like to highlight #23021 as well

pabigot · 2020-12-07T13:52:56Z

I want to point out again that float arguments are already promoted to double when passed to a function that lacks a prototype (or in a varargs position), so there is nothing to do for "Any float arguments will be promoted to double". What has to be done is detecting that the value to be formatted takes more than one word on the stack.

I added cbprintf_arglen() to help with that issue, but that doesn't help with caching transient strings. If the refactored log system still needs to copy out state for later processing rather than perform it immediately then cbprintf could also be enhanced with a function that walks the format string extracting the specifications and provides them to a callback function, which would allow detecting string parameters precisely.

That wouldn't work with CBPRINTF_NANO, though, which lacks the format specification processing infrastructure. So if it can be done with _Generic that would be more flexible. I'm just not clear on how generic varargs functions would work.

nordic-krch · 2020-12-08T15:32:40Z

@pabigot
So with _generic i am able to serialize all log arguments into an array, detect strings and make the copy into the message. What i missing now is on backend side when string needs to be formatted to be able to get args from serialized array instead va_list.

I was thinking about extending cbprint api with something like:

union cbprint_args {
  va_list ap;
  void *stream;
};

int z_cbvprintf(char_out, ctx, fmt, union cbprint_args args, bool is_va_list);

static inline cbvprintf(char_out, ctx, fmt, va_list ap)
{
  union cbprint_args args = {.ap = ap};
  z_cbvprintf(char_out, ctx, fmt, args, true);
}

static inline cbsprint(char_out, ctx, fmt, void *stream)
{
  union cbprint_args args = {.stream = stream};
 z_cbvprintf(char_out, ctx, fmt, args, false);
}

then change current cbvprint body and replace va_arg with something like:

#define arg_get(is_va_list, args, type) is_va_list ? va_arg(args.va_list, type) : get_from_stream(args.stream, type)

didn't yet try it but it seems to be doable.
What do you think?

pabigot · 2020-12-08T16:09:57Z

Can you point me to the proposed solution using _Generic? And how you determine the amount of space required for the stream with that interface?

I think the general idea of what you propose has potential, but I'm still concerned about ensuring argument processing is consistent throughout. Unless you're matching arguments against format specifications, which would require the callback API I proposed, I don't believe you can correctly detect strings. E.g. how do you handle a char* argument in a position that corresponds to %p? Making a copy of the string pointed to by the argument would be wrong.

I have some ideas of how to generalize the cbprintf API but want to understand the rest of your solution architecture before determining the most flexible enhancement.

nordic-krch · 2020-12-08T17:50:09Z

All macro wizardry starts here: https://github.com/nordic-krch/zephyr/blob/7960fd342b65f1779e0c47c1dfc360237d61bf4d/include/logging/log_msg2.h#L399

First, _Generic is used to detect if there are any char * arguments by summing using FOR_EACH on arguments. If there are no char * then case is simple, there are no strings to copy. If there are any char * then complex path is taken (Z_LOG_MSG_EXT). In that path, next step is to build an array which on nth index contains pointer to the char * pointer in the serialized buffer or null if corresponding argument is other than char *. Next z_log_get_s_mask is used to get bit mask of %s in the string. Next, having those 3 things number of bytes needed for those strings are determined. Space is allocated. Then message is filled, strings are copied into message and array with pointers are used to update char * pointers in the serialized arguments.

The most important thing is that all this complex operations not compiled if log message does not contain char *.

There are two versions of macros: one that writes directly to the space allocated for message and second version which creates array with serialized arguments on the stack and pass it to the function. This is a bit slower but may take less rom and is suitable for user space. I will experiment to determine if it makes sense to keep both versions.

I will experiment tomorrow with cbprint extension.

pabigot · 2020-12-08T18:19:02Z

First, _Generic is used to detect if there are any char * arguments by summing using FOR_EACH on arguments.

But that's wrong. If I write:

LOG_DBG("label %s at %p", cfg->label, cfg->label);

I expect to see the string value of cfg->label and the pointer value of cfg->label. Displaying the address of some copy of the string made by the logging infrastructure is not acceptable.

The only way I can see to get correct output is to walk the format string, decode the specifiers, and use the specifier character to determine that a copy of a string needs to be stored somewhere. No macro wizardry would be involved.

I have API in mind that would do this and be flexible enough to handle a few other use cases I can imagine. Let me prototype that API and see whether it'll do what you need.

nordic-krch · 2020-12-08T18:45:11Z

The only way I can see to get correct output is to walk the format string, decode the specifiers, and use the specifier character to determine that a copy of a string needs to be stored somewhere. No macro wizardry would be involved.

yes, that is exactly as you wrote. _Gereric is used to determine at compile time if extended version of macro is used or the simple one. Extended involves call to z_log_get_s_mask which decode specifiers and returns mask of %s types. So if if have LOG_DBG("label addr: %p", (char *)label) it will compile extended version but z_log_get_s_mask will return 0 and no strings will be copied.

In case of your example, logger will present exactly what you want, %p will point to original label address but label string will be copied into message payload and formatted from there.

nordic-krch · 2020-12-08T18:47:08Z

Also it is possible to apply extension only to char * and not to const char * to save space but it may not always be correct (eg. if transient string comes as const char * function parameter).

pabigot · 2020-12-08T18:47:21Z

Ew. OK; where's z_log_get_s_mask()? Can you point me to the branch?

nordic-krch · 2020-12-08T19:34:59Z

this function is already used in master:

zephyr/subsys/logging/log_core.c

Line 84 in 87dddda

uint32_t z_log_get_s_mask(const char *str, uint32_t nargs)

The branch i'm working on: https://github.com/nordic-krch/zephyr/tree/log_64bits

pabigot · 2020-12-08T19:44:23Z

Thanks; I see that now. It is another decoding of the format specifier, which I'd hoped to avoid because parsing the format sequence to extract conversions correctly is not trivial. (For example z_log_get_s_mask() will do the wrong thing if %n or the * width or precision flags are present.)

But my first attempt at sharing the conversion extraction fails, because of the way va_list objects need to be processed. It could be made to work by turning a fairly long sequence of code into a macro, and using that macro in multiple functions. Or possibly by completely inverting the cbprintf_complete implementation. That has some potential but I'm not going to dive into it right now.

My attempt is at: https://github.com/pabigot/zephyr/commits/nordic/20201208b

Please read the message of the top commit, which identifies the core problem. If, in principle, you think something like what's prototyped in the API could work, and agree that we should try to re-use the existing conversion extraction code, I'll revisit this.

nordic-krch · 2020-12-09T06:11:02Z

That is more or less what I have in mind regarding struct cbprintf_packed_args but I assume that it is possible to build struct cbprintf_packed_args only based on types of arguments provided to the string using _Generic.

Walk might be second option when __VA_ARGS__ is not available and va_list is used (e.g. hooking external logger functions).

nordic-krch · 2020-12-09T06:31:46Z

And definitely z_log_get_s_mask() belongs in cbprintf.

pabigot · 2020-12-09T11:15:50Z

Can we make logging select CBPRINTF_COMPLETE? Because if it has to work with nano the cbprintf changes will be a problem.

IMO it's reasonable to require complete support when using logging. @carlescufi you particularly wanted nano as an option, what do you think?

carlescufi · 2020-12-11T13:32:49Z

Can we make logging select CBPRINTF_COMPLETE? Because if it has to work with nano the cbprintf changes will be a problem.

IMO it's reasonable to require complete support when using logging. @carlescufi you particularly wanted nano as an option, what do you think?

Logging is big enough as a subsystem that selecting CBPRINTF_COMPLETE when enabling it is reasonable to me. The nano version of cbprintf is there for tiny apps that do not use the logging subsystem, just the console/printk.

pabigot · 2020-12-14T00:02:44Z

The solution proposed here as I understand it is based on external processing of the log format string and argument va_list to determine what each parameter is, mutating some arguments (such as transient strings), and saving them all off the stack for subsequent processing.

At a high level this seems reasonable, but I believe the packaging of data that will be processed by cbprintf must be done by cbprintf, for both correctness and maintainability. An example of a case that can't easily be handled externally is a %.5s print specification, for which the argument may be char * pointing to RAM text that has no terminating NUL: it must be copied, and knowing that only the first 5 characters will ever be accessed is the only way to make a copy. A more complicated related case is %.*s where the number of characters to display is provided in a preceding argument. Handling that sort of thing requires much more than knowing which positional arguments correspond to strings as with the existing z_log_get_s_mask().

I've created #30675 which extends the cbprintf infrastructure with something that captures formatting information into a buffer, allowing it to be processed later.

I have some hopes that this would make use of _Generic and varargs macro wizardry around log calls unnecessary, though that may be optimistic.

nordic-krch added area: Logging RFC Request For Comments: want input from the community labels Dec 1, 2020

nordic-krch self-assigned this Dec 1, 2020

carlescufi changed the title ~~RFC: Logger subsystem overhaul~~ RFC: Logging subsystem overhaul Dec 3, 2020

koffes mentioned this issue Dec 3, 2020

Logging: Using asserts with LOG in high pri ISR context blocks output #30418

Closed

nordic-krch mentioned this issue Dec 4, 2020

net: syslog: Make sure CONFIG_LOG_IMMEDIATE is not set #30290

Merged

pabigot mentioned this issue Dec 14, 2020

lib: cbprintf: add support for deferred formatting #30675

Closed

nordic-krch mentioned this issue Jan 4, 2021

Add support and sample backend for dictionary-based logging #30539

Merged

This was referenced Jan 18, 2021

logging: 32 bit float values don't work. #18351

Closed

Logging: Overhaul #31535

Merged

galak added the LTS Long term release branch related label Mar 17, 2021

nashif closed this as completed in #31535 Apr 19, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Logging subsystem overhaul #30353

RFC: Logging subsystem overhaul #30353

nordic-krch commented Dec 1, 2020 •

edited by carlescufi

Loading

koffes commented Dec 3, 2020 •

edited

Loading

pabigot commented Dec 7, 2020

nordic-krch commented Dec 8, 2020

pabigot commented Dec 8, 2020

nordic-krch commented Dec 8, 2020

pabigot commented Dec 8, 2020

nordic-krch commented Dec 8, 2020

nordic-krch commented Dec 8, 2020

pabigot commented Dec 8, 2020

nordic-krch commented Dec 8, 2020

pabigot commented Dec 8, 2020

nordic-krch commented Dec 9, 2020

nordic-krch commented Dec 9, 2020

pabigot commented Dec 9, 2020 •

edited

Loading

carlescufi commented Dec 11, 2020 •

edited

Loading

pabigot commented Dec 14, 2020

RFC: Logging subsystem overhaul #30353

RFC: Logging subsystem overhaul #30353

Comments

nordic-krch commented Dec 1, 2020 • edited by carlescufi Loading

Introduction

Proposed change

Dependencies

Concerns and Unresolved Questions

koffes commented Dec 3, 2020 • edited Loading

pabigot commented Dec 7, 2020

nordic-krch commented Dec 8, 2020

pabigot commented Dec 8, 2020

nordic-krch commented Dec 8, 2020

pabigot commented Dec 8, 2020

nordic-krch commented Dec 8, 2020

nordic-krch commented Dec 8, 2020

pabigot commented Dec 8, 2020

nordic-krch commented Dec 8, 2020

pabigot commented Dec 8, 2020

nordic-krch commented Dec 9, 2020

nordic-krch commented Dec 9, 2020

pabigot commented Dec 9, 2020 • edited Loading

carlescufi commented Dec 11, 2020 • edited Loading

pabigot commented Dec 14, 2020

nordic-krch commented Dec 1, 2020 •

edited by carlescufi

Loading

koffes commented Dec 3, 2020 •

edited

Loading

pabigot commented Dec 9, 2020 •

edited

Loading

carlescufi commented Dec 11, 2020 •

edited

Loading