-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: adjust policies and introduce new macros for logging strings #18147
Comments
Example, just to show what I mean, I don't care too much what the interface looks like:
In the log:
|
@nordic-krch actually perhaps there is a way we can solve this without changing the interface. What if log_strdup() returned a string pointer that had the lowest bit set, as a marker that this is a transient string. we can use the low 2 bits for flags since they will be at least 4 byte aligned. this gets saved with the event. When the backend looks at the event arguments, it will note that the low bit is set, and that will be a trigger to dump the string value out to the log data instead of logging its pointer value. If the backend formats the messages at runtime, naturally it will clear the low bit before actually dereferencing the string pointer argument. |
Regarding reducing log message footprint and removing log message strings from the binary. I'm not that good in linking/elf stuff but i was wondering if it would be possible to store those strings in a section which would not go into binary but would stay in elf. Then on PC it would be possible to decode logs with access to elf file. Regarding transient strings handling: it may be based on PR i hope at some point will get merged (#17367). There i proposed a change where no only strings but also any other structs can be duplicated, e.g. instead of formatting ble address before passing it to string you create object with ble address and formatting function and formatting happens during log processing. Object are allocated from the same pool as log messages (there is no need for custom pool for There is one risk with that however. Log argument can have value which matches address of element from the pool and it might not be it. This can be solved on host side. Whenever reference candidate is detected we send additional object and on host side when forming human readable version format specifier will help to determine if reference should be treated as value or as address to an object. |
Disregard, this won't work.
Yeah that should be doable.
But if it's a reference candidate, don't you have to dump out its contents to the log? What happens if it tries to do that with a value it thinks is a pointer within the pool when it isn't?
I haven't had time to look at that one yet, but doing this by introducing new format strings will not be able to validate at compile time with __printf_like(), and exacerbates the usage of stdarg.h which is required to not be used by MISRA-C. You need to talk to @ceolin and other fusa stakeholders. |
It using same approach a linux does (i heard) it's using |
but the fact is that in that PR it is freeing allocated message based on |
@andrewboie @nordic-krch Thanks for thinking about this. I would wonder if dynamically generated strings in the embedded side would not be better avoided altogether, due to their overhead. What about the idea of doing something a bit more radical, for a lean logger version which does the absolute minimum in the embedded side: Edit: after reading @andrewboie and @nordic-krch comments I realized I mostly repeated what they said here. Sorry for that noise. |
For things like Bluetooth address and MAC address (probably majority of string duplicates) we could have them stored as arguments in the log message and create custom specifiers:
where this off course looks awful but would store 7 byte address into 2 32bit log arguments and then formatter would print it nicely. That is the best solution in terms of memory and performance. We can add markers like then you have something like:
that would be also applicable to printk. Similar approach could be taken for MAC address. What do you think? I agree with @aescolar that transient strings might need to be discarded for this limited logger mode. |
Another thought: Not sure if that is too unuser-friendly but if string formatting is extended to support something like:
Then if transient string is of known size (e.g. declared on stack in the function) there could be a macro which would expand
Limitations:
|
|
logger is not analyzing string when creating a log message, just copies sizeof(void *) arguments and forms a message. If it is split into multiple standard arguments then logger would treat them without any special approach. |
I would prefer if we do not put any special treatment for MAC addresses like assuming that they are certain length etc. For example in networking code the MAC/link address address can be from 2 to 8 bytes long. |
@jukkar Do you know in a particular logging statement the size of the MAC address you want to log (number of bytes of the address, not the expanded string) |
Still need this question answered. I don't think this design is workable. You feed it a value which it thinks is an object, it will then start writing potentially arbitrary memory to the log stream, and then attempt to free it out of the pool. I think there should be dedicated APIs for logging strings or binary data. It is the safest way to do this. I don't see the current design as workable when the format strings are not available at runtime. When logging data, we either need a pointer/size pair, or reliable knowledge that we are working with a C string. |
I do agree about this. |
Yes, the size of the link address is known when printing it. |
@nordic-krch should we close this? was it addressed in the overhaul? |
The mechanism we have right now for logging strings will not work in an environment where the format strings for log messages have been stripped out of the binary. The binary logging backend will only be able to log the address of the log message format string (to identify it for later offline post-processing), and the raw values of the arguments supplied.
If a transient string is logged, we have some problems. The format string is unavailable, which does not allow us to identify at runtime that a particular parameter is a string that needs to be logged. We will only be able to log its raw pointer value. If the pointer value isn't a ROM string we won't be able to look it up what the string is during post-processing.
I propose we make the following changes:
@nordic-krch what do you think of this?
The text was updated successfully, but these errors were encountered: