-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make embedded diagnostic more human readable #886
Make embedded diagnostic more human readable #886
Conversation
This PR relies on the robotology/icub-firmware-shared#84 PR. |
Therefore, we ought to keep it in draft 😃 |
382826a
to
7cf18ea
Compare
As @marcoaccame suggested, I updated the note explaining the needed step in case a developer wants to add a new message or change an existing one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change is quite big, it touches a lot of diagnostics messages (216) and it proposes a new way to process them. So, I have preferred to analyze things first: at first a recap of how it is working now, then what is the proposed change and finally the review.
All this is useful for me as well, so that I can recall what has been done all over the years.
In robotology
In here, a recap of the operations done at reception of each diagnostic message.
-
Every 1 ms the RX thread in
icub-head
processes all the UDP frames coming from a board. Inside each frame there can be some ROPs with a diagnostic message w/ a given{code, par16, par64}
. -
The
code
identifies one of the 216 errors, which belong to 9 categories, thepar16
andpar64
contain parameters specific to a particular error. So, for example ifcode
refers toeoerror_value_SYS_ctrloop_execoverflowRX
from categoryeoerror_category_System
which is emitted when duration of the control loop RX phase exceeds the maximum time,par16
andpar64
contain the packed information of the latest 5 durations of the all the phases of the control loop. -
For each diagnostic ROP the RX thread executes the callback
s_eoprot_print_mninfo_status()
which takes care of producing a meaningful string for the received{code, par16, par64}
. -
For the majority of messages we just pick up a constant string from a LUT inside
icub-firmware-shared
, append the stringisedpar16
andpar64
and print. This can be done in efficient way but w/ poor readability. -
Over the time we produced a formatted string only for some messages which we felt more meaningful. They are the following.
eoerror_value_CFG_candiscovery_started
,_ok
,_detectedboard
,_boardsmissing
,_boardinvalid
. They are processed inside functions_process_category_Config()
which prints the result of a CAN discovery done by the ETH board.eoerror_value_SYS_canservices_canprint
. The CAN PRINT messages contains a string split inside a burst of CAN frames and aeoerror_value_SYS_canservices_canprint
diagnostic message contains just one CAN frame. These messages are managed insideEthResource::CANPrintHandler()
which collects several values ofpar64
(containing the actual CAN messages of the CAN PRINT) and when the burst is finished it produces the string. It is thecan_string_eth
classe which does the job. It must be persistent and have a memory for each CAN board below each ETH board. So it was placed insideEthResource
in form of one instance per CAN board below the managed ETH board.
In this PR
This Pr adds more readability to all the other messages in the following way.
-
The entry point is still
s_eoprot_print_mninfo_status()
. -
The management of messages
eoerror_value_SYS_canservices_canprint
does not change and stays insideEthResource
. -
All the other messages are managed by the new function
feat_manage_diagnostic()
. -
The
eoerror_value_CFG_candiscovery_started
etc. are now processed insidefeat_manage_diagnostic()
with the same code retrieved from s_process_category_Config() and placed inside a new class. -
The function
feat_manage_diagnostic()
uses some new classes which produce the string for each message. The new classes are all created in runtime starting from function and then destroyed or are automatic variables constructed and destructed. They are the ones described in here.-
Diagnostic::EmbeddedInfo
: just contains the strings to be printed for the message, -
Diagnostic::LowLevel::InfoFormatter
: is the wrapper class to the various parsers and fills information insideEmbeddedInfo
. -
In its inside,
InfoFormatter
creates and destroys one class derived fromDefaultParser
which does the actual job. The specific class depends on theeOerror_category_t
of the error. They are 8 + the default parser:ConfigParser
, etc.. The class foreoerror_category_Debug
is missing but for that it is used theDefaultParser
.
-
The review
There are two points: one related to the behavior of the code, the second related to the SW architecture.
The behavior
There are some small changes related to the management of some messages that should correct most things. See the table.
error value | note |
---|---|
eoerror_value_SYS_canservices_canprint |
The relevant code inside SysParser::parseInfo() is never executed, so it is better to remove it and add a comment where the decoding happens. |
eoerror_value_CFG_candiscovery_started , _ok , _detectedboard , _boardsmissing , _boardinvalid |
The code inside function s_process_category_Config() is not needed anymore, it is better to clean it up. |
eoerror_value_SYS_ctrloop_execoverflowRX |
The par64 is parsed incorrectly. It is not separated in the four values and it prints just one big number. |
eoerror_value_SYS_ctrloop_execoverflowDO |
same as before |
eoerror_value_SYS_ctrloop_execoverflowTX |
same as before |
eoerror_value_SYS_canservices_parsingfailure |
It does not contain the CAN payload contained in par64 |
eoerror_value_SYS_canservices_monitor_lostcontact, eoerror_value_SYS_canservices_monitor_stillnocontact |
OK thanks. I have seen that @MSECode has added them. Just pls specify that the time is in ms. |
eoerror_value_SYS_proxy_forward_callback_fails and others inside the specialize parser |
Some of them have a non zero par16 and par64 which are not printed. So it can be useful for them to print the values just in case. Having said that, messages like that are probably unlikely to be seen. |
Those inside eoerror_category_Debug |
Just a memo: the specialized class is not implemented, so for them it is used the DefaultParser . |
The architecture
The new classes used by feat_manage_diagnostic()
such as Diagnostic::LowLevel::InfoFormatter
etc are created and destroyed each time, so potentially many times in a ms. Typically we don't create and destroy in runtime, and possibly at that high frequency, to avoid un-necessary processing in a critical point.
The classes could be instantiated in advance and stay alive for all the duration of yarprobotinterface
. In this way also we could also manage diagnostics messages that must be decoded in bursts such as the eoerror_value_SYS_canservices_canprint
or future ones. If we keep a memoryless approach we could not for instance move what inside EthResource
to process the CAN prints.
It is just a suggestion of what I would have done. About the decision I prefer to keep it under discussion.
eOerror_category_t category = eoerror_code2category(m_infobasic->properties.code); | ||
switch(category) | ||
{ | ||
case eoerror_category_Config: parser_ptr = std::make_unique<ConfigParser>(dnginfo, entityNameProvider); break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inside switch(category)
the pointer parser_ptr
uses a class that is created and then destroyed at each call of this function getDiagnosticInfo()
which is called at the reception of every diagnostics message received by yarprobotinterface
, so potentially many times every ms.
That could be heavy.
To avoid the continuous creation and destruction of the classes, an instance of ConfigParser
, MotionControlParser
etc could be created at startup and stay alive for the whole life of yarprobotinterface
. The function getDiagnosticInfo()
would just assign to parser_ptr
the pointer to the relevant class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, update icub-firmware-shared
tag and check.
Currently, @MSECode and I are doing some fixes related to the TODO left in the code and the changes required in the PR of fw shared PR: robotology/icub-firmware-shared#84 |
f2d2855
to
96100fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine with me 👍🏻
Don't know what is the status of the PR wrt @marcoaccame's comments/requests.
Any updates @valegagge @MSECode?
…s_lostcontact and eoerror_value_SYS_canservices_boards_retrievedcontact
…boards. todo: bring that to method
96100fb
to
4612f0b
Compare
Today @MSECode and I tested this PR and its related PR on icub-fw-shared robotology/icub-firmware-shared#84 During the test we check some errors like fw version wrong, disconnected a mais, etc All work fine. So we can merge. We increase of the required fw version to 1.35.1 cc @marcoaccame |
Great! Awaiting robotology/icub-firmware-shared#84 to be merged before turning this PR into ready for review to enable the CI. |
@pattacini can you pls change the status of the PR into ready for review? |
Done 👍🏻 |
In this PR, I created the new diagnostic formatter and a parser for each diagnostic message class.
The goal of this work is to start to improve the robot's embedded diagnostic making it more human-readable.
Overview of the code
I developed a formatter that is called on the reception of a diagnostic message.
Depending on the diagnostic class (Motioncontrol, Hardaware,Configuartion, etc ) the formatter calls the propper parser.
Class Diagram
The
DefaultParser
prints the diagnostic information as now.Sequence Diagram:
yarprobotinterface
receives an error belonging to the motion-control classNOTE
If an unknown error is received or the parser doesn't know how to parse the message the default parser is invoked: it prints the massage like it now.
How does this PR affect the developer?
Change
param16
andparam64
If you (as a developer) want to change the contents of
param16
andparam64
of the errore
belonging to the error-classc
, you need to update also the parsing of the message in thec
class parser.Add a new error
e
belonging to the already existing classc
First of all, you need to update the LUT in the
icub-firmware-shared
as usual.If you want to add a human-readable parser, you should add the parser case in the
c
class parser, otherwise, the default parser will be used whenyarprobotinterface
receives this new errore
.Files
The new files are:
diagnosticInfoFormatter.h/cpp
: contains the formatter definition/implementation and some auxiliary classdiagnosticInfoParser.cpp
: contains the definition of all new parsersdiagnsoticInfo.h/cpp
: contains the definition/implementation of the interface classEbeddedInfo
diagnosticInfoFormatter_hid.h
: all auxiliary classes are here defined.Test performed
Currently, this code has been tested on the small joint setup and on iCubGenova11.
cc @MSECode