-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add QCA chip reset, Link detection, Version information #599
Conversation
3dc1f68
to
fa1b6e0
Compare
I heard about this problem, but never saw it by myself on a setup. So i'm interested in the details of the cause (QCA7K firmware version, Linux QCA driver, what's the symptom of the QCA7K which requires a reset). |
I tried to understand for quite a while and gave up. It depends on the EV. Some EV never trigger the bug, some trigger it almost always (E.g. Volvo XC40 with a PLC sniffer attached, it does not do this without the sniffer though). It happens both with firmware 1.2.5 and 3.0.0 on the EVSE side, linux driver version: qcaspi spi0.0: ver=0.2.7-i, clkspeed=12000000, burst_len=5000, pluggable=0 The symptoms are as follows: After unplug, a new key is set using CM_SET_KEY.REQ. A matching CNF is received. I also verified that the correct key is set by reading it back using plctool. If the bug occurs, the chip seems to behave normally from a linux sw point of view, it replies to all requests, no errors are logged in debugfs/dmesg or ethtool. On the next plug in, SLAC is performed correctly. After the matching cnf, the EV tries to send an association request (CM_ASSOC.REQ or so) to the EVSE PLC and it does not reply to it (we saw this on an external PLC sniffer). Since this is PLC to PLC chip communication we cannot do much about it, it is a firmware bug in the QCA as far as I understand. The result is that the link does not come up and no SDP is possible etc. The link is never reported as up (using plctool -L). EVerest does not verify this at the moment as it is a non-standard PLC message and hence gets stuck after slac. But even if, the chip never recovers, so even restarting SLAC would not help at all in this case. I verified in a lot of testing on the Charin events in Valencia and Cleveland (where i had good reproducability) that resetting the chip after setting new keys after unplug works 100% of the time, this is what is implemented here. It is disabled by default though, I have not verified if other PLC chips (lumissil, vertexcom) implement the RS_DEV command (they also don't need it as I never had problems with stability with those). Next step would be to implement link up detection in a separate PR and restart SLAC in that case as there are probably still cases which are not solved by this PR. |
I also discussed this with several car OEMs and they said that also on the EV side they reset the QCA on every unplug due to some firmware bugs and said it is good practice to do so. I prefer to switch to Lumissil or VertexCom instead ;-) |
Thanks for the detailed explanation |
slac::messages::HomeplugMessage hp_message; | ||
hp_message.set_protocol_version(protocol_version); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a new method, defined in libslac? If so, the dependencies.yaml should also be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it depends on EVerest/libslac#7 which needs to be merged first. Then dependency yaml can be updated
Added another commit to also wait for link to be established before entering matched state. Successfully tested on real cars that it retries if the link does not come up. Also this option is disabled by default as it may only work on Qualcomm chips. |
type: boolean | ||
default: false | ||
chip_reset_delay_ms: | ||
description: Delay between SET_KEY.CNF and RS_DEV.REQ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note: i've noticed that the QCA7K does not always send a SET_KEY.CNF (or the host has a chance to receive it). Unfortunately i never had the time to investigate this further. AFAIR this was the case for firmware before 3.0.0 and non-fast SLAC configuration which does a real reset after CM_SET_KEY, but i'm not totally sure.
This is just a warning, not a request to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. The non-fast SLAC config, does that mean the QCA resets itself when setting a key? I think we always used a config where the QCA does not automatically reset when the key is changed, that is why we do a reset here manually afterwards. I have not seen the issue of missing CNF in our setup, but it would make sense that e.g. the QCA itself resets before the CNF packet is really transmitted to the host
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So maybe using fast config option is preffered as we have more control over when (or if at all) we need to reset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. The non-fast SLAC config, does that mean the QCA resets itself when setting a key? I think we always used a config where the QCA does not automatically reset when the key is changed, that is why we do a reset here manually afterwards. I have not seen the issue of missing CNF in our setup, but it would make sense that e.g. the QCA itself resets before the CNF packet is really transmitted to the host
From my understanding the behavior in this case is the following:
- EVSE host send CM_SET_KEY.REQ
- QCA7000 stores the the NMK in the internal flash
- QCA7000 send CM_SET_KEY.CNF
- QCA7000 triggers a reset, which clears the messages buffers
Yes, i agree this process is not recommend for production because of the flash wearout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fast way also writes it to flash memory, after power cycle the key is still there. I think the chip was designed for home usage where the PLC modems are paired a few times per lifetime. It was not really designed for changing a key very often.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, the other advantage of fast way was avoiding the reset which takes ~ 7 second before SLAC is really operational again.
I checked on VertexCom that it still works when the reset and link detect options are disabled. I'm not aware of VertexCom or Lumissil vendor specific messages that could allow for reset/link detection/version information etc. Anyone knows? |
We implemented the link detection in a GreenPHY compatible way (not vendor specific) in our propietary stack, which has been successful tested on QCA7000 and VertexCom. Is the involved MME sufficient? Version information is always vendor specific AFAIK. |
Which one did you use? |
CM_NW_INFO? NSCM_LINK_STATS? |
CM_NW_STATS.CNF->NumSTAs |
ok. Funny to see how many ways there are to find out if there is a link. Lumissil has a bunch of different packages implemented as well. Edit: Lumissil supports CM_NW_INFO but not CM_NW_STATS |
69bb59a
to
5ad58f6
Compare
Now link detection and version information also working for Lumissil. VertexCom is supported, but only without link detection/version information/reset as of now. Will add vendor specific support for VertexCom in a later pull request when we get access to the documentation. |
Signed-off-by: Cornelius Claussen <[email protected]>
…ected Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: aw <[email protected]>
Signed-off-by: aw <[email protected]>
Signed-off-by: aw <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Cornelius Claussen <[email protected]>
Signed-off-by: Kai-Uwe Hermann <[email protected]>
2a9dcda
to
7f82831
Compare
Signed-off-by: aw <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fine so far. Please test with the latest changes. If everything works as expected, feel free to merge. Remember to squash.
Signed-off-by: aw <[email protected]>
* Add QCA chip reset * Option: wait for link status ready before signalling DLINK_READY=connected Signed-off-by: Cornelius Claussen <[email protected]> * chore: using type traits * refactoring * formatting Signed-off-by: aw <[email protected]> * Slac: print PLC device attributes for log analysis * Update description in manifest * Move qualcomm specific messages to namespace * Move version logging into separate function * set default to false * Add Lumissil vendor messages * fix lumissil link detection and version information * Update description in manifest Signed-off-by: Cornelius Claussen <[email protected]> * Bump libslac dependency version to v0.3.0 Signed-off-by: Kai-Uwe Hermann <[email protected]> * chore(code-quality) * chore(clang-format) Signed-off-by: aw <[email protected]> --------- Signed-off-by: Cornelius Claussen <[email protected]> Signed-off-by: aw <[email protected]> Signed-off-by: Kai-Uwe Hermann <[email protected]> Co-authored-by: aw <[email protected]> Co-authored-by: Kai-Uwe Hermann <[email protected]>
Describe your changes
In some scenarios, QCA7k requires a chip reset after setting up keys. This PR adds an option (not enabled by default) to send a RS_DEV.REQ custom vendor MMU to reset the chip. Works at least on QCA7k.
Requires add-qca-reset branch from libslac as well.
Update: Now also waits for link to actually come up using LINK_STATUS message before switching to MATCHED state. Also this option is disabled by default for now as it may only work on Qualcomm chips.
Issue ticket number and link
Checklist before requesting a review