NK3 not detected/discovered at OEM Factory Reset/Re-Ownership? #48

Closed
tlaurion opened this issue Apr 9, 2024 · 10 comments · Fixed by linuxboot/heads#1638
Comments

@tlaurion

tlaurion commented Apr 9, 2024

Edit: fixed with linuxboot@67f1dae

Cause: race condition between USB controller kernel modules being loaded and the USB Security dongle being discovered and used. By moving the sleep function after all USB controller kernel modules are loaded, the issue disappears.
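A minimal sketch of a complementary approach, not the fix that was merged (the merged fix moves the existing sleep after the USB controller modules are loaded): poll for the dongle with a timeout instead of relying on a fixed delay. The function names `wait_for` and `usb_vendor_present` are hypothetical, and the sysfs path layout is an assumption about the running kernel.

```shell
# Poll until a probe command succeeds or a timeout (in seconds) expires.
# Usage: wait_for TIMEOUT PROBE [ARGS...]
wait_for() {
    local timeout=$1 elapsed=0
    shift
    until "$@"; do
        # Give up once the time budget is spent.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Hypothetical probe: succeeds once a USB device with the given
# vendor ID (hex, as shown by lsusb) is listed in sysfs.
usb_vendor_present() {
    grep -qs -i "^$1\$" /sys/bus/usb/devices/*/idVendor
}
```

Usage would look like `wait_for 10 usb_vendor_present 20a0`, where `20a0` is assumed to be the vendor ID the dongle reports; check with `lsusb` before relying on it.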


Sorry for the lack of details here. It was reported that Heads was not able to properly do OEM Factory Reset/Re-Ownership on nv41: the dongle (NK3) was first detected (those checks have been moved earlier in oem-factory-reset so that dongle type detection can choose the algo), but when the time comes to factory reset the dongle and generate keys inside it, it fails.

I initially thought physical presence (touching the key) was the issue, but tests showed that the dongle was then properly used after moving it around to different USB ports (on an x230).

I'm wondering whether something changed in the NK3 firmware. Is it maybe just slower to be brought up?
I searched and didn't find any open issue. Any hint on what is happening? Is it possibly linked to the firmware version that was flashed on the NK3? How are users supposed to know, and what are the guidelines for fixing this? Are users supposed to know how to flash NK3 firmware? This behavior was not present on NK2 pro/NK2/NK1 (which I stopped using after moving my key backup to the NK3, once the non-default USB thumb drive key material backup landed under master a while ago).

TLDR: random issues with NK3 at Re-Ownership vs OEM factory reset from OEM pre-shipment?
A work group tackling those issues would be needed, since it's NK3 related (firmware cannot do much here. If the NK3 dongle is detected early in oem-factory-reset, Heads expects to be able to talk to the dongle when factory resetting it through gpg2 calls and then to be able to change PINs. That seems to be where the problem is today, which might have missed proper QA since there was no re-ownership, or provisioning with non-default PINs, which requires talking with the dongle for PIN changes?)

@jans23 @szszszsz @daringer ?
Can you do some testing on your side? Nothing explains a behavior change on the Heads side of things, so it needs to be an NK3 firmware-related thing? Not sure how to troubleshoot and isolate the problem here for resolution.

@tlaurion
Author

Also, the hotp-verification hash upstream points to a later commit, while the Nitrokey fork points to v1.4?

Upstream:
https://github.com/linuxboot/heads/blob/ee1978ffc0afa4db1b0a46517aff946912cdca14/modules/hotp-verification#L5-L10

Nitrokey:

# v1.4 (change me)
hotp-verification_version := 34b47aa75c07522d416c915911eed820556f1da8
hotp-verification_dir := hotp-verification-$(hotp-verification_version)
hotp-verification_tar := nitrokey-hotp-verification-$(hotp-verification_version).tar.gz
hotp-verification_url := https://github.com/Nitrokey/nitrokey-hotp-verification/archive/$(hotp-verification_version).tar.gz
hotp-verification_hash := cb5a8d8cf3da57a1e3fe82251220a0dfafbbf8e25a816c322fb5b00d485d7a9b

Was the latest hotp-verification code not aimed to be used upstream @daringer ?
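To check whether two trees pin the same hotp-verification commit, something like the following could be used. This is a sketch only: the function names `pinned_version` and `same_pin` are hypothetical, and it assumes both module files use the `hotp-verification_version := <commit>` form shown above.

```shell
# Print the commit assigned to hotp-verification_version in a module file.
pinned_version() {
    sed -n 's/^hotp-verification_version[[:space:]]*:=[[:space:]]*//p' "$1" | head -n 1
}

# Succeed only when both module files pin the same, non-empty commit.
same_pin() {
    local a b
    a=$(pinned_version "$1")
    b=$(pinned_version "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, `same_pin upstream/modules/hotp-verification fork/modules/hotp-verification || echo "pins differ"`, where both paths are placeholders for local checkouts.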

@daringer
Collaborator

daringer commented Apr 12, 2024

That's weird, we don't see any issues currently - but let's do some additional testing and/or verification ...
can you tell if the nk3 is properly starting in the not-working-scenario:

  • you should see that the nk3 LED goes white shortly after being powered-on and afterwards the expectation is it should stay off until an action happens
  • do you observe any LED activity during the not working oem-factory reset?
  • this might also boil down to a hardware issue, maybe even with x230 ? do you observe consistent failure on a specific port, I remember we had something like this before reported by some customer

@alexgithublab can you please try to reproduce the reported behavior with our firmware and compare against an upstream build from here: nv41 or ns50 or x230

@tlaurion
Author

tlaurion commented Apr 12, 2024

That's weird, we don't see any issues currently - but let's do some additional testing and/or verification ... can you tell if the nk3 is properly starting in the not-working-scenario:

  • you should see that the nk3 LED goes white shortly after being powered-on and afterwards the expectation is it should stay off until an action happens

Yes. nk3a NFC v.1.5.0 (can't upgrade NK3 firmware to latest 1.6.0 at time of writing, will open new issue)

  • do you observe any LED activity during the not working oem-factory reset?

No LED activity. Nothing happens until I move the NK3 to another port and restart the wizard.

  • this might also boil down to a hardware issue, maybe even with x230 ? do you observe consistent failure on a specific port, I remember we had something like this before reported by some customer

Another customer has been attempting to troubleshoot the issue for a little while under https://matrix.to/#/!RNcjJXCGHiyxXCHpKv:matrix.org/$20rR2YYEZlMs9iD1w9giODaCzqtz1ERSz7iBA8C7gLk?via=matrix.org&via=nitro.chat&via=invisiblethingslab.com

And I can now replicate with my spare NK3a NFC firmware 1.5.0.

@alexgithublab can you please try to reproduce the reported behavior with our firmware and compare against an upstream build from here: nv41 or ns50 or x230

tlaurion added a commit to tlaurion/heads that referenced this issue Apr 12, 2024
…t USB Security dongles are sure to be detected prior of "Verifying presence of GPG card"

Otherwise we get ehci-pci and xhci_hcd kernel messages pop in dmesg on debug AFTER "Verifying presence of GPG card" which explains why dongle might not be found in time and fails in oem-factory-reset
Might be linked to Nitrokey#48

Signed-off-by: Thierry Laurion <[email protected]>
@tlaurion
Author

tlaurion commented Apr 12, 2024

Seems like tlaurion@78fe0f9 fixes it. Weird that the timing was not an issue before and became one.

tlaurion added a commit to tlaurion/heads that referenced this issue Apr 12, 2024
Otherwise we get ehci-pci and xhci_hcd kernel messages in dmesg debug AFTER "Verifying presence of GPG card" which explains why dongle might not be found in time and fails in oem-factory-reset

Fixes Nitrokey#48

Signed-off-by: Thierry Laurion <[email protected]>
@alexgithublab
Collaborator

alexgithublab commented Apr 15, 2024

OEM factory reset works with this image (with the sleep commit) on the x230 and NK3 1.6.0

@tlaurion
Author

@alexgithublab please tag me or refer to the PR or, better, comment upstream that a PR fixes an issue, to streamline review and speed up PR merges.

tlaurion added a commit to tlaurion/nitrokey-hotp-verification that referenced this issue Apr 17, 2024
… is true: the dongle is in a clean state here without bad PIN entered.

This doesn't imply that we are using the default PINS.

This seems to have been based on the wrong assumption that, if there was no prior PIN attempt, we are in factory state with default PINs (USER 123456, ADMIN 12345678).
Calling code should be, and is, responsible for interpreting artifacts indicating that the USB Security dongle is not in factory reset mode with default PINs.

---

History:

Prior to linuxboot/heads@99673d3, Heads was making the same wrong assumption.
Heads was consuming 1 of the 3 PIN attempts to check whether the PIN was the default one, leaving the user with only 2 of 3 PIN input attempts before being locked out.

Because of Nitrokey/nitrokey-pro-firmware#54 (unfixed, linking to unfixed Nitrokey/libnitrokey#137), if Heads attempts to use scdaemon/libnitrokey, the dongle hangs.
Whether it is libnitrokey or gpg that expects exclusive dongle access, this causes hangs.

Therefore linuxboot/heads@99673d3 bases its assumptions on the gpg keyring previously created by Heads, without relying on either scdaemon or libnitrokey.
It compares the public key creation time with the current timestamp to determine whether the user should be reminded that default PINs are in use and should be changed.

TODO:
- Fix Nitrokey/nitrokey-pro-firmware#54
- Fix Nitrokey/libnitrokey#137

Otherwise, if in any situation libnitrokey/scdaemon operations are intertwined, this causes Nitrokey/heads#48, which I'll reopen.

Any Heads developer will run into the same problems:
- Develop on host. Push signed commits with said dongle, which uses scdaemon on host to interact with the USB Security dongle to do signing ops.
- Call make BOARD=qemu-coreboot-fbwhiptail-tpm2-hotp USB_TOKEN=NitrokeyPro/NitrokeyStorage/Nitrokey3NFC/LibremKey to test Heads.
- Land under kvm/qemu, observe reported locked problems, blame QubesOS, blame Heads, blame gnupg.
- Truth is that it's a libnitrokey/firmware bug.

----

hotp-verification should only report firmware version (currently wrong), serial number and success/fail state, and should not make assumptions that report false information and confuse the end user.

Signed-off-by: Thierry Laurion <[email protected]>
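The distinction made in the commit message above can be illustrated with a small sketch, assuming the `PIN retry counter : U R A` line printed by `gpg --card-status` (user PIN, reset code, admin PIN) and a maximum of 3 attempts. The function names are hypothetical and this is not the code in hotp-verification or Heads.

```shell
# Print the "USER RESET ADMIN" retry counters found in
# `gpg --card-status` text supplied on stdin.
pin_retry_counters() {
    awk -F: '/^PIN retry counter/ { sub(/^[ \t]+/, "", $2); print $2; exit }'
}

# Succeed when no failed user/admin PIN attempt is pending (assumes max 3).
# Full counters only mean "no bad PIN was entered recently";
# they do NOT mean the PINs are still the factory defaults.
no_failed_pin_attempts() {
    # Intentionally unquoted: split the three counters into $1 $2 $3.
    set -- $(pin_retry_counters)
    [ "$1" = "3" ] && [ "$3" = "3" ]
}
```

For example, `gpg --card-status | no_failed_pin_attempts` tells the caller only that the counters are full; deciding whether default PINs are in use stays with the calling code.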
@tlaurion tlaurion reopened this Apr 17, 2024
@tlaurion
Author

tlaurion commented Apr 17, 2024

No doubt linuxboot#1638 helped, but once gpg --card-status calls happen followed by accesses to libnitrokey library usage, the problem reappears.

Seems like the real issue is Nitrokey/nitrokey-pro-firmware#54 and underlying Nitrokey/libnitrokey#137

@sosthene-nitrokey @szszszsz ?
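One way to test the exclusive-access hypothesis would be to make sure scdaemon has released the dongle before any libnitrokey-based tool touches it. The sketch below assumes `gpgconf --kill scdaemon` is available in the environment; the wrapper name `run_without_scdaemon` and the `GPGCONF` override are hypothetical, and whether this avoids the hang is not established in this thread.

```shell
# Ask scdaemon to exit, then run the given command, so that gpg and the
# command do not hold the dongle at the same time.
# GPGCONF can be overridden, e.g. to exercise the wrapper without GnuPG.
run_without_scdaemon() {
    # A failure here (e.g. scdaemon not running) is not treated as fatal.
    "${GPGCONF:-gpgconf}" --kill scdaemon 2>/dev/null || true
    "$@"
}
```

Usage would look like `run_without_scdaemon hotp_verification info`; the wrapper returns the exit status of the wrapped command.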

@tlaurion
Author

tlaurion commented Apr 18, 2024

No doubt linuxboot#1638 helped, but once gpg --card-status calls happen followed by accesses to libnitrokey library usage, the problem reappears.

Seems like the real issue is Nitrokey/nitrokey-pro-firmware#54 and underlying Nitrokey/libnitrokey#137

@sosthene-nitrokey @szszszsz ?

From investigation on nk3 alone, it doesn't seem to be true.

Unfortunately, without further work from Nitrokey, results from hotp_verification are invalid as of now, and its output is used for decision making under Heads. Please contact me off-channel, I already spent too much time on this.

Firmware version reported is invalid, output of info is invalid.

@tlaurion
Author

tlaurion commented May 3, 2024

@wessel-novacustom

@daringer @wessel-novacustom: this was fixed upstream 3 weeks ago. Modified OP #48 (comment)

Customers are expecting this to be fixed in forks. Cross-ref Matrix discussions (requires being a Dasharo subscriber to access):

* https://matrix.to/#/!RNcjJXCGHiyxXCHpKv:matrix.org/$9IMiPjkK-XujuiZ2yMlv7Fr1fkvdlQaoXIcMkCm7KHo?via=matrix.org&via=nitro.chat&via=invisiblethingslab.com

* https://matrix.to/#/!RNcjJXCGHiyxXCHpKv:matrix.org/$uuc-UIYgByDH5sImAK3DxYQdBjOrjZz1WePQtV9izZE?via=matrix.org&via=nitro.chat&via=invisiblethingslab.com

Thank you for your work!

@macpijan We need to make sure that important fixes and improvements like these are being added to the next release.

@nestire nestire closed this as completed Jun 12, 2024
macpijan pushed a commit to Dasharo/heads that referenced this issue Jun 20, 2024
Otherwise we get ehci-pci and xhci_hcd kernel messages in dmesg debug AFTER "Verifying presence of GPG card" which explains why dongle might not be found in time and fails in oem-factory-reset

Fixes Nitrokey#48

Signed-off-by: Thierry Laurion <[email protected]>