Quick start with EM stuck at 87%; can't complete write-netplan step #143

Closed
jpablo-eqx opened this issue Jun 30, 2022 · 1 comment · Fixed by tinkerbell/tink#632

@jpablo-eqx

Following the quick start guide for Equinix Metal, the workflow gets stuck at 87%. It appears to fail on the write-netplan step: the last action reported as successful is disable-apparmor.

+----------------------+--------------------------------------+
| FIELD NAME           | VALUES                               |
+----------------------+--------------------------------------+
| Workflow ID          | e193da91-f82a-11ec-986a-0242ac120003 |
| Workflow Progress    | 87%                                  |
| Current Task         | os-installation                      |
| Current Action       | disable-apparmor                     |
| Current Worker       | 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 |
| Current Action State | STATE_SUCCESS                        |
+----------------------+--------------------------------------+

Expected Behaviour

The workflow should run every action and reach 100%.

Current Behaviour

+----------------------+--------------------------------------+
| FIELD NAME           | VALUES                               |
+----------------------+--------------------------------------+
| Workflow ID          | e193da91-f82a-11ec-986a-0242ac120003 |
| Workflow Progress    | 87%                                  |
| Current Task         | os-installation                      |
| Current Action       | disable-apparmor                     |
| Current Worker       | 0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94 |
| Current Action State | STATE_SUCCESS                        |
+----------------------+--------------------------------------+

Here are the relevant logs (I think):

tink-server_1                    | {"level":"info","ts":1656563265.5266938,"caller":"server/dbserver_workflow.go:234","msg":"done getting a workflow context","service":"github.com/tinkerbell/tink","workflowID":"e193da91-f82a-11ec-986a-0242ac120003","currentWorker":"0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94","currentTask":"os-installation","currentAction":"disable-apparmor","currentActionIndex":"6","currentActionState":"STATE_SUCCESS","totalNumberOfActions":8}
tink-server_1                    | {"level":"info","ts":1656563266.8126962,"caller":"server/dbserver_worker_workflow.go:82","msg":"received action status: STATE_RUNNING","service":"github.com/tinkerbell/tink","actionName":"write-netplan","workflowID":"e193da91-f82a-11ec-986a-0242ac120003","taskName":"os-installation"}
boots_1                          | {"level":"info","ts":1656563266.8126342,"caller":"syslog/receiver.go:107","msg":"host=192.168.56.43 facility=daemon severity=ERR app-name=eafa7d318d9c procid=1198 msg=\"{\\\"level\\\":\\\"info\\\",\\\"ts\\\":1656563265.1012092,\\\"caller\\\":\\\"worker/worker.go:442\\\",\\\"msg\\\":\\\"reporting Action Status\\\",\\\"service\\\":\\\"github.com/tinkerbell/tink\\\",\\\"workerID\\\":\\\"0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94\\\",\\\"workflowID\\\":\\\"e193da91-f82a-11ec-986a-0242ac120003\\\",\\\"actionName\\\":\\\"write-netplan\\\",\\\"taskName\\\":\\\"os-installation\\\",\\\"workflowID\\\":\\\"e193da91-f82a-11ec-986a-0242ac120003\\\",\\\"workerID\\\":\\\"0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94\\\",\\\"actionName\\\":\\\"write-netplan\\\",\\\"taskName\\\":\\\"os-installation\\\",\\\"status\\\":\\\"STATE_RUNNING\\\"}\\n\"","service":"github.com/tinkerbell/boots","pkg":"syslog"}
boots_1                          | {"level":"info","ts":1656563266.8141456,"caller":"syslog/receiver.go:107","msg":"host=192.168.56.43 facility=daemon severity=ERR app-name=eafa7d318d9c procid=1198 msg=\"{\\\"level\\\":\\\"error\\\",\\\"ts\\\":1656563265.1027963,\\\"caller\\\":\\\"worker/worker.go:445\\\",\\\"msg\\\":\\\"failed to report action status: rpc error: code = FailedPrecondition desc = invalid action index for workflow\\\",\\\"service\\\":\\\"github.com/tinkerbell/tink\\\",\\\"workerID\\\":\\\"0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94\\\",\\\"workflowID\\\":\\\"e193da91-f82a-11ec-986a-0242ac120003\\\",\\\"actionName\\\":\\\"write-netplan\\\",\\\"taskName\\\":\\\"os-installation\\\",\\\"workflowID\\\":\\\"e193da91-f82a-11ec-986a-0242ac120003\\\",\\\"workerID\\\":\\\"0eba0bf8-3772-4b4a-ab9f-6ebe93b90a94\\\",\\\"actionName\\\":\\\"write-netplan\\\",\\\"taskName\\\":\\\"os-installation\\\",\\\"status\\\":\\\"STATE_RUNNING\\\",\\\"error\\\":\\\"failed to report action status: rpc error: code = FailedPrecondition desc = invalid action index for workflow\\\",\\\"errorVerbose\\\":\\\"rpc error: code = FailedPrecondition desc = invalid action index for workflow\\\\nfailed to report action status\\\\ngithub.com/tinkerbell/tink/cmd/tink-worker/worker.(*Worker).reportActionStatus\\\\n\\\\t/home/runner/work/tink/tink/cmd/tink-worker/worker/worker.go:445\\\\ngithub.com/tin\"","service":"github.com/tinkerbell/boots","pkg":"syslog"}
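The 87% figure can be cross-checked against the tink-server log line above. The following is a minimal sketch, not tink's actual progress calculation: it assumes that `currentActionIndex` is zero-based and that progress is the number of completed actions divided by `totalNumberOfActions`, truncated to a whole percent. The `progress_percent` helper and the trimmed `log_line` are illustrative only.

```python
import json

# Trimmed copy of the tink-server log entry from the issue; only the
# fields needed for the arithmetic are kept.
log_line = (
    '{"currentAction":"disable-apparmor","currentActionIndex":"6",'
    '"currentActionState":"STATE_SUCCESS","totalNumberOfActions":8}'
)


def progress_percent(entry: dict) -> int:
    """Assumed formula: completed actions / total actions, truncated."""
    index = int(entry["currentActionIndex"])  # logged as a string; assumed zero-based
    total = entry["totalNumberOfActions"]
    # An action counts as completed only once it has reported STATE_SUCCESS.
    completed = index + 1 if entry["currentActionState"] == "STATE_SUCCESS" else index
    return completed * 100 // total


print(progress_percent(json.loads(log_line)))  # prints 87
```

Under these assumptions, 7 of 8 actions have completed, which matches the reported 87% and suggests that write-netplan is the eighth and final action.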

Possible Solution

Steps to Reproduce (for bugs)

  1. Follow the quick start guide for Equinix Metal

Context

Your Environment

  • Operating System and version (e.g. Linux, Windows, MacOS):
    macOS
  • How are you running Tinkerbell? Using Vagrant & VirtualBox, Vagrant & Libvirt, on Packet using Terraform, or give details:
    Equinix Metal with Terraform
  • Link to your project or a code example to reproduce issue:
@jpablo-eqx changed the title from "Stuck at 87%; can't complete write-netplan step" to "Quick start with EM stuck at 87%; can't complete write-netplan step" on Jun 30, 2022
@jacobweinstock
Member

Hey @jpablo-eqx, thanks for reporting this. I believe that this PR: tinkerbell/tink#632 should fix the issue.

mergify bot added a commit to tinkerbell/tink that referenced this issue Jul 6, 2022
## Description


This allows the final action in a workflow to successfully report its status and consequently allow a workflow to complete.
This issue: #559 incorrectly describes a fix that was applied in this PR: #576
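
For readers unfamiliar with this class of bug, here is a purely hypothetical sketch of how an off-by-one bounds check could reject the final action's status report with an "invalid action index" error. It is not the code changed in this PR; the function names are invented for illustration, and the only values taken from the issue are the total of 8 actions and the assumption that action indices are zero-based.

```python
TOTAL_ACTIONS = 8  # totalNumberOfActions from the tink-server log in the issue


def is_valid_index_off_by_one(index: int, total: int) -> bool:
    # Hypothetical faulty check: the upper bound stops one short, so the
    # last valid zero-based index (total - 1) is rejected.
    return 0 <= index < total - 1


def is_valid_index(index: int, total: int) -> bool:
    # Corrected check: every zero-based index from 0 to total - 1 is valid.
    return 0 <= index < total


last_index = TOTAL_ACTIONS - 1  # index of the final action

print(is_valid_index_off_by_one(last_index, TOTAL_ACTIONS))  # False: final action rejected
print(is_valid_index(last_index, TOTAL_ACTIONS))             # True: final action accepted
```

With a check like the faulty one, every action except the last would report normally, and the workflow would stall one action short of completion.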

## Why is this needed



Fixes: tinkerbell/playground#143

## How Has This Been Tested?



Unit tests were added and the sandbox (vagrant virtualbox quickstart) was used to manually test.

## How are existing users impacted? What migration steps/scripts do we need?





## Checklist:

I have:

- [ ] updated the documentation and/or roadmap (if required)
- [x] added unit or e2e tests
- [ ] provided instructions on how to upgrade
mergify bot added a commit that referenced this issue Aug 2, 2022
## Description
The image currently used on the main branch doesn't work because of the issue reported in #143. That issue was already addressed in tinkerbell/tink#632. This PR switches to the latest `tink-server`, `tink-worker` and `tink-cli` images, which already include the fix.


## Why is this needed



Fixes: #

## How Has This Been Tested?



- Checked out the main branch
- Updated the env var `vTINK` to `sha-16186501`
- Ran docker compose
- Provisioned all the needed manifests (HW, WFL, TPL)
- Rebooted the machine

## How are existing users impacted? What migration steps/scripts do we need?
I haven't updated any components or images other than the tink image.




## Checklist:

I have:

- [ ] updated the documentation and/or roadmap (if required)
- [ ] added unit or e2e tests
- [ ] provided instructions on how to upgrade
ttwd80 pushed a commit to ttwd80/tinkerbell-playground that referenced this issue Sep 7, 2024
…ell#146)
