Add troubleshooting doc about encrypted private key when providing TLS cert #1282

Closed
pthurnherr opened this issue Jan 5, 2018 · 22 comments
Labels
area/pub Published documentation for end-users product/ova Related to the OVA packaging of vSphere Integrated Containers source/customer Reported by a customer, directly or via an intermediary

@pthurnherr

pthurnherr commented Jan 5, 2018

VIC Product version:

OVA version vic-v1.3.0-f8cc7317.ova

Deployment details:

OVA import to vSphere 6.5. First boot shows: EXT4-fs (sda2): couldn't mount as ext3 due to feature incompatibilities

Steps to reproduce:

OVA import to vSphere Host and power up

Actual behavior:

VIC Management doesn't run

Expected behavior:

Logs:

Additional details as necessary:

@ghost

ghost commented Jan 5, 2018

@pthurnherr I’ve seen this error about ext3 incompatibilities before, and it hasn’t had any system impact.
There’s probably another reason for VIC management not starting up. Have you completed registration through the Getting Started page?

@AngieCris AngieCris added source/customer Reported by a customer, directly or via an intermediary status/need-info Additional information is needed to make progress team/lifecycle labels Jan 5, 2018
@AngieCris
Contributor

Cc @andrewtchin

@AngieCris AngieCris added the product/ova Related to the OVA packaging of vSphere Integrated Containers label Jan 5, 2018
@pthurnherr
Author

pthurnherr commented Jan 5, 2018

It looks like the ext4 message is not the problem. All disks are mounted, but Harbor and Admiral do not start, so the VIC management page is not accessible. Restarting Harbor manually with systemctl restart harbor didn't help, and the same goes for Admiral. docker ps briefly shows the container running, but it stops immediately.

@pthurnherr
Author

Another thing: if you go to the official vmware.com VIC website and try to download the VIC OVA, https://www.vmware.com/go/download-vic wrongly redirects to https://my-test41.vmware.com/en/web/vmware/info/slug/datacenter_cloud_infrastructure/vmware_vsphere_integrated_containers/1_3

@mdubya66
Contributor

mdubya66 commented Jan 5, 2018

We haven't released 1.3 yet. When it is released, it will be available on vmware.com, @pthurnherr. Someone jumped the gun on download-vic :(.

@anchal-agrawal
Contributor

@pthurnherr Are you able to successfully initialize the appliance via the Getting Started page as mentioned in Step 13 of the Procedure section in https://vmware.github.io/vic-product/assets/files/html/1.3/vic_vsphere_admin/deploy_vic_appliance.html? The Getting Started page is hosted on $appliance_ip:9443 and successful initialization is a pre-requisite to starting Admiral and Harbor. Thanks!

@AngieCris AngieCris added the status/needs-attention The issue needs to be discussed by the team label Jan 5, 2018
@pthurnherr
Author

@mdubya66 Sorry, the VMware site was, or still is, broken with that link. The 1.3 OVA was from https://storage.googleapis.com/vic-product-ova-releases and not from ...ova-builds ;)

@anchal-agrawal The initialization page was not showing up on $appliance_ip:9443

I'll wait for the official 1.3 release.

@mdubya66
Contributor

mdubya66 commented Jan 5, 2018

Someone jumped the gun, that's the test link. We have a new OVA published now due to #1286

Were you able to reach $appliance_ip:9443 at all? On my setup using Chrome, I could not until I forced the certificate acceptance. I did that by going to http://$appliance_ip first. Convoluted, I know, and something we need to document and fix.

@mdubya66 mdubya66 reopened this Jan 5, 2018
@mdubya66
Contributor

mdubya66 commented Jan 5, 2018

Also I'm re-opening this since the bits used are identical except for the version used.

@anchal-agrawal
Contributor

anchal-agrawal commented Jan 5, 2018

@pthurnherr Thanks for the update - the build you used from https://storage.googleapis.com/vic-product-ova-releases is in the pipeline for the official 1.3 release, so if you have the time, it'd be worth root-causing your issue.

After you powered on the OVA from the vSphere web client, did you see the web console of the OVA? It looks like this:
[Image: console]

Please note that it takes a few minutes (sometimes even 5-10 mins) after powering on for this screen to appear, since the OVA prepares for starting the services and only shows the web console when it is ready to be initialized.

@andrewtchin
Contributor

Additionally, after the console screen shows up, it may take more time for the web server running on the VIC Appliance to start. During this time, connection attempts to the Getting Started Page will time out. You must wait for the web server to start and then initialize the appliance from the Getting Started Page before attempting to use VIC services.
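
For anyone who wants to script the wait described here, the following is a minimal sketch rather than anything shipped with VIC. It assumes the Getting Started Page listens on TCP port 9443, as mentioned earlier in this thread; the function name, the timeout, and the polling interval are illustrative choices.

```python
import socket
import time


def wait_for_port(host, port, timeout=600.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that something is listening;
            # it does not prove that the appliance has been initialized.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A call such as wait_for_port("appliance.example.com", 9443) would block for up to ten minutes; the host name is a placeholder. If the port never opens, as happened in this issue, the next step is to check the fileserver service over SSH.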

@pthurnherr
Author

pthurnherr commented Jan 8, 2018

@anchal-agrawal The vSphere web client shows the welcome screen, but the web services won't start, so I'm not able to initialize the appliance. In the vSphere web client I also get a gateway mismatch, although the gateway is set correctly.

IP stack is up and ssh access to the appliance is possible.

[Image: vic]

@andrewtchin
Contributor

andrewtchin commented Jan 8, 2018

@pthurnherr by webservices do you mean the Getting Started Page webserver?
If you SSH in can you provide the output of systemctl status fileserver and journalctl -u fileserver? That should show the status of the Getting Started Page webserver.

@pthurnherr
Author

@andrewtchin

root@svthps02t [ ~ ]# systemctl status fileserver
● fileserver.service - VIC Unified Installer Web Server
Loaded: loaded (/usr/lib/systemd/system/fileserver.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2018-01-09 05:30:49 UTC; 909us ago
Docs: https://github.com/vmware/vic
Main PID: 2248204 (start_fileserve)
Tasks: 1
Memory: 244.0K
CPU: 192us
CGroup: /system.slice/fileserver.service
└─2248204 /usr/bin/bash /etc/vmware/fileserver/start_fileserver.sh
root@svthps02t [ ~ ]# systemctl status fileserver
● fileserver.service - VIC Unified Installer Web Server
Loaded: loaded (/usr/lib/systemd/system/fileserver.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Tue 2018-01-09 05:30:49 UTC; 4s ago
Docs: https://github.com/vmware/vic
Process: 2248204 ExecStart=/etc/vmware/fileserver/start_fileserver.sh (code=exited, status=1/FAILURE)
Main PID: 2248204 (code=exited, status=1/FAILURE)
Tasks: 0
Memory: 0B
CPU: 0
CGroup: /system.slice/fileserver.service

Jan 09 05:30:49 svthps02t.gv.li systemd[1]: fileserver.service: Main process exited, code=exited, status=1/FAILURE
Jan 09 05:30:49 svthps02t.gv.li systemd[1]: fileserver.service: Unit entered failed state.
Jan 09 05:30:49 svthps02t.gv.li systemd[1]: fileserver.service: Failed with result 'exit-code'.
root@svthps02t [ ~ ]# systemctl status fileserver

@pthurnherr
Author

@andrewtchin

Found the issue. I'm using a custom certificate, and the private key was encrypted! According to the 1.3 documentation, it must be an unencrypted PEM-encoded PKCS#8-formatted file.

@andrewtchin
Contributor

andrewtchin commented Jan 9, 2018

Great, thanks @pthurnherr! Could you provide the log line(s) that show the error message for the encrypted private key? These should be found in journalctl -u fileserver
@stuclem Should we consider a troubleshooting topic for this?

Problem:

  • Getting Started Page fails to load, even after a significant wait, when using custom certificates

Symptoms:

  • Provided custom certificates during VIC appliance deployment, including an encrypted private key
  • Able to SSH into the VIC appliance
  • systemctl status fileserver shows that fileserver failed to start

Solution:

  • Private key must be supplied in unencrypted, PEM-encoded PKCS#8 format
  • Shut down the guest OS of the VIC appliance
  • Edit the vApp settings for the appliance to include the unencrypted private key
  • Start the appliance
  • Verify that Getting Started Page shows up
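
Before editing the vApp settings, it may help to check what kind of key file you actually have. The sketch below is a header-only check and an illustration, not part of VIC: it relies on the standard PEM labels (PRIVATE KEY for unencrypted PKCS#8, ENCRYPTED PRIVATE KEY for encrypted PKCS#8, and the Proc-Type: 4,ENCRYPTED header used by traditional OpenSSL key files). The function name and return values are made up for this example, and it does not validate the key material itself.

```python
import re

# Matches one PEM block and captures its label and body.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\r?\n(.*?)-----END \1-----", re.DOTALL
)


def classify_private_key(pem_text):
    """Classify a PEM private key by its header alone (no cryptographic parsing).

    Returns "unencrypted-pkcs8", "encrypted-pkcs8", "encrypted-legacy",
    "other-unencrypted" or "no-private-key".
    """
    for label, body in PEM_BLOCK.findall(pem_text):
        if label == "ENCRYPTED PRIVATE KEY":
            return "encrypted-pkcs8"
        if label == "PRIVATE KEY":
            return "unencrypted-pkcs8"
        if label.endswith("PRIVATE KEY"):
            # Traditional formats such as "RSA PRIVATE KEY" mark encryption
            # with a Proc-Type header inside the block.
            if "Proc-Type: 4,ENCRYPTED" in body:
                return "encrypted-legacy"
            return "other-unencrypted"
    return "no-private-key"
```

Only the "unencrypted-pkcs8" result matches the requirement stated in the Solution list; whether the appliance also accepts other unencrypted formats is not established in this thread.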

@andrewtchin andrewtchin removed status/need-info Additional information is needed to make progress status/needs-attention The issue needs to be discussed by the team labels Jan 9, 2018
@ghost

ghost commented Jan 9, 2018

@pthurnherr Furthermore, it seems the gateway IP is blank because net-tools is missing from the OVA, which would cause your mismatch. We didn't catch it because our test environments use DHCP. This was tracked in #1267. @andrewtchin I have this fix in PR #1266 and can pull it out into a separate 1.3.1 PR if #1266 doesn't make it in.

@pthurnherr
Author

@andrewtchin

-- Logs begin at Tue 2018-01-09 17:04:26 UTC, end at Tue 2018-01-09 17:06:59 UTC. --
Jan 09 17:06:34 server.local systemd[1]: Started VIC Unified Installer Web Server.
Jan 09 17:06:35 server.local start_fileserver.sh[1754]: time="2018-01-09T17:06:35Z" level=info msg="Current UID/GID = 0/0"
Jan 09 17:06:35 server.local start_fileserver.sh[1754]: time="2018-01-09T17:06:35Z" level=info msg="Loading certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key"
Jan 09 17:06:35 server.local start_fileserver.sh[1754]: time="2018-01-09T17:06:35Z" level=fatal msg="Failed to load certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key: tls: failed to parse private key"
Jan 09 17:06:35 server.local systemd[1]: fileserver.service: Main process exited, code=exited, status=1/FAILURE
Jan 09 17:06:35 server.local systemd[1]: fileserver.service: Unit entered failed state.
Jan 09 17:06:35 server.local systemd[1]: fileserver.service: Failed with result 'exit-code'.
Jan 09 17:06:40 server.local systemd[1]: fileserver.service: Service hold-off time over, scheduling restart.
Jan 09 17:06:40 server.local systemd[1]: Stopped VIC Unified Installer Web Server.
Jan 09 17:06:40 server.local systemd[1]: Started VIC Unified Installer Web Server.
Jan 09 17:06:40 server.local start_fileserver.sh[1916]: time="2018-01-09T17:06:40Z" level=info msg="Current UID/GID = 0/0"
Jan 09 17:06:40 server.local start_fileserver.sh[1916]: time="2018-01-09T17:06:40Z" level=info msg="Loading certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key"
Jan 09 17:06:40 server.local start_fileserver.sh[1916]: time="2018-01-09T17:06:40Z" level=fatal msg="Failed to load certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key: tls: failed to parse private key"
Jan 09 17:06:40 server.local systemd[1]: fileserver.service: Main process exited, code=exited, status=1/FAILURE
Jan 09 17:06:40 server.local systemd[1]: fileserver.service: Unit entered failed state.
Jan 09 17:06:40 server.local systemd[1]: fileserver.service: Failed with result 'exit-code'.
Jan 09 17:06:46 server.local systemd[1]: fileserver.service: Service hold-off time over, scheduling restart.
Jan 09 17:06:46 server.local systemd[1]: Stopped VIC Unified Installer Web Server.
Jan 09 17:06:46 server.local systemd[1]: Started VIC Unified Installer Web Server.
Jan 09 17:06:46 server.local start_fileserver.sh[2076]: time="2018-01-09T17:06:46Z" level=info msg="Current UID/GID = 0/0"
Jan 09 17:06:46 server.local start_fileserver.sh[2076]: time="2018-01-09T17:06:46Z" level=info msg="Loading certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key"
Jan 09 17:06:46 server.local start_fileserver.sh[2076]: time="2018-01-09T17:06:46Z" level=fatal msg="Failed to load certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key: tls: failed to parse private key"
Jan 09 17:06:46 server.local systemd[1]: fileserver.service: Main process exited, code=exited, status=1/FAILURE
Jan 09 17:06:46 server.local systemd[1]: fileserver.service: Unit entered failed state.
Jan 09 17:06:46 server.local systemd[1]: fileserver.service: Failed with result 'exit-code'.
Jan 09 17:06:51 server.local systemd[1]: fileserver.service: Service hold-off time over, scheduling restart.
Jan 09 17:06:51 server.local systemd[1]: Stopped VIC Unified Installer Web Server.
Jan 09 17:06:51 server.local systemd[1]: Started VIC Unified Installer Web Server.
Jan 09 17:06:51 server.local start_fileserver.sh[2137]: time="2018-01-09T17:06:51Z" level=info msg="Current UID/GID = 0/0"
Jan 09 17:06:51 server.local start_fileserver.sh[2137]: time="2018-01-09T17:06:51Z" level=info msg="Loading certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key"
Jan 09 17:06:51 server.local start_fileserver.sh[2137]: time="2018-01-09T17:06:51Z" level=fatal msg="Failed to load certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key: tls: failed to parse private key"
Jan 09 17:06:51 server.local systemd[1]: fileserver.service: Main process exited, code=exited, status=1/FAILURE
Jan 09 17:06:51 server.local systemd[1]: fileserver.service: Unit entered failed state.
Jan 09 17:06:51 server.local systemd[1]: fileserver.service: Failed with result 'exit-code'.
Jan 09 17:06:56 server.local systemd[1]: fileserver.service: Service hold-off time over, scheduling restart.
Jan 09 17:06:56 server.local systemd[1]: Stopped VIC Unified Installer Web Server.
Jan 09 17:06:56 server.local systemd[1]: Started VIC Unified Installer Web Server.
Jan 09 17:06:56 server.local start_fileserver.sh[2244]: time="2018-01-09T17:06:56Z" level=info msg="Current UID/GID = 0/0"
Jan 09 17:06:56 server.local start_fileserver.sh[2244]: time="2018-01-09T17:06:56Z" level=info msg="Loading certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key"
Jan 09 17:06:56 server.local start_fileserver.sh[2244]: time="2018-01-09T17:06:56Z" level=fatal msg="Failed to load certificate /opt/vmware/fileserver/cert/server.crt and key /opt/vmware/fileserver/cert/server.key: tls: failed to parse private key"
Jan 09 17:06:56 server.local systemd[1]: fileserver.service: Main process exited, code=exited, status=1/FAILURE
Jan 09 17:06:56 server.local systemd[1]: fileserver.service: Unit entered failed state.
Jan 09 17:06:56 server.local systemd[1]: fileserver.service: Failed with result 'exit-code'.
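
Since the same failure repeats on every restart, a small filter can pull out the relevant message. The following is a rough sketch, not a VIC tool: it assumes each journal entry sits on a single line (for example, output captured with journalctl --no-pager) and uses the level=... msg="..." fields seen in this log. Both function names are hypothetical.

```python
import re

# Captures the quoted msg field of entries logged at level=fatal.
FATAL = re.compile(r'level=fatal msg="((?:[^"\\]|\\.)*)"')


def fatal_messages(journal_text):
    """Return each distinct level=fatal message with its number of occurrences."""
    counts = {}
    for line in journal_text.splitlines():
        match = FATAL.search(line)
        if match:
            counts[match.group(1)] = counts.get(match.group(1), 0) + 1
    return counts


def is_key_parse_failure(journal_text):
    """True if any fatal message ends with the key-parsing error from this log."""
    return any(
        msg.endswith("tls: failed to parse private key")
        for msg in fatal_messages(journal_text)
    )
```

A True result from is_key_parse_failure points at the private key file as the thing to inspect, although the message alone does not say whether the key is encrypted or malformed in some other way.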

@stuclem
Contributor

stuclem commented Jan 9, 2018

@andrewtchin yes it looks like a candidate for a TS topic. Can I just repurpose this issue, or do you want to keep this one and have me open a new one?

@andrewtchin andrewtchin added area/pub Published documentation for end-users and removed team/lifecycle labels Jan 10, 2018
@andrewtchin
Contributor

Thanks @pthurnherr!
@stuclem You can repurpose, removing dev labels

@andrewtchin andrewtchin changed the title from "Issue with 1.3.0 OVA image" to "Add troubleshooting doc about encrypted private key when providing TLS cert" Jan 26, 2018
@stuclem stuclem self-assigned this Aug 17, 2018
@stuclem
Contributor

stuclem commented Aug 17, 2018

It seems that more than one customer has hit this. We do state that the key must be unencrypted, but a TS topic that includes the error wouldn't hurt.

@stuclem
Contributor

stuclem commented Aug 30, 2018

@vidya-v can you please take the info from #1282 (comment) above and make it into a new troubleshooting topic? It would need to be nested under https://vmware.github.io/vic-product/assets/files/html/1.4/vic_vsphere_admin/ts_deploy_appliance.html

You can use one of the other troubleshooting topics as a template, for example https://vmware.github.io/vic-product/assets/files/html/1.4/vic_vsphere_admin/ts_cert_error.html.

Thanks!
