Endpoint becomes unresponsive after several hours #7680

Closed
hickeng opened this issue Apr 10, 2018 · 6 comments · Fixed by #8154
Labels
kind/defect Behavior that is inconsistent with what's intended priority/p1 source/customer Reported by a customer, directly or via an intermediary status/needs-attention The issue needs to be discussed by the team team/container team/foundation team/lifecycle

Comments

@hickeng
Member

hickeng commented Apr 10, 2018

The following is seen in journalctl output:

Mar 20 08:32:11 Linux systemd[1]: vic-init.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 20 08:32:11 Linux systemd[1]: vic-init.service: Unit entered failed state.
Mar 20 08:32:11 Linux systemd[1]: vic-init.service: Failed with result 'exit-code'.

Current speculation is that this is related to a DHCP lease failure on the management network, but it is not clear why that would cause vic-init to exit.

A possible fix is to use --asymmetric-routes, but that is pure speculation based on martian packets being logged.

Bug2084707 tracks this and contains the original details and logs.
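
For anyone attempting a recreate, the journal lines above can be matched mechanically. The following is a minimal sketch, not part of VIC: the `UnitExit` type and `ParseUnitExit` function are hypothetical names, and the pattern is derived only from the three log lines quoted in this issue, so other systemd versions may format the message differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// exitLine matches the systemd "Main process exited" message in the form
// quoted in this issue. The symbolic name after the slash is optional.
var exitLine = regexp.MustCompile(`(\S+\.service): Main process exited, code=(\w+), status=(\d+)(?:/(\w+))?`)

// UnitExit (hypothetical type) holds the fields extracted from one journal line.
type UnitExit struct {
	Unit   string // e.g. "vic-init.service"
	Code   string // e.g. "exited"
	Status int    // numeric exit status
	Name   string // systemd's symbolic name for the status, if present
}

// ParseUnitExit returns the exit details for a journal line, or false if the
// line is not a main-process exit message.
func ParseUnitExit(line string) (UnitExit, bool) {
	m := exitLine.FindStringSubmatch(line)
	if m == nil {
		return UnitExit{}, false
	}
	status, err := strconv.Atoi(m[3])
	if err != nil {
		return UnitExit{}, false
	}
	return UnitExit{Unit: m[1], Code: m[2], Status: status, Name: m[4]}, true
}

func main() {
	line := "Mar 20 08:32:11 Linux systemd[1]: vic-init.service: Main process exited, code=exited, status=2/INVALIDARGUMENT"
	if e, ok := ParseUnitExit(line); ok {
		fmt.Printf("%s: code=%s status=%d name=%s\n", e.Unit, e.Code, e.Status, e.Name)
	}
}
```

Feeding each line of `journalctl -u vic-init.service` through `ParseUnitExit` would flag the moment the unit's main process exits, which may help correlate the failure with DHCP lease events on the endpoint.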

@hickeng hickeng added status/needs-attention The issue needs to be discussed by the team source/customer Reported by a customer, directly or via an intermediary priority/p1 labels Apr 10, 2018
@mdubya66 mdubya66 added kind/defect Behavior that is inconsistent with what's intended team/foundation team/lifecycle team/container labels Apr 10, 2018
@hickeng
Member Author

hickeng commented Apr 10, 2018

@dbarkelew Opened this issue for tracking. If you manage a recreate please let me know.

@hickeng
Member Author

hickeng commented Apr 10, 2018

Added to the release as we've had several customers run into this. We need at least a root cause and documentation covering it for 1.4, if not a code fix.

@gigawhitlocks
Contributor

Moving back to ToDo to pick up a higher-priority issue. Sorry for all the pipeline spam in the history of this ticket.

@hickeng
Member Author

hickeng commented Jun 20, 2018

Next step:
Modify the DHCP client to support asymmetric routes. rp_filter may be configured correctly for asymmetry, but I suspect that, because the DHCP client is listening on a specific interface, we will not be getting the benefit of that.

This could also address bug2139912 if the asymmetric routing configuration were propagated into the containerVMs (maybe associated with a specific network?).

@zjs
Member

zjs commented Sep 13, 2018

I think this is worth a release note, as it was a customer-reported issue.

@zjs zjs added the impact/doc/note Requires creation of or changes to an official release note label Sep 13, 2018
@stuclem
Contributor

stuclem commented Sep 17, 2018

Added as resolved issue in https://github.com/vmware/vic/releases/tag/v1.4.3

@stuclem stuclem removed the impact/doc/note Requires creation of or changes to an official release note label Sep 17, 2018
7 participants