Job restart policy: always does not work if platform-api-poller losts connection with platform-api #2150

Open
YevheniiSemendiak opened this issue Oct 23, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@YevheniiSemendiak
Contributor

src: https://neuromation.slack.com/archives/CE9SZE5B3/p1698047620634109

Symptoms:

  1. A job was created with --restart-policy always.
  2. platform-api-poller lost its connection to platform-api.
  3. The job was not restarted or recreated after connectivity was restored.

The network interruption between the poller and the API might not be the root cause; a networking issue within the cluster itself could have caused the problem. This needs further investigation.
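
To make the expected behavior concrete, here is a minimal sketch of the decision the poller would presumably need to re-run for every known job once connectivity returns. It is not taken from the platform-api code base: the `Job`, `RestartPolicy`, and `JobStatus` types, the policy values other than `always`, and the function names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class RestartPolicy(Enum):
    # Only "always" appears in this issue; the other values are assumed.
    NEVER = "never"
    ON_FAILURE = "on-failure"
    ALWAYS = "always"


class JobStatus(Enum):
    # Hypothetical, simplified set of job states.
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class Job:
    id: str
    restart_policy: RestartPolicy
    status: JobStatus


def should_restart(job: Job) -> bool:
    """Return True if a job that is no longer running should be restarted."""
    if job.status is JobStatus.RUNNING:
        return False
    if job.restart_policy is RestartPolicy.ALWAYS:
        return True
    if job.restart_policy is RestartPolicy.ON_FAILURE:
        return job.status is JobStatus.FAILED
    return False


def jobs_to_restart_after_reconnect(jobs: Iterable[Job]) -> List[str]:
    """Re-evaluate every known job once the connection is back.

    The point of the sketch: the decision depends on the current job
    state, not on whether the poller observed the state change live.
    """
    return [job.id for job in jobs if should_restart(job)]
```

Under this model, a job with restart policy always that stopped during the outage would be picked up by the first reconciliation pass after the reconnect, which is the behavior that was not observed here.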

@YevheniiSemendiak YevheniiSemendiak added the bug Something isn't working label Oct 23, 2023
@YevheniiSemendiak YevheniiSemendiak changed the title Job restart policy: always does not work if platform-api-poller losts connection to platform-api Job restart policy: always does not work if platform-api-poller losts connection with platform-api Oct 23, 2023