Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Installs a handler to delete active pods on termination #46

Merged
merged 17 commits into from
Feb 12, 2019

Conversation

dleehr
Copy link
Member

@dleehr dleehr commented Feb 8, 2019

  • Adds k8s.PodMonitor class that keeps track of submitted but not deleted pods
  • Adds k8s.delete_pods() function to be called on sigterm or if cwlmain() raises an exception
  • installs SIGTERM handler into main.main() to invoke k8s.delete_pods()
  • Addresses a couple inconsistencies in tests

Fixes #41

@dleehr dleehr requested a review from johnbradley February 8, 2019 21:20
Copy link
Contributor

@johnbradley johnbradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question about a possible resource leak. A couple testing tweaks.
It would also be nice to have a comment explaining which parts are run in a background thread to help understanding of how these locks are being used - Is the main thread responsible for receiving the signal and doing the final pod cleanup while the background threads create and delete pods individually?

@@ -46,6 +47,7 @@ def __init__(self):
def submit_pod(self, pod_body):
pod = self.core_api_instance.create_namespaced_pod(self.namespace, pod_body)
log.info('Created k8s pod name {} with id {}'.format(pod.metadata.name, pod.metadata.uid))
PodMonitor.add(pod)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a possibility that a SIGTERM arrives just before this method, the main thread runs delete_pods() and the pod created here is leaked?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we test that PodMonitor.add is called when users run submit_pod

@@ -98,7 +97,8 @@ def test_wait_skips_pod_when_state_is_running(self, mock_watch, mock_get_namespa
self.assertIsNotNone(kc.pod)

@patch('calrissian.k8s.watch')
def test_wait_finishes_when_pod_state_is_terminated(self, mock_watch, mock_get_namespace, mock_client):
@patch('calrissian.k8s.PodMonitor')
def test_wait_finishes_when_pod_state_is_terminated(self, mock_podmonitor, mock_watch, mock_get_namespace, mock_client):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we test that PodMonitor.remove is called here?

@dleehr
Copy link
Member Author

dleehr commented Feb 11, 2019

Thanks for the feedback @johnbradley. I made a few changes to guard against the case you described and added some documentation on threading assumptions.

@dleehr dleehr requested a review from johnbradley February 12, 2019 14:35
Copy link
Contributor

@johnbradley johnbradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dleehr dleehr merged commit a2fcc7a into master Feb 12, 2019
@dleehr dleehr deleted the 38-cleanup-on-terminate branch February 13, 2019 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants