Improve or eliminate the TCP Kubernetes worker scheme #278

ghjm · 2020-12-16T20:08:07Z

Currently, Receptor has two ways of running Kubernetes jobs. It can launch a pod that runs the same kind of worker as a work-command (i.e., the pod command just expects to receive stdin and then produce stdout) and use the k8s logger function to retrieve the resulting stdout. Or it can launch a worker that is expected to make a one-time TCP connection back to the Receptor process and then process stdin/stdout over that connection.
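To make the difference between the two modes concrete, here is a rough Python sketch of what a worker looks like in each case. This is illustrative only and is not Receptor's actual worker code: `run_job`, `run_stdio_worker`, and `run_tcp_worker` are hypothetical names, and the upper-casing job is a stand-in for real work.

```python
import io
import socket
import sys
from typing import BinaryIO


def run_job(inp: BinaryIO, out: BinaryIO) -> None:
    """Hypothetical job: read all of the input, write it back upper-cased."""
    out.write(inp.read().upper())
    out.flush()


def run_stdio_worker() -> None:
    """Logger-style worker: plain stdin in, stdout out.

    Whatever is written to stdout would be picked up through the pod's logs.
    """
    run_job(sys.stdin.buffer, sys.stdout.buffer)


def run_tcp_worker(host: str, port: int) -> None:
    """TCP-style worker: one connection back to the launcher, used as stdin/stdout.

    Assumes the remote side half-closes its end after sending the input,
    so that inp.read() sees EOF. If the connection drops, the job's
    stream is gone, which is the restart problem discussed in this issue.
    """
    with socket.create_connection((host, port)) as conn:
        with conn.makefile("rb") as inp, conn.makefile("wb") as out:
            run_job(inp, out)
```

The job itself is identical in both modes; only the transport that carries stdin and stdout changes.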

The advantage of the TCP scheme is that Receptor work streams are not tied up with Kubernetes internals. For example, a worker that produces many gigabytes of output via the Kubernetes logger is probably imposing on Kubernetes the duty to store all that data somewhere.

The disadvantage of the TCP scheme is that in its current form, it cannot survive a Receptor restart. With other types of work such as work-command or the logger version of work-kubernetes, Receptor will resume monitoring already-running jobs after a Receptor restart. This is not possible with the current TCP scheme because at Receptor shutdown, the worker loses its TCP connection.

So our choices are either:

Drop the TCP worker scheme and only use the k8s logger, or
Come up with a new and more complex TCP worker scheme that solves this problem.

To do the latter, we would need some protocol over the TCP stream that allows reconnecting and resuming a stream from a given byte position, and we would need workers to support this. The most user-friendly way of handling this would probably be to have a worker pod include two containers, one of which is a normal stdin/stdout worker and the other of which is a Receptor-provided image that handles the TCP communication.
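As a rough illustration of the resume-from-byte-position idea, the following Python sketch models the two sides in memory. Everything here is an assumption made for the example, not a proposed wire format: the names `OutputBuffer`, `Receiver`, `encode_resume`, and `decode_resume` are hypothetical, the resume request is assumed to be a single 8-byte big-endian offset, and the sidecar is assumed to keep all output in memory (a real one would presumably need to spill to disk or cap retention).

```python
import struct


class OutputBuffer:
    """Sidecar side: remembers everything the worker has written so far."""

    def __init__(self) -> None:
        self._data = bytearray()

    def append(self, chunk: bytes) -> None:
        self._data.extend(chunk)

    def read_from(self, offset: int) -> bytes:
        """Return all buffered output starting at the given byte offset."""
        if offset < 0 or offset > len(self._data):
            raise ValueError(f"offset {offset} is outside the buffered range")
        return bytes(self._data[offset:])


def encode_resume(offset: int) -> bytes:
    """Assumed resume request: one unsigned 64-bit big-endian byte offset."""
    return struct.pack("!Q", offset)


def decode_resume(frame: bytes) -> int:
    (offset,) = struct.unpack("!Q", frame)
    return offset


class Receiver:
    """Receptor side: tracks how many bytes of output it has received."""

    def __init__(self) -> None:
        self.received = bytearray()

    def resume_request(self) -> bytes:
        # Ask for everything after the last byte we already hold.
        return encode_resume(len(self.received))

    def accept(self, data: bytes) -> None:
        self.received.extend(data)


# Simulated session: the connection drops after the first chunk is delivered.
sidecar = OutputBuffer()
receptor = Receiver()

sidecar.append(b"first chunk;")
receptor.accept(sidecar.read_from(decode_resume(receptor.resume_request())))

# Receptor restarts here; the worker keeps producing output meanwhile.
sidecar.append(b"second chunk;")

# On reconnect, only the bytes Receptor has not yet seen are sent.
missing = sidecar.read_from(decode_resume(receptor.resume_request()))
receptor.accept(missing)
```

For the offset to survive a Receptor restart, the receiving side would have to persist its byte count (or derive it from the stdout it has already saved) rather than hold it only in memory as this sketch does.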

stanislav-zaprudskiy mentioned this issue Apr 28, 2022

Runner seems to have difficulties to process large outputs ansible/ansible-runner#998

Closed
