Send messages from scheduler to tasks #2426

riga · 2018-05-18T14:27:36Z

Description

This PR enables the scheduler to send messages to particular tasks. More precisely, messages are sent to workers via the RPC message system, which dispatches them to the specified task. Running tasks can access a multiprocessing.Queue object (named scheduler_messages, set via the StatusReporter) that stores incoming messages. Users can implement custom behavior to react to certain messages (see code example below).

The new config scheduler.send_messages controls whether a scheduler is capable of sending messages in the first place. By defining accepted_messages, a task declares whether it accepts messages, independent of the scheduler the worker is talking to. The message button is only visible in the UI when the scheduler is able to send messages and the task can receive them (see screenshots).

There is also a simple "respond" mechanism so tasks can immediately respond to incoming messages. The response is shown in the scheduler.
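The flow described above (scheduler sends, task reads from a queue, task responds) can be sketched with the standard library alone. This is a minimal illustration, not luigi's implementation: SketchMessage, SketchScheduler, and their methods are hypothetical names, and a plain queue.Queue stands in for the multiprocessing.Queue a running task would see.

```python
import queue


class SketchMessage:
    """Hypothetical message object: carries content and lets the task respond."""

    def __init__(self, scheduler, task_id, message_id, content):
        self._scheduler = scheduler
        self.task_id = task_id
        self.message_id = message_id
        self.content = content

    def respond(self, response):
        # Hand the response back to the scheduler, keyed by task and message id.
        self._scheduler.responses[(self.task_id, self.message_id)] = response


class SketchScheduler:
    """Hypothetical scheduler: one inbox per task plus a response store."""

    def __init__(self):
        self.inboxes = {}    # task_id -> queue of SketchMessage
        self.responses = {}  # (task_id, message_id) -> response text
        self._next_id = 0

    def register_task(self, task_id):
        self.inboxes[task_id] = queue.Queue()
        return self.inboxes[task_id]

    def send_message(self, task_id, content):
        message_id = self._next_id
        self._next_id += 1
        self.inboxes[task_id].put(SketchMessage(self, task_id, message_id, content))
        return message_id


scheduler = SketchScheduler()
inbox = scheduler.register_task("Training_1")
sent_id = scheduler.send_message("Training_1", "status?")

# The task side: read the message and answer it.
msg = inbox.get_nowait()
msg.respond("still training")
```

The message id returned by send_message is what lets the sender look up the matching response later.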

"Send message" action button:

Message prompt:

Awaiting response:

Response arrived:

When Await response is not checked, the prompt closes upon Send. Otherwise, the prompt remains open and a response container is shown, which triggers simple polling to fetch and display the actual response.
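The polling step could look roughly like the sketch below. It is written in Python for illustration only; poll_for_response and fetch_response are hypothetical names, and the interval and attempt limit are assumptions rather than values taken from this PR.

```python
import time


def poll_for_response(fetch_response, interval=0.5, max_attempts=20, sleep=time.sleep):
    """Call fetch_response() until it returns a non-None value or attempts run out."""
    for _ in range(max_attempts):
        response = fetch_response()
        if response is not None:
            return response
        sleep(interval)
    return None


# Usage with a fake backend that answers on the third poll.
answers = iter([None, None, "unknown message"])
result = poll_for_response(lambda: next(answers), interval=0.0)
```

Bounding the number of attempts keeps the prompt from polling forever when a task never responds.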

Motivation and Context

There are many use cases where messaging could be useful. One particular example is machine learning. Our training tasks publish their current results to the scheduler. Although they contain predefined termination criteria, it would be nice to manually trigger graceful termination if the training stagnates:

import luigi


class Training(luigi.Task):

    def requires(self):
        ...

    def output(self):
        return luigi.LocalTarget("my_ml_model.pb")

    def run(self):
        # setup the training
        model = ...

        for _ in training_loop():
            # training stuff with internal termination criterion
            model.train(...)

            # check messages
            if not self.scheduler_messages.empty():
                msg = self.scheduler_messages.get()
                if msg.content == "terminate":
                    break
                else:
                    msg.respond("unknown message")

        # save the model
        model.save(self.output().path)
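The example above reads at most one message per loop iteration. If several messages can arrive between iterations, one option is to drain everything that is waiting with get_nowait(), which raises queue.Empty once the queue has nothing left. The helper name drain_messages is hypothetical, and a queue.Queue stands in for the task's multiprocessing.Queue so the sketch stays deterministic.

```python
import queue


def drain_messages(message_queue):
    """Return all messages currently waiting, without blocking."""
    pending = []
    while True:
        try:
            pending.append(message_queue.get_nowait())
        except queue.Empty:
            return pending


q = queue.Queue()
q.put("terminate")
q.put("status?")
messages = drain_messages(q)
```

Inside run(), the task could then iterate over the returned list and react to each message in turn.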

Have you tested this? If so, how?

Yep, I added unit tests and docs.

Tarrasch

Really cool!

In addition to the code comments, please consider:

Docs would be really great!
I think adding the button unconditionally in the UI isn't a great idea. Most people will play around with it and see that nothing happens. Any way we can make the worker register that it's open to taking messages? Perhaps in the future it could even be a description of what kind of messages it accepts.

Keep up the awesome work!

Tarrasch · 2018-05-18T19:49:12Z

luigi/scheduler.py

+            return
+
+        kwargs = dict(task_id=task, message=message)
+        self._state.get_worker(worker).add_rpc_message('dispatch_scheduler_message', **kwargs)


Maybe cleaner to inline the kwargs?

Yes, will be in the next commits.

Tarrasch · 2018-05-18T19:59:45Z

luigi/worker.py

@@ -1148,3 +1155,14 @@ def set_worker_processes(self, n):

        # tell the scheduler
        self._scheduler.add_worker(self._id, {'workers': self.worker_processes})
+
+    @rpc_message_callback
+    def dispatch_scheduler_message(self, task_id, message):


Add , **kwargs): so that we can add more things in the future (without breaking old workers). A next step could be for a message to have an ID so the worker can report back that it got the message, perhaps even "reply", etc.

Implemented a draft of this reply/response mechanism.
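The forward-compatibility argument for **kwargs can be shown in isolation. The two functions below are hypothetical stand-ins for an old-style and a tolerant worker callback, and message_id plays the role of a field a newer scheduler starts sending.

```python
def dispatch_strict(task_id, message):
    return (task_id, message)


def dispatch_tolerant(task_id, message, **kwargs):
    # Unknown keyword arguments from a newer scheduler are accepted and ignored.
    return (task_id, message)


# A newer scheduler starts sending an extra field.
payload = dict(task_id="Training_1", message="terminate", message_id=7)

tolerant_result = dispatch_tolerant(**payload)

try:
    dispatch_strict(**payload)
    strict_failed = False
except TypeError:
    # The strict signature rejects the unexpected keyword argument.
    strict_failed = True
```

With the strict signature, an old worker would fail as soon as the scheduler adds a field; the tolerant one keeps working.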

riga · 2018-05-23T15:03:19Z

I implemented the two changes you requested, and now, the action button is only shown when the worker is allowed to receive messages.

In addition I added some simple response functionality (the screenshots in the PR description show the new behavior). In the next iteration, one could also add descriptions of what messages a task accepts and maybe even which values are accepted with some nice dropdown buttons in the scheduler. However, this requires a well-defined interface on task level which might be something for the next PR.

If you're fine with the idea, I'll append additional tests and docs.

Tarrasch

So far so good really. Thanks for doing the message_id thing!

Yes, please focus on fixing/documenting existing functionalities of this PR rather than adding yet another one. I'm happy to review new features too once this PR is in. :)

Tarrasch · 2018-05-24T20:56:48Z

luigi/scheduler.py

@@ -932,6 +936,31 @@ def disable_worker(self, worker):
    def set_worker_processes(self, worker, n):
        self._state.get_worker(worker).add_rpc_message('set_worker_processes', n=n)

+    @rpc_method()
+    def send_scheduler_message(self, worker, task, message):


The parameter name message should maybe be renamed to content or payload throughout to make it easier to follow. I at least got confused that message became content in the scheduler.

Tarrasch · 2018-05-24T21:00:01Z

luigi/scheduler.py

+        self._state.get_worker(worker).add_rpc_message('dispatch_scheduler_message', task_id=task,
+                                                       message_id=message_id, message=message)
+
+        return {"messageId": message_id}


Yuck, I just realized that we mix camelCase and snake_case in the key names. But it seems snake_case is used more. Maybe use that?

Tarrasch · 2018-05-24T21:03:10Z

luigi/worker.py

@@ -392,6 +420,10 @@ class worker(Config):
                                          description='If true, use multiprocessing also when '
                                          'running with 1 worker')

+    receive_messages = BoolParameter(default=True,


Maybe can_receive_messages?

Also, I'm thinking of having the default be False, so the UI button doesn't appear by default.

Tarrasch · 2018-05-24T21:03:59Z

luigi/worker.py

+
+        assert self._config.wait_interval >= _WAIT_INTERVAL_EPS, "[worker] wait_interval must be positive"
+        assert self._config.wait_jitter >= 0.0, "[worker] wait_jitter must be equal or greater than zero"
+


Why the need to move this code?

(feel free to if you have a reason, I'm mostly curious)

This was required as _generate_worker_info() needs to access _config.
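The ordering constraint behind this move can be reproduced in a few lines. SketchWorker and BrokenWorker are hypothetical classes, not the luigi Worker; they only show that a helper reading self._config must run after that attribute is assigned.

```python
class SketchWorker:
    def __init__(self, config):
        # _config must be assigned before any helper that reads it runs.
        self._config = config
        self._info = self._generate_info()

    def _generate_info(self):
        return {"receive_messages": self._config["receive_messages"]}


class BrokenWorker(SketchWorker):
    def __init__(self, config):
        # Wrong order: the helper runs before _config exists.
        self._info = self._generate_info()
        self._config = config


worker = SketchWorker({"receive_messages": True})
```

Constructing BrokenWorker raises AttributeError, which is the kind of failure the reordering avoids.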

Tarrasch · 2018-05-24T21:15:27Z

luigi/worker.py

@@ -540,7 +572,8 @@ def _generate_worker_info(self):
        # Generate as much info as possible about the worker
        # Some of these calls might not be available on all OS's
        args = [('salt', '%09d' % random.randrange(0, 999999999)),
-                ('workers', self.worker_processes)]
+                ('workers', self.worker_processes),
+                ('receive_messages', self._config.receive_messages)]


Hmmm... wait, why is this a property of the worker...

Thinking about it I'm sure it makes sense. But I really would like to see some user docs.

Mh, but you have a point. An alternative would be to have something like a "can_receive_messages()" hook (defaulting to false) defined on the task. Actually, this makes more sense, as the worker should just dispatch and let tasks decide what to do with messages. Are you fine with that change?

Well. I think it would make sense to have messages for both tasks and workers.

Worker messages could be "shut down asap", "shut down task x", "decrease num workers" (we already have that, don't we?). Task messages I imagine are mostly task-specific.

I'm not sure how to implement it. Anyway, I'm fine with any changes, I'm sure if you think it becomes better so will I. :)

Tasks can configure on their own whether incoming scheduler messages are accepted, using the `accepted_messages` property. In a future PR, this can be extended to accept only certain messages.
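The task-level opt-in discussed in this thread can be sketched as follows. The classes and the dispatch function are hypothetical and only mirror the idea: the worker forwards a message only when the target task sets `accepted_messages` to a truthy value, and a queue.Queue stands in for the task's message queue.

```python
import queue


class SketchTask:
    accepted_messages = False  # default: the task does not accept messages

    def __init__(self, task_id):
        self.task_id = task_id
        self.scheduler_messages = queue.Queue()


class ChattyTask(SketchTask):
    accepted_messages = True  # this task opts in


def dispatch(tasks, task_id, content):
    """Deliver content to the task's queue if the task opted in; report success."""
    task = tasks.get(task_id)
    if task is None or not task.accepted_messages:
        return False
    task.scheduler_messages.put(content)
    return True


tasks = {t.task_id: t for t in (SketchTask("quiet"), ChattyTask("chatty"))}
delivered_quiet = dispatch(tasks, "quiet", "terminate")
delivered_chatty = dispatch(tasks, "chatty", "terminate")
```

Keeping the decision on the task means the worker stays a plain dispatcher, which matches the direction agreed on above.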

Tarrasch

Starts to look good! :)

(sorry for the slow review)

Tarrasch · 2018-06-05T19:22:58Z

doc/luigi_patterns.rst

+        ...
+
+        # configure the task to accept all incoming messages
+        accepted_messages = True


"accepts_messages" sounds better no?

Tarrasch · 2018-06-05T19:24:50Z

luigi/task.py

+    @property
+    def accepted_messages(self):
+        """
+        Configures which scheduler messages can be received and returns them. When falsy, this tasks


Maybe "For configuring which scheduler messages can be received."?

Tarrasch · 2018-06-06T19:35:26Z

Are you happy with this now? Is it ready to merge? Is it reasonably tested?

riga · 2018-06-07T13:22:14Z

Yep, I'm happy with the PR. Docs and tests should be fine.

Tarrasch · 2018-06-16T16:58:21Z

Thanks!! This is so cool!

riga added 3 commits May 18, 2018 14:55

Add scheduler message feature.

d7c5995

Add scheduler message interface to visualizer.

417e6e0

Add tests and docs.

6c79e93

Tarrasch reviewed May 18, 2018


riga added 2 commits May 23, 2018 15:38

Conditional display of message button in scheduler.

be7a140

Add scheduler message responses.

849f663

riga added 2 commits May 23, 2018 17:08

Account for new message object in tests.

3a76819

Remove debug line.

08bb0ec

Tarrasch reviewed May 24, 2018


riga added 4 commits June 5, 2018 17:14

Implement review comments.

6d96414

Tasks can configure on their own whether incoming scheduler messages are accepted, using the `accepted_messages` property. In a future PR, this can be extended to accept only certain messages.

Merge branch 'master' into feature/schedulerMessages.

77aa16a

Fix flake8 errors.

95f5254

Add test for scheduler methods.

dd200e8

Tarrasch reviewed Jun 5, 2018


Implement further review comments.

e8949c2

Tarrasch merged commit bda236d into spotify:master Jun 16, 2018

riga mentioned this pull request Jul 23, 2018

Fix Scheduler.add_task to overwrite accepts_messages attribute. #2469

Merged
