Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added worker-id commandline parameter #2655

Merged
merged 2 commits into from Apr 2, 2019
Merged

Added worker-id commandline parameter #2655

merged 2 commits into from Apr 2, 2019

Conversation

TMaYaD
Copy link
Contributor

@TMaYaD TMaYaD commented Feb 12, 2019

Re-opening #2267 with a rebased branch.

Description

This change adds a --worker-id which will be used for worker to identify itself. The user is expected to make sure these are unique.

Motivation and Context

We needed to start and (safely) shutdown workers based on resource availability. In order to use disable-worker rpc call we needed the worker id. But there is no way to know the same. We can't rely on the included host name since we are using ECS to run the workers inside containers
This change allows us to set the worker-id using a known uuid and other details and use the same to disable it later.

Have you tested this? If so, how?

This is already available as a function parameter (which doesn't seem to be used anywhere). We tested it in our own setup without any issues.

luigi/worker.py Outdated
@@ -407,6 +407,8 @@ def check_complete(task, out_queue):
class worker(Config):
# NOTE: `section.config-variable` in the config_path argument is deprecated in favor of `worker.config_variable`

id = Parameter(default=None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameters should have a default of ''. If you truly want default None, consider OptionalParameter

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch @dlstadther, but for this case certainly use '' and just do if on it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this to '' should be good enough. We are already doing _config.id or anyway.

luigi/worker.py Outdated
self._config = worker(**kwargs)

worker_id = worker_id or self._config.id or 'Worker(%s)' % ', '.join(['%s=%s' % (k, v) for k, v in self._worker_info])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you've already modified this line, any objection to going ahead and replacing that ugly % string interpolation with .format()? :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also not do a one liner for this, so it's more readable

Copy link
Contributor

@Tarrasch Tarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally makes sense. Thanks!

luigi/worker.py Outdated
@@ -407,6 +407,8 @@ def check_complete(task, out_queue):
class worker(Config):
# NOTE: `section.config-variable` in the config_path argument is deprecated in favor of `worker.config_variable`

id = Parameter(default=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch @dlstadther, but for this case certainly use '' and just do if on it.

luigi/worker.py Outdated
self._config = worker(**kwargs)

worker_id = worker_id or self._config.id or 'Worker(%s)' % ', '.join(['%s=%s' % (k, v) for k, v in self._worker_info])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also not do a one liner for this, so it's more readable

@@ -625,6 +626,10 @@ def _generate_worker_info(self):
pass
return args

def _generate_worker_id(self, worker_info):
worker_info_str = ', '.join(['{}={}'.format(k, v) for k, v in worker_info])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think format looks any less uglier. f-strings would be a different matter but i don't think they're supported on older python.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We promote format since it's more pythonic. It's implemented without magic unlike %. Luigi only supports 2.7+ (other versions are long deprecated)

@@ -625,6 +626,10 @@ def _generate_worker_info(self):
pass
return args

def _generate_worker_id(self, worker_info):
worker_info_str = ', '.join(['{}={}'.format(k, v) for k, v in worker_info])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We promote format since it's more pythonic. It's implemented without magic unlike %. Luigi only supports 2.7+ (other versions are long deprecated)

@Tarrasch
Copy link
Contributor

Can you fix the merge conflict? Sorry about this, I would do it in the browser but I'm afraid I'll break the line length limit if I do it in the browser.

@TMaYaD
Copy link
Contributor Author

TMaYaD commented Apr 1, 2019

@Tarrasch Done. You can merge it in.

@dlstadther
Copy link
Collaborator

Looks like this is ready! Merging

@dlstadther dlstadther merged commit a655af9 into spotify:master Apr 2, 2019
@TMaYaD TMaYaD deleted the set-worker-id branch April 3, 2019 06:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants