-
-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check exact equality for worker state #3483
Conversation
cc @StephanErb |
Thanks for picking this up @bnaul This looks good to me. I'll merge after either a +1 from @StephanErb or a couple days. |
My guess is that all of the test failures are intermittent. When we changed to spawn by default it unearthed a bunch of things. Squashing down a couple now. |
Well, this one is interesting @gen_cluster(timeout=1000, client=True)
def test_recompute_released_key(c, s, a, b):
x = c.submit(inc, 100)
result1 = yield x
xkey = x.key
del x
import gc
gc.collect()
yield gen.moment
> assert c.refcount[xkey] == 0
E assert 1 == 0
E -1
E +0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing the follow up. As you have requested my review I have looked into the code a bit deeper. Please see my questions below.
distributed/scheduler.py
Outdated
return type(self) == type(other) and (self.name, self.host) == ( | ||
other.name, | ||
other.host, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if using name
and host
is correct in all cases:
name
is optional and could be Nonehost
is not necessary unique as multiple workers could be running on the same machine
I am lacking necessary Dask context, so I am wondering why you have not picked address
instead? It is used for indexing the WorkerState
and the documentation says it is [t]his worker's unique key.
.
So I would assume, this here should work as expected:
def __hash__(self):
return hash(self.address)
def __eq__(self, other):
return type(self) == type(other) and self.address == other.address
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just pulling items from the identity()
dictionary, no personal reason to prefer one to the other. @TomAugspurger @mrocklin would you agree address
makes more sense here?
Yes, address is probably the best thing to do in both cases.
…On Sun, Feb 16, 2020 at 1:23 PM Brett Naul ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In distributed/scheduler.py
<#3483 (comment)>:
> + return type(self) == type(other) and (self.name, self.host) == (
+ other.name,
+ other.host,
+ )
I was just pulling items from the identity() function, no personal reason
to prefer one to the other. @TomAugspurger
<https://github.com/TomAugspurger> @mrocklin <https://github.com/mrocklin>
would you agree address makes more sense here?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3483?email_source=notifications&email_token=AACKZTHHWYYG6RFFP5E66PLRDGVFTA5CNFSM4KV3HXIKYY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCVWHPWY#discussion_r379934436>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AACKZTAXOVLPZTZ4VHWKZBLRDGVFTANCNFSM4KV3HXIA>
.
|
4724d85
to
3cf2bab
Compare
thanks @StephanErb @mrocklin , updated w/ your suggestions |
Per #3321 (comment), I agree it makes more sense to check for exact equality instead of relying on the hash (just in case 🙂).
I could see an argument for wrapping this tuple in a property so it can be reused, do we think that's preferable? I'd originally wanted to use
identity
forhash
andeq
but that has a bit too much going on IMO.