bpo-43475: Fix worst case collision behavior for NaN instances #25493

rhettinger · 2021-04-21T03:13:41Z

https://bugs.python.org/issue43475

bedevere-bot · 2021-04-21T04:08:17Z

🤖 New build scheduled with the buildbot fleet by @rhettinger for commit 056a4f7 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

bedevere-bot · 2021-04-21T15:16:56Z

🤖 New build scheduled with the buildbot fleet by @rhettinger for commit 9f3f9e9 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

realead · 2021-06-10T07:35:05Z

Is my understanding right that this PR would break the following code?

import math

class A:
    def __init__(self, a):
        self.a = a

    def __hash__(self):
        # Relies on hash(nan) being the same for every NaN object.
        return hash(self.a)

    def __eq__(self, other):
        # Treat all NaNs as equal to each other.
        if math.isnan(self.a) and math.isnan(other.a):
            return True
        return self.a == other.a

    def __repr__(self):
        return str(self.a)

set([A(float("nan")), A(float("nan"))])  # result before this PR: {nan}

That is, code where somebody wraps a float and changes __eq__ so that all float NaNs are equivalent?

With this PR, the chances are high that the result will be {nan, nan}, as the hashes of the two objects will be different.
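To make the concern concrete, here is a small check of how two distinct NaN objects hash. It is only a sketch and assumes CPython, where this change shipped in 3.10; the names `a`, `b` and `same_hash` are illustrative.

```python
import sys

# Two separately created NaN objects (distinct identities).
a = float("nan")
b = float("nan")

# Before this change every NaN hashed to the same value;
# afterwards the hash depends on the identity of the object.
same_hash = hash(a) == hash(b)
print(sys.version_info[:2], "hash(a) == hash(b):", same_hash)

# A single NaN object still hashes consistently and is found by identity.
print(a in {a})
```

A wrapper whose `__hash__` simply forwards to `hash(self.a)` inherits whichever behaviour the running interpreter has.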

Until now the rule was clear: don't put NaNs into a set/dict, because the default "="-relation for floats isn't an equivalence relation. People worked around this by redefining the "="-relation, and did not do the same for the hash function, because until now "a, b are NaNs => hash(a) = hash(b)" was guaranteed.
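One possible way to keep such a wrapper working regardless of how NaN hashes is to redefine the hash together with the equality. The following is a minimal sketch, not something proposed in this PR; the class name `NanEqualFloat` and the choice of 0 as the NaN hash are assumptions made for illustration.

```python
import math

class NanEqualFloat:
    """Hypothetical wrapper that treats every NaN as equal to every other NaN."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Do not rely on hash(nan): map every NaN to one fixed value so that
        # objects comparing equal also hash equal.
        if math.isnan(self.value):
            return 0  # arbitrary constant, an assumption of this sketch
        return hash(self.value)

    def __eq__(self, other):
        if not isinstance(other, NanEqualFloat):
            return NotImplemented
        if math.isnan(self.value) and math.isnan(other.value):
            return True
        return self.value == other.value

    def __repr__(self):
        return repr(self.value)

unique = set([NanEqualFloat(float("nan")), NanEqualFloat(float("nan"))])
print(len(unique))  # 1
```

Because `__hash__` no longer depends on the hash of the underlying NaN object, the two wrappers land in the same bucket and `__eq__` collapses them into one element.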

I think the intuitive behavior for set([float("nan"), float("nan")]) is {nan} and not {nan, nan}. Given how Py_EQ is defined for floats, this is not possible. Maybe there is a need for a new Py_EQ_FOR_HASH comparator, which would be used in hashset/hashdict and be more or less the same as Py_EQ but would yield true for nan==nan.
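The semantics of such a comparator can be sketched at the Python level. This is purely illustrative: `eq_for_hash` and `dedupe` are hypothetical names, no such comparator exists in CPython, and the list-based de-duplication below sidesteps hashing entirely rather than showing how a real hash table would use it.

```python
import math

def eq_for_hash(a, b):
    """Hypothetical comparator: like ==, but any two float NaNs compare equal."""
    if a is b:
        return True
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    return a == b

def dedupe(values):
    """Quadratic-time de-duplication that uses eq_for_hash instead of ==."""
    unique = []
    for value in values:
        if not any(eq_for_hash(value, seen) for seen in unique):
            unique.append(value)
    return unique

result = dedupe([float("nan"), float("nan"), 1.0, 1.0])
print(len(result))  # 2
```

A hash-based container using this comparator would additionally need all NaNs to share one hash value, which is the property this PR removes.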

bpo-43475: Fix worst case collision behavior for NaN instances

b9cde10

rhettinger added the performance Performance or resource usage label Apr 21, 2021

rhettinger requested review from mdickinson and tim-one April 21, 2021 03:13

rhettinger requested a review from tiran as a code owner April 21, 2021 03:13

the-knights-who-say-ni added the CLA signed label Apr 21, 2021

bedevere-bot added the awaiting core review label Apr 21, 2021

Fix decimal and docs as well

20b73c3

rhettinger removed the request for review from tiran April 21, 2021 03:38

Add casts

056a4f7

rhettinger added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Apr 21, 2021

bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Apr 21, 2021

rhettinger added 2 commits April 21, 2021 07:50

Use existing pointer hash logic

dc73c3c

Improve comment wording

9f3f9e9

rhettinger added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Apr 21, 2021

bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Apr 21, 2021

Fix typo in comment

6ac6975

rhettinger merged commit a07da09 into python:master Apr 22, 2021

bedevere-bot removed the awaiting core review label Apr 22, 2021

mdickinson mentioned this pull request Jun 14, 2021

bpo-43475: Add what's new entry for NaN hash changes #26725

Merged

stuartarchibald mentioned this pull request Nov 8, 2021

Exp/3.10+new with+np121 numba/numba#7544

Closed

gvanrossum mentioned this pull request Dec 20, 2023

Add Py_HashDouble() function capi-workgroup/decisions#2

Closed

4 tasks
