hash of URIRef is not the same across python runs #500
i'm not entirely sure this is a must, but it's definitely a should... i'll implement this with the fully qualified name though: also the current implementation of
@drewp: what's your actual use case for this? discussing this with @gromgull, it turns out that python 3.3+ uses randomized hashing in order to counter DoS attacks (e.g., http://lemire.me/blog/archives/2014/04/23/do-you-realize-that-you-are-using-random-hashing/ ). Hence fixing this from that perspective doesn't seem to make much sense at all. I'm still somewhat for merging #501, as the current implementation doesn't do what it probably was intended to do: currently it hashes Any other thoughts on this?
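To make the randomized-hashing point concrete, here is a minimal sketch that computes a string hash in a fresh interpreter, with and without a fixed `PYTHONHASHSEED`. The helper name `hash_in_fresh_interpreter` is made up for illustration; it assumes Python 3.7+ and uses only the standard library.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(text, seed=None):
    """Return hash(text) as computed by a newly started Python process.

    With seed=None the child uses its default (randomized) string hashing;
    with an integer seed, PYTHONHASHSEED pins the hash function.
    """
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)  # do not inherit a seed from the parent
    if seed is not None:
        env["PYTHONHASHSEED"] = str(seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    # Unseeded runs will, in practice, print two different values.
    print(hash_in_fresh_interpreter("hi"), hash_in_fresh_interpreter("hi"))
    # Seeded runs agree with each other.
    print(hash_in_fresh_interpreter("hi", seed=0),
          hash_in_fresh_interpreter("hi", seed=0))
```

So on Python 3, any `__hash__` built on `str.__hash__` can only be stable across runs when the seed is pinned, whatever rdflib does on top of it.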
make Identifier.__hash__ consistent with str.__hash__ stability over runs, fixes #500
Use case: I was generating C code for an arduino, and to get a C variable name for a URIRef, I used Probably I should use hashlib.md5(dev).hexdigest() if I want a stable hash, or 'pwm_'+dev.encode('base64').strip('=\n') if I want to try something that's properly unique for each URI. Or re.sub(r'[^0-9a-zA-Z]', '_', dev) for something that's probably unique and also readable.
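The alternatives mentioned above could look roughly like this in Python 3. This is only a sketch: the function names and the `pwm_` default prefix are illustrative, and the URI is treated as a plain `str`. One deliberate swap: `str.encode('base64')` only exists in Python 2, and base64 output may contain `+` and `/`, which are not valid in C identifiers, so the reversible variant here uses base32 instead.

```python
import base64
import hashlib
import re


def c_name_md5(uri, prefix="pwm_"):
    """Stable across runs; collisions are very unlikely, but not readable."""
    return prefix + hashlib.md5(uri.encode("utf-8")).hexdigest()


def c_name_base32(uri, prefix="pwm_"):
    """Unique per URI (the encoding is reversible).

    Base32 emits only A-Z and 2-7 plus '=' padding, so stripping the
    padding leaves characters that are all legal in a C identifier.
    """
    encoded = base64.b32encode(uri.encode("utf-8")).decode("ascii")
    return prefix + encoded.rstrip("=")


def c_name_readable(uri, prefix="pwm_"):
    """Readable, but distinct URIs can map to the same name."""
    return prefix + re.sub(r"[^0-9a-zA-Z]", "_", uri)


if __name__ == "__main__":
    dev = "http://example.com/dev#1"  # made-up URI for illustration
    print(c_name_md5(dev))
    print(c_name_base32(dev))
    print(c_name_readable(dev))
```

None of these depend on `hash()`, so they are unaffected by hash randomization.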
Another use case is creating two rdflib.URIRef("string") from two strings with identical content. These two URIRefs are not equal due to different hash values, as far as I can tell.
python -c 'import rdflib; print hash(rdflib.URIRef("hi"))'
13312079974552043
python -c 'import rdflib; print hash(rdflib.URIRef("hi"))'
13312079974020587
It would be nice if this was stable. The fix is in term.py, where we should perhaps use hash(type(self).__name__) instead of hash(type(self)).
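A rough sketch of the proposed change, using stand-in classes rather than rdflib's actual term.py (the class bodies below are assumptions made for illustration). By default `hash(type(self))` is derived from the class object's identity, which can differ from one process to the next, whereas `hash(type(self).__name__)` is an ordinary string hash and so exactly as stable as `str.__hash__`.

```python
class Identifier(str):
    """Hypothetical stand-in for rdflib's identifier base class."""

    __slots__ = ()

    def __eq__(self, other):
        # Terms are equal only if both the type and the text match.
        return type(self) is type(other) and str.__eq__(self, other)

    def __ne__(self, other):
        # str defines its own __ne__, so override it to stay consistent.
        return not self.__eq__(other)

    def __hash__(self):
        # Mix in the class *name* (a string hash) instead of the class
        # object, whose default hash is based on its identity.
        return hash(type(self).__name__) ^ str.__hash__(self)


class URIRef(Identifier):
    __slots__ = ()


class BNode(Identifier):
    __slots__ = ()


if __name__ == "__main__":
    a, b = URIRef("hi"), URIRef("hi")
    print(a == b, hash(a) == hash(b))  # same type, same text
    print(a == BNode("hi"))            # same text, different type
```

Within a single process, two terms built from identical strings hash and compare equal in either variant. Across processes this version is stable only where `str.__hash__` itself is, which on Python 3 means running with a fixed `PYTHONHASHSEED`.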