Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dart should provide a way of combining hashes #11617

Closed
blois opened this issue Jun 28, 2013 · 36 comments
Closed

Dart should provide a way of combining hashes #11617

blois opened this issue Jun 28, 2013 · 36 comments
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. core-2 type-enhancement A request for a change that isn't a bug

Comments

@blois
Copy link
Contributor

blois commented Jun 28, 2013

Basically if one overloads operator == they should also overload hashCode and generate a 'good' hash, but Dart doesn't offer any utilities to do so. This means that many users will generate hashes with poor distributions.

The JS string class internally uses http://en.wikipedia.org/wiki/Jenkins_hash_function, it would be great if there were a public API for this.

@lrhn
Copy link
Member

lrhn commented Sep 27, 2013

Removed Type-Defect label.
Added Type-Enhancement label.

@lrhn
Copy link
Member

lrhn commented Sep 27, 2013

How about something like:

class JenkinsHasher {
  int _hash = 0;

  void add(int value) {
    _hash += value;
    _hash += _hash << 10;
    _hash ^= _hash >> 6;
  }
  int get hash {
    int hash = _hash;
    hash += (hash << 3);
    hash ^= (hash >> 11);
    hash += (hash << 15);
    return hash;
  }
}

Then you can use it as:
  var hash = (new JenkinsHash()..add(foo.hashCode)..add(bar.hashCode)).hash;

?

@blois
Copy link
Contributor Author

blois commented Sep 27, 2013

That looks decent.

We currently have something close to this at:
https://code.google.com/p/dart/source/browse/branches/bleeding_edge/dart/sdk/lib/html/html_common/jenkins_smi_hash.dart

My understanding is that the masking in the above is needed for performance.


cc @rakudrama.

@jmesserly
Copy link

for the curious, the jenkins_smi_hash moved and is now at sdk/lib/math. Still private though.

@DartBot
Copy link

DartBot commented Apr 10, 2014

This comment was originally written by @chalin


Any progress on this? I agree with blois's original concern, and the comment in the source file

class _JenkinsSmiHash {
// TODO(11617): This class should be optimized and standardized elsewhere.

@DartBot
Copy link

DartBot commented Apr 10, 2014

This comment was originally written by [email protected]


FYI it's now part of Quiver:

https://github.com/google/quiver-dart/blob/master/lib/src/core/hash.dart

@DartBot
Copy link

DartBot commented Apr 10, 2014

This comment was originally written by @chalin


Nice. Then maybe this issue should be closed?

@blois
Copy link
Contributor Author

blois commented Apr 10, 2014

We have to have this code in dart: libraries for other uses, requiring that a package version be used just adds additional app bloat. Would prefer to keep this issue open, though the presence of Quiver probably lessens the criticality of it.

@jmesserly
Copy link

+1, we can't (currently) use quiver even in many of our packages (the ones in SVN)

@blois blois added Type-Enhancement area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. labels Apr 14, 2014
@stereotype441
Copy link
Member

@floitschG Thanks--I failed to notice the duplicate issue when I filed #25553.

Given that this issue is "priority-unassigned" and has been open for 2.5 years, what is the likelihood of it getting addressed soon? As I noted in #25553, an upcoming IntelliJ feature would benefit from it, so it would be nice if we could get to it soon.

It seems like it would be a pretty simple matter of renaming a class and adding some documentation. Would it be helpful if I submitted a CL for review?

@floitschG
Copy link
Contributor

It's not the actual work. Adding new classes to core-libraries have a very high bar to pass.
In general, the tendency is to remove things, not to add features. Especially, when the functionality fits nicely into a package (as is the case here).

We will discuss it, though.

@stereotype441
Copy link
Member

Ok, thanks. In case the context is useful to the discussion, the IntelliJ feature @jwren is working on is automatic generation of a hashcode getter for a class (if the user requests it). It would be nice to be able to use JenkinsSmiHash to create the hashcode (since this would give a far higher quality hashcode than a naive approach like XORing together the hashes of the components). But if JenkinsSmiHash is in a package, the feature would be a lot more complex, since IntelliJ would have to check for the presence of the package in pubspec.yaml and add it if it's not already present. Also the feature wouldn't work properly in systems that don't make use of pubspec.yaml.

@bwilkerson
Copy link
Member

And if the package containing the function needed to be added, it would have to ensure that adding it wouldn't introduce any version constraint conflicts. And then it would need to run pub in order to add the package to the .packages file. All of which would make this a very expensive operation.

If it's in a package I expect that the feature just won't use it and we'll have lower quality code being generated.

@a14n
Copy link
Contributor

a14n commented Jan 22, 2016

BTW quiver has this kind of functions for a while (see https://github.com/google/quiver-dart/blob/master/lib/src/core/hash.dart - Commits on Nov 14, 2013). Is it worth adding quiver as dependency and using it ?

@jwren
Copy link
Member

jwren commented Jan 22, 2016

+1 to what @bwilkerson and @stereotype441 said: adding a dependency is a heavier weight operation than I'd like for the Dart IntelliJ Live Template support. Adding an import isn't sufficient as pub run has the be executed.

@lrhn
Copy link
Member

lrhn commented Jan 22, 2016

If the code already exists in the platform libraries, just being private, then exposing it might be worth considering.
We don't necessarily want to add things, but exposing something that's already there is less of a concern to me - even if it means adding more public names.

What I can't decide is whether it belongs more in dart:core or dart:collection. :)

That said, I'm definitely more positive towards adding things than average :)

@stereotype441
Copy link
Member

@floitschG Have you had a chance to discuss this?

@floitschG
Copy link
Contributor

Not yet. Still on my TODO list, though.

@jwren
Copy link
Member

jwren commented Feb 2, 2016

For the time being, IntelliJ now generates with XORs:

class Example {
  var a, b, c, d, e;

  @override
  bool operator ==(Object other) {
    if (identical(this, other)) {
      return true;
    }
    return other is Example &&
        this.a == other.a &&
        this.b == other.b &&
        this.c == other.c &&
        this.d == other.d &&
        this.e == other.e;
  }

  @override
  int get hashCode {
    return a.hashCode ^ b.hashCode ^ c.hashCode ^ d.hashCode ^ e.hashCode;
  }


}

@natebosch
Copy link
Member

Any update on this? Do we think this is something we can expose from either the SDK or package:collection?

@lrhn
Copy link
Member

lrhn commented Dec 15, 2017

A canonical way to combine hashes is planned for Dart 2.0.

@natebosch
Copy link
Member

@lrhn - are you open to contributions for this?

If I create a patch with the interface described in #11617 (comment) do we see a path to getting it landed?

@matanlurey
Copy link
Contributor

Personally I think something like this would be more usable for folks:

class Object {
  static int hash(List<Object> any) {
    // ...
  }
}

... but even just JenkinsHasher is better than copy-pasting code everywhere.

@natebosch
Copy link
Member

We could also have jenkinsHasher.addAll(Iterable<Object>).

Similarly I'd be in favor of

class Object {
  static int hash(Object a, Object b, [Object c, ...]);
  static int hashAll(List<Object> objects);
}

@lrhn
Copy link
Member

lrhn commented Sep 6, 2018

I'm already working on it (so let me upload the CL instead of letting it sit in my repository while I figure out why some tools won't allow it), but I agree with the design here.

I'd love to add a proper generalized Hasher interface at some point, but for now we'll just use the same hasher everywhere. It's not like you actually need more than one, if it's good enough.

@dnfield
Copy link
Contributor

dnfield commented Mar 12, 2019

@lrhn any updates on this?

@rakudrama
Copy link
Member

Looking at the difficulty in writing a hashCode for a 'value'-Map implementation where == means the same keys and values independent of insertion order, I would suggest that corelib also supply a way of generating order-independent hashes (suitable for Sets and Maps)

class Object {
  static int hash(Object a, Object b, [Object c, ...]);
  static int hashAll(List<Object> objects);
  static int hashAllUnordered(Iterable<Object> objects);
}

@natebosch
Copy link
Member

I would suggest that corelib also supply a way of generating order-independent hashes (suitable for Sets and Maps)

IMO that would be a better fit in package:collection.

@rakudrama
Copy link
Member

@natebosch While hashing of collections is a natural fit for package:collection, I think unordered hashing has other applications (e.g. geometric invariance), and there is a discoverability tax to having the hashing methods in several places.

I did an investigation of what would be required for a general Object.hash to be as efficient as the hand-rolled hashCode methods and the methods did need to be known to the compiler (i.e. treated specially) to get there. The main problem with all of these utility methods is that they call o.hashCode with static type of o being Object, which is the most complex dispatch. There are patterns of specialization that would hit a sweet spot in between slow dispatch and excessive inlining, but the exact pattern of specialization depends on the compiler (dart2js would be very different to AOT).

@jmesserly
Copy link

+1 Stephen's point. In addition, package:collection can't be used from our SDK impls, which means we may have duplicate code in the SDK. Lastly, there are proposed Dart language features that may require the compilers to automatically implement hashing/equality for value types (e.g. dart-lang/language#125).

@techfan101
Copy link

I am new to Dart and, based on this thread, wondering if giving it a chance is a mistake.

Is something this simple (and necessary) truly still an open issue after almost seven years of discussion?

Please tell me I'm wrong. I am hoping an enhancement has already shipped and that someone simply forgot to close this issue. If so, please post the new technique / method.

If the issue truly is still open, please consider promoting its priority. Thank you.

Given the ease of implementation and testing, this sure seems like low-hanging fruit.

@dnfield
Copy link
Contributor

dnfield commented Mar 15, 2020

https://dart-review.googlesource.com/c/sdk/+/73360 Is the CL, unfortunately it's gotten held up by some other langauge design considerations.

@techfan101
Copy link

@dnfield thank you for the pointer. Its a shame something this easy is held up. Though, in this particular case, it benefited me. Working around its absence forced me to recognize a better design in an unrelated part of my project :)

@Erhannis
Copy link

Erhannis commented Nov 9, 2021

To recap, looks like now there's a Object.hash(...) function, and a few similar. Yay!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. core-2 type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

No branches or pull requests