Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Commit

Permalink
Improve get/set performance in HashSet<T> (#40106)
Browse files Browse the repository at this point in the history
* Improve get/set performance in HashSet<T>

Similar optimization in Dictionary<,> has been ported to HashSet<>.

`_comparer` has been made nullable, and default by `null`. This
makes us possible to check if it's null for faster branches, instead
of comparing it to `EqualityComparer<T>.Default`. Since `_comparer`
is `null` by default, the constructors and the `OnDeserialization`
method have been modified accordingly to save `null` instead of the
`EqualityComparer<T>.Default` if it's been passed/saved. Also, the
places that use `_comparer` directly much do a null check and switch
to `EqualityComparer<T>.Default` instead.

The major optimization is under `Contains` and `AddIfNotPresent`
methods. Basically the code was:

``
// Code uses _comparer
``

Now it becomes:

```
var comparer = _comparer;
if (comparer == null)
{
  if (default(T) != null)
  {
    // Code uses EqualityComparer<T>.Default for de-virtualizing.
  }
  else
  {
    var defaultComparer = EqualityComparer<T>.Default;
    // Code uses cached defaultComparer
  }
}
else
{
  // Code uses comparer
}
```

I've intentionally kept the origin implementation and comments as
much as possible, and the optimized code is not exactly the same as
in `Dictionary<T>`. Further micro-optimizations might be possible
but I think it's cleaner to keep it that way in the first step.

The `Remove` method hasn't been optimized since it's complicated,
and not been done in `Dictionary` either. Some other methods like
`AddOrGetLocation` or `InternalIndexOf` are not optimized since
they're only used in less-common scenarios, and not in the scope
of the current issue.

The benchmark shows there're obvious improvements in get and set
operations for all cases, especialy if the item is a value type.
I was actually expecting to see some pull back for "slower" case,
like if we use a custom comparer, but it turns out it's getting
faster too, since we have cached the comparer instead of using
`_comparer` directly in the loop.

The benchmark also shows the optimized `HashSet` has close or
better performance comparing to `Dictionary` in almost all cases,
except for `HashSet<string>` with default comparer. The default
comparer in `Dictionary<string>` is a "non-random" implementation
that performs better. Unfortunately the comparer is internal to
the CoreLib.

The optimized `HashSet<string>` is still faster since it's using
cached comparer of type `EqualityComparer<string>`, which uses
virtual method dispatching instead of interface dispatching. It's
just not as faster as `Dictionary<string, T>` yet.

Fix #36448

* Use Equals() to check _comparer in deserialization

After deserialization the `_comparer` will be set to a different
instance of default comparer, so we need to use `Equals` to compare
`_comparer` with `EqualityComparer<T>.Default` instead of reference
comparison by `==`.

* Remove unnecessary "default!" found in code review

* Ignore resetting "_comparer" in OnDeserialization

Previously we tried to reset `_comparer` in `OnDeserialization` if
it's `EqualityComparer<T>.Default`. But after discussion in #40150
we decide to follow the same approach as in `Dictionary<,>` to make
it simple.

* Remove extra line according to code review.

* Use _comparer?.GetHashCode(key) ?? key.GetHashCode() to leverage intrinsic.
Same to the Equals calls.

* Add "AggressiveInlining" to InternalGetHashCode/Equals methods

Make methods like "Remove" faster.

* Apply similar optimization to Remove and TryGetValue operations.

* Remove trailing whitespaces.

* Remove trailing whitespaces.

* Manually inline InternalEquals() method
  • Loading branch information
JeffreyZhao authored and danmoseley committed Nov 6, 2019
1 parent db4a737 commit f9564cd
Showing 1 changed file with 309 additions and 75 deletions.
Loading

0 comments on commit f9564cd

Please sign in to comment.