Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FrozenDictionary specialization for integers / enums #111886

Merged
merged 3 commits into from
Jan 28, 2025

Conversation

stephentoub
Copy link
Member

This adds a specialized dictionary implementation for when TKey is an integer (some integer types) or an enum backed by an integer. It employs an array that can be indexed into directly, handling both the cases where the values are contiguous from zero and where they're either non-contiguous or not from zero. A density threshold is used to decide when to fallback to another implementation rather than expending more memory on unused array slots.

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Collections.Frozen;
using System.Net;

BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);

[MemoryDiagnoser(false)]
public class DayOfWeekTests : Tests<DayOfWeek> { }

[MemoryDiagnoser(false)]
public class HttpStatusCodeTests : Tests<HttpStatusCode> { }

[MemoryDiagnoser(false)]
public class ConsoleKeyTests : Tests<ConsoleKey> { }

public abstract class Tests<T> where T : struct, Enum
{
    private static readonly T[] s_values = Enum.GetValues<T>().Distinct().ToArray();
    private static readonly T s_lookup = s_values[s_values.Length / 2];

    private static readonly FrozenDictionary<T, string> s_hsc = s_values.ToFrozenDictionary(d => d, d => d.ToString());

    [Benchmark]
    public string Get() => s_hsc[s_lookup];

    [Benchmark]
    public FrozenDictionary<T, string> Create() => s_values.ToFrozenDictionary(d => d, d => d.ToString());
}
Type Method Toolchain Mean Ratio RatioSD Allocated
ConsoleKeyTests Get \main\corerun.exe 2.5985 ns 1.00 0.09 -
ConsoleKeyTests Get \pr\corerun.exe 1.9684 ns 0.76 0.05 -
DayOfWeekTests Get \main\corerun.exe 3.6290 ns 1.00 0.02 -
DayOfWeekTests Get \pr\corerun.exe 0.7337 ns 0.20 0.00 -
HttpStatusCodeTests Get \main\corerun.exe 2.8100 ns 1.00 0.02 -
HttpStatusCodeTests Get \pr\corerun.exe 1.9481 ns 0.69 0.02 -
ConsoleKeyTests Create \main\corerun.exe 11,878.1837 ns 1.02 0.20 18084 B
ConsoleKeyTests Create \pr\corerun.exe 9,649.4596 ns 0.83 0.11 -
DayOfWeekTests Create \main\corerun.exe 309.8565 ns 1.00 0.03 728 B
DayOfWeekTests Create \pr\corerun.exe 301.6182 ns 0.97 0.02 672 B
HttpStatusCodeTests Create \main\corerun.exe 3,819.6599 ns 1.00 0.02 8033 B
HttpStatusCodeTests Create \pr\corerun.exe 3,551.3032 ns 0.93 0.01 11042 B

(something is wrong with the allocation number for ConsoleKeyTests... it's not 0, but BDN keeps reporting that)

This adds a specialized dictionary implementation for when TKey is an integer (some integer types) or an enum backed by an integer. It employs an array that can be indexed into directly, handling both the cases where the values are contiguous from zero and where they're either non-contiguous or not from zero. A density threshold is used to decide when to fallback to another implementation rather than expending more memory on unused array slots.
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you @stephentoub !

This reminded me of a conversation I recently had on Twitter where another interesting scenario was mentioned: single item dictionaries / hash sets: https://x.com/STeplyakov/status/1872347499119423630

@stephentoub stephentoub merged commit 4b98d32 into dotnet:main Jan 28, 2025
85 checks passed
@stephentoub stephentoub deleted the densedict branch January 28, 2025 20:33
grendello added a commit to grendello/runtime that referenced this pull request Jan 28, 2025
* main: (31 commits)
  Fix linux-x86 build (dotnet#111861)
  Add FrozenDictionary specialization for integers / enums (dotnet#111886)
  [SRM] Refactor reading from streams. (dotnet#111323)
  Sign the DAC and DBI during the build process instead of in separate steps (dotnet#111416)
  Removing Entry2MethodDesc as it is unnecessary (dotnet#111756)
  Cross Product for Vector2 and Vector4 (dotnet#111265)
  Handle unicode in absolute URI path for combine. (dotnet#111710)
  Drop RequiresProcessIsolation on mcc tests (dotnet#111887)
  [main] Update dependencies from dotnet/roslyn (dotnet#111691)
  new trimmer feature System.TimeZoneInfo.Invariant (dotnet#111215)
  [browser] reduce msbuild memory footprint (dotnet#111751)
  Add debugging checks for stack overflow tests failure (dotnet#111867)
  Localized file check-in by OneLocBuild Task: Build definition ID 679: Build ID 2629821 (dotnet#111884)
  Bump main to preview2 (dotnet#111882)
  Avoid generic virtual dispatch for frozen collections alternate lookup (dotnet#108732)
  Bump main versioning to preview1 (dotnet#111880)
  Switch OneLoc to main (dotnet#111872)
  Improve docs on building ILVerify (dotnet#111851)
  Update Debian version to 13 (dotnet#111768)
  win32: add fallback to environment vars for system folder (dotnet#109673)
  ...

if (typeof(TKey) == typeof(int) || (typeof(TKey).IsEnum && typeof(TKey).GetEnumUnderlyingType() == typeof(int)))
return TryCreate<TKey, int, TValue>(source, out result);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about type char?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have concrete examples where TKey=char is helpful?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, a frozen dictionary with keys from 'A' to 'Z'? I guess such case is fairly common.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess such case is fairly common.

Examples?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dictionary of something like:
'A' -> "Alpha"
'B' -> "Bravo"
'C' -> "Charlie"
...

Why not support char considering some rare (I thought) types such as sbyte have been included.


// Sort the values so that we can more easily check for contiguity but also so that
// the keys/values returned from various properties/enumeration are in a predictable order.
Array.Sort(keys, values);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorting is expensive. Is it necessary?

  1. min == 0 && length == count is enough to get the result of isFull.
  2. It seems no need to sort before creating WithOptionalValues. And similar implementation can be used when creating WithFullValues to skip sorting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary?

As noted in the comment, I wanted the keys/values in order, as we do for e.g. SmallValueTypeComparableFrozenDictionary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SmallValueTypeComparableFrozenDictionary must keep keys in order because its searching relies on this. DenseIntegralFrozenDictionary doesn't have such limit, so I don't think the order of the keys makes sense. And even if we need to order the keys of DenseIntegralFrozenDictionary , there's better way to implement that without sorting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And even if we need to order the keys of DenseIntegralFrozenDictionary , there's better way to implement that without sorting.

?

Copy link
Contributor

@skyoxZ skyoxZ Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For WithOptionalValues, after setting the elements of optionalValues:

int keyIndex = 0;
for (i = 0; i < optionalValues.Length; i++)
{
    if (optionalValues[i].HasValue)
    {
        keys[keyIndex] = i + min;
        values[keyIndex] = optionalValues[i].Value;
        keyIndex++;
    }
}

And for WithFullValues, use similar way but Optional is not needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that's not the point. Users never expect FrozenDictionary.Keys are sorted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants