[Discuss] Avoid IEnumeration allocation (List+Dictionary) #4468
Allocations for https://gist.github.com/benaadams/d05000ae4b4935aa3e3b39ce47713a6d

var list = new List<int>();
var ilist = (IList<int>)list;
for (var i = 0; i < 100; i++)
{
    list.Add(i);
}
for (var i = 0; i < 1000000; i++)
{
    foreach (var value in ilist)
    {
        ;
    }
}

Before: 1M objects at 40 MBytes
After: 5 objects at 1,224 bytes
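A rough way to observe the same effect without a profiler is sketched below. It is an illustration only, not the measurement used for the numbers above: `AllocationProbe` and `Measure` are hypothetical names, and it assumes a runtime that provides `GC.GetAllocatedBytesForCurrentThread()` (.NET Core 3.0+ or .NET Framework 4.8). No expected figures are given because they depend on the runtime; newer JITs may remove some of these allocations on their own.

```csharp
using System;
using System.Collections.Generic;

static class AllocationProbe
{
    // Managed bytes allocated on the current thread while `action` runs.
    public static long Measure(Action action)
    {
        action(); // warm-up, so one-time JIT and type-initialisation costs are not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        var list = new List<int>();
        for (var i = 0; i < 100; i++) list.Add(i);
        IList<int> ilist = list;

        // foreach over List<int> binds to the public struct enumerator.
        long viaStruct = Measure(() =>
        {
            for (var i = 0; i < 1000; i++)
                foreach (var value in list) { }
        });

        // foreach over IList<int> goes through IEnumerable<int>.GetEnumerator().
        long viaInterface = Measure(() =>
        {
            for (var i = 0; i < 1000; i++)
                foreach (var value in ilist) { }
        });

        Console.WriteLine($"List<int>  (struct enumerator): {viaStruct} bytes");
        Console.WriteLine($"IList<int> (interface):         {viaInterface} bytes");
    }
}
```

The delegates are created before `Measure` takes its first reading, so their own allocation is not included in either figure.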
    _objects = new T[maxPooled];
}

/// <summary>Takes an object from the pool. If the pool is empty, returns a new object.</summary>
How is the new object created?
It's just new T(); I'm investigating something more generic with a lambda... but so many angle brackets...
Look at line 68; obj ?? new T();
Yeah, missed the ?? new T(). Costly due to the use of Activator.CreateInstance.
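To make the two options in this thread concrete, here is a minimal sketch contrasting a pool that falls back to `new T()` with one that takes a factory lambda. Both classes (`NewConstraintPool<T>`, `FactoryPool<T>`) are hypothetical single-slot pools written for illustration and are not the PR's `ObjectPool<T>`. With a `new()` constraint, the C# compiler emits a call to `Activator.CreateInstance<T>()` for `new T()`, which is the cost being discussed; a non-capturing lambda lets the caller use an ordinary constructor call instead.

```csharp
using System;
using System.Text;

// Hypothetical single-slot pool that creates objects with `new T()`.
sealed class NewConstraintPool<T> where T : class, new()
{
    private T _cached;

    public T Rent()
    {
        T obj = _cached;
        _cached = null;
        return obj ?? new T(); // compiled as Activator.CreateInstance<T>()
    }

    public void Return(T obj) => _cached = obj;
}

// Hypothetical single-slot pool that asks the caller how to create objects.
sealed class FactoryPool<T> where T : class
{
    private T _cached;

    public T Rent(Func<T> factory)
    {
        T obj = _cached;
        _cached = null;
        return obj ?? factory(); // direct constructor call inside the lambda
    }

    public void Return(T obj) => _cached = obj;
}

static class PoolFactoryDemo
{
    public static void Main()
    {
        var viaNew = new NewConstraintPool<StringBuilder>();
        StringBuilder a = viaNew.Rent();          // pool empty: created via new T()
        viaNew.Return(a);

        var viaFactory = new FactoryPool<StringBuilder>();
        StringBuilder b = viaFactory.Rent(() => new StringBuilder()); // pool empty: created via the lambda
        viaFactory.Return(b);

        // A returned object is handed out again instead of creating a new one.
        Console.WriteLine(ReferenceEquals(a, viaNew.Rent()));
        Console.WriteLine(ReferenceEquals(b, viaFactory.Rent(() => new StringBuilder())));
    }
}
```

The lambda does not capture anything, so the compiler caches the delegate and renting does not allocate a new one per call.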
As the potential cost of
BenchmarkDotNet=v0.9.4.0
OS=Microsoft Windows NT 6.1.7601 Service Pack 1
Processor=Intel(R) Core(TM) i7-4800MQ CPU @ 2.70GHz, ProcessorCount=8
Frequency=2630751 ticks, Resolution=380.1196 ns, Timer=TSC
HostCLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
JitModules=clrjit-v4.6.100.0
Type=Program Mode=Throughput Runtime=Clr LaunchCount=1
Note: Regular is
So there is a cost, but it's roughly 80 nanoseconds (I'm not sure whether that's a problem or not?)
@mattwarren and taking a parameter
@benaadams Not sure I'm following, could you update the gist (or leave a comment on it) with the code you want to test and I'll run it for you?
Here are the updated results:
BenchmarkDotNet=v0.9.4.0
OS=Microsoft Windows NT 6.1.7601 Service Pack 1
Processor=Intel(R) Core(TM) i7-4800MQ CPU @ 2.70GHz, ProcessorCount=8
Frequency=2630751 ticks, Resolution=380.1196 ns, Timer=TSC
HostCLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
JitModules=clrjit-v4.6.100.0
Type=Program Mode=Throughput Runtime=Clr
LaunchCount=1
Changed the pool to take a
Measuring perf now
return enumerator.Initalize(this);
var enumerator = ObjectPool<PooledIEnumerator<T, Enumerator, List<T>>>.Shared.Rent(
    () => new PooledIEnumerator<T, Enumerator, List<T>>());
return enumerator.Initalize((list) => new Enumerator(list), this);
This seems overly complicated. enumerator._enumerator = new Enumerator(list) would do the job.
yeah, good point!
Current impl takes 846ms and has 12 Gen0 collections (x3.25 slower than struct)
1dc1cd2 to bed59a7
Main issue remaining so far, is no one trusts people to stop using objects post-
Closing this, as I have seen better solutions (e.g. @jaredpar's enumerator)
(For discussion)
1M enumerations from 1M objects at 40 MBytes to 5 objects at 1,224 bytes
Current impl takes 846ms and has 12 Gen0 collections (x3.25 slower than struct)
This impl takes 714ms and has 0 GC collections; 15% faster than current (x2.7 slower than struct)
(Struct enumerator takes 260ms and has 0 GC collections)
The common IEnumerator interface foreach code turns into something like an explicit GetEnumerator()/MoveNext()/Current loop over the interface, which boxes the struct enumerator to the interface and causes allocation.
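The original code samples did not survive here, so the following is a hand-written approximation of the expansion rather than the author's snippet. `ForeachLowering` and its methods are hypothetical names; the shape of the loop follows the usual `foreach` expansion (get the enumerator, loop on `MoveNext()`, read `Current`, dispose in `finally`).

```csharp
using System;
using System.Collections.Generic;

static class ForeachLowering
{
    // Roughly what `foreach (var value in ilist) sum += value;` expands to.
    public static int SumViaInterface(IList<int> ilist)
    {
        int sum = 0;
        // The call goes through IEnumerable<int>, so List<int> has to hand back
        // its struct Enumerator as an IEnumerator<int>, i.e. as a boxed copy.
        IEnumerator<int> e = ilist.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                int value = e.Current;
                sum += value;
            }
        }
        finally
        {
            if (e != null) e.Dispose();
        }
        return sum;
    }

    // The same loop against List<int> directly: the enumerator stays a struct local.
    public static int SumViaStruct(List<int> list)
    {
        int sum = 0;
        List<int>.Enumerator e = list.GetEnumerator();
        try
        {
            while (e.MoveNext()) sum += e.Current;
        }
        finally
        {
            e.Dispose();
        }
        return sum;
    }

    public static void Main()
    {
        var list = new List<int>();
        for (var i = 0; i < 100; i++) list.Add(i);

        Console.WriteLine(SumViaInterface(list)); // prints 4950
        Console.WriteLine(SumViaStruct(list));    // prints 4950
    }
}
```

Both methods compute the same result; the difference is only in how the enumerator is obtained and dispatched.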
This change introduces ObjectPool<T>, closely based on ArrayPool<T>, as an internal type (so as not to expose new API). List<T> and Dictionary<TKey, TValue> use this shared pool for class enumerators which contain the struct enumerator. This allows no boxing/allocation (beyond the initial pool allocations) when the interface is used for List and Dictionary enumeration.
The enumerator is returned to the pool on the call to Dispose and the struct enumerator set to default(Enumerator).
Interface dispatch will still be slower than using the struct enumerator, however.
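A minimal sketch of the idea, written outside the framework, might look like the following. It is not the PR's code: `SimplePool<T>` and `PooledListEnumerator<T>` are hypothetical stand-ins for `ObjectPool<T>` and the pooled class enumerator, the pool size of 8 is arbitrary, and the wrapper is built on the public `List<T>.Enumerator` rather than inside `List<T>` itself.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical fixed-size pool; stands in for the PR's internal ObjectPool<T>.
sealed class SimplePool<T> where T : class
{
    private readonly T[] _objects;

    public SimplePool(int maxPooled) { _objects = new T[maxPooled]; }

    // Takes an object from the pool, or creates one if every slot is empty.
    public T Rent(Func<T> factory)
    {
        for (int i = 0; i < _objects.Length; i++)
        {
            T obj = Interlocked.Exchange(ref _objects[i], null);
            if (obj != null) return obj;
        }
        return factory();
    }

    // Puts the object into the first free slot; drops it if the pool is full.
    public void Return(T obj)
    {
        for (int i = 0; i < _objects.Length; i++)
        {
            if (Interlocked.CompareExchange(ref _objects[i], obj, null) == null) return;
        }
    }
}

// Class enumerator that holds the struct enumerator unboxed and is recycled on Dispose.
sealed class PooledListEnumerator<T> : IEnumerator<T>
{
    private static readonly SimplePool<PooledListEnumerator<T>> s_pool =
        new SimplePool<PooledListEnumerator<T>>(8);

    private List<T>.Enumerator _enumerator;

    public static PooledListEnumerator<T> Rent(List<T> list)
    {
        PooledListEnumerator<T> e = s_pool.Rent(() => new PooledListEnumerator<T>());
        e._enumerator = list.GetEnumerator();
        return e;
    }

    public T Current => _enumerator.Current;
    object IEnumerator.Current => Current;
    public bool MoveNext() => _enumerator.MoveNext();
    public void Reset() => throw new NotSupportedException();

    public void Dispose()
    {
        // Clear the struct so the pooled object no longer references the list.
        // Using this instance after Dispose (or disposing twice) is a caller bug.
        _enumerator = default(List<T>.Enumerator);
        s_pool.Return(this);
    }
}

static class PooledEnumeratorDemo
{
    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };

        int sum = 0;
        using (var e = PooledListEnumerator<int>.Rent(list))
        {
            while (e.MoveNext()) sum += e.Current;
        }
        Console.WriteLine(sum); // prints 6

        var first = PooledListEnumerator<int>.Rent(list);
        first.Dispose();
        var second = PooledListEnumerator<int>.Rent(list);
        Console.WriteLine(ReferenceEquals(first, second)); // prints True
    }
}
```

The second part of `Main` shows the recycling: after the first enumeration the same instance is handed out again, so repeated interface enumerations need no further allocations. It also shows why reuse after Dispose is hazardous, since a stale reference would alias an enumerator that another caller has rented.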
Resolves dotnet/roslyn#4446 (Measurements below)
/cc @rynowak, @stephentoub, @davidfowl, @jaredpar, @JonHanna