-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[API Proposal]: new TypeLocal
class to get/set arbitrary values based on a supplied Generic Type
#59718
Comments
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label. |
I see the following risks:
DependentHandle or ConditionalWeakTable can be used to achieve the same effect without introducing a new API. |
I did mention
I don't understand why this would cause a memory leak. Or maybe you are calling out why the "API Proposal" did not include a
Can you please elaborate? I don't see how using either |
Because the stored value (of reference type) will live forever until
Sure: public static class TypeStore<T>
{
private static readonly ConditionalWeakTable<Type, object> _storage = new();
public bool TryGet<TValue>(out TValue result) => _storage.TryGetValue(typeof(TValue), out result);
public void Set<TValue>(TValue value) => _storage.AddOrUpdate(typeof(TValue), value);
} Now the problem with a memory leak becomes more clear. According to ConditionalWeakTable docs, the table keeps weak reference to the key. When the key has no strong references, the value is reclaimable by GC. However, typeof(TValue) returns the same
It means that the object returned by Generally speaking, there is no lifetime concept applicable to constructed generic type. Probably, the best choice here is using Now let's return back to API proposal: var obj = new object();
TypeLocal<string>.Set(obj); What the lifetime of |
I think you are misunderstanding the intent behind this proposal. The intent is to have an instance based storage of values per type, that does not require a dictionary lookup. The example you provide using I am working under the assumption that the static based workflow (repeated below) does not require a dictionary lookup: public static class TypeStore<T>{
public static int Value;
} This is an excellent high-performance way of storing/retrieving data on a per-type basis, but it is only available as a static global. For high performance scenarios it would be very beneficial to have the same benefit on a per-instance basis.
It would be used very similar to a dictionary, not a global reference. I think my explanation of the existing global static workflow is confusing your interpretation of my proposal. The usage would be the same as a To be clear, this proposal would require modifications to the dotnet runtime. The skills required to speculate the extent of the modifications are sadly beyond my reach. |
Tagging subscribers to this area: @dotnet/area-system-runtime Issue DetailsBackground and motivationYou can store and retrieve Type specific values efficiently using static classes, such as: public static class TypeStore<T>{
public static int Value;
} thus, for a given type supplied to a Generic method you can set/get a custom value without resorting to dictionary lookups based on the Type. This is excellent, but there is no way to do this on a per-instance basis. Doing so can greatly improve efficiency of code needing per-type values by removing a dictionary lookup each time a type is supplied. This is a feature request for performance optimization reasons. A real-world example situation where this is helpful is in a game engine "Entity Component System" where the user (the game developer) arbitrarily wants to pick a component of type API Proposal/// <summary>
/// similar to ThreadLocal, but provides a value per type.
/// </summary>
/// <typeparam name="TValue"></typeparam>
public interface TypeLocal<TValue>
{
public TValue Get<TType>();
public bool TryGet<TType>(out TValue value);
public void Set<TType>(TValue value);
public void Remove<TType>();
public bool HasValue<TType>();
public IEnumerator<(Type type, TValue value)> GetEnumerator();
} API Usagepublic class SomeClass
{
private int indexCounter;
public int GetTypeIndex<T>()
{
var typeLocal = new TypeLocal<int>();
if (!typeLocal.TryGet<T>(out var index){
index = indexCounter++;
typeLocal.Set<T>(index);
}
return index;
}
} Overall the general usage pattern would be very similar to a normal Risks???
|
@jasonswearingen , thanks for clarification! If I understand correctly,
|
you are right, I'm sorry I rushed through the example unthoughtfully. here is a better example. Please let me know if it needs further explanation. public class MyLookupHelper //helper to get an array index
{
private int indexCounter;
private TypeLocal<int> typeLocal = new();
public int GetIndex<T>()
{
if (!typeLocal.TryGet<T>(out var index){
index = indexCounter++;
typeLocal.Set<T>(index);
}
return index;
}
} PS: will update the original example with this too. |
Could you elaborate on what you mean here? Even with RyuJIT in particular will currently ensure that this is "specialized" for value type https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIGYACY8pJ0hgFQE8AHGAZQzQYAHg4A+AN40Gspo2asAlgDsMDAGrYANgFcYAbhoBfGjXpsGAYQbTqc+UxZMUDALIAKAJQy5dhw7MAJwe3HyCwiKqGOIAdFp6MF5G9gGywaG8AkKwIgBm2hDYMfE6+sm+aRlh2ZEQwABWMGAlCeUpaenkITURucwADHFtSR2d1Vl9olbDZaOVDKbUxkA=== shows an example where the disassembly of: Console.WriteLine(TypeStore<int>.Value);
Console.WriteLine(TypeStore<float>.Value);
Console.WriteLine(TypeStore<object>.Value);
Console.WriteLine(TypeStore<string>.Value);
Console.WriteLine(TypeStore<C>.Value); is: L0000: sub rsp, 0x28
L0004: mov rcx, 0x7ff9ee6fcb90
L000e: xor edx, edx
L0010: call 0x00007ffa49cc8b60
L0015: mov ecx, [rax+8]
L0018: call System.Console.WriteLine(Int32)
L001d: mov rcx, 0x7ff9ee6fcb90
L0027: mov edx, 1
L002c: call 0x00007ffa49cc8b60
L0031: mov ecx, [rax+8]
L0034: call System.Console.WriteLine(Int32)
L0039: mov rcx, 0x7ff9ee6fcb90
L0043: mov edx, 2
L0048: call 0x00007ffa49cc8b60
L004d: mov ecx, [rax+8]
L0050: call System.Console.WriteLine(Int32)
L0055: mov rcx, 0x7ff9ee6fcb90
L005f: mov edx, 3
L0064: call 0x00007ffa49cc8b60
L0069: mov ecx, [rax+8]
L006c: call System.Console.WriteLine(Int32)
L0071: mov rcx, 0x7ff9ee6fcb90
L007b: mov edx, 4
L0080: call 0x00007ffa49cc8b60
L0085: mov ecx, [rax+8]
L0088: add rsp, 0x28
L008c: jmp System.Console.WriteLine(Int32) |
I think the best explanation would be a benchmark comparison. Here is one showing the performance discrepancy: BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19042.1237 (20H2/October2020Update)
AMD Ryzen 7 3700X, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.100-rc.1.21458.32
[Host] : .NET 6.0.0 (6.0.21.45113), X64 RyuJIT
DefaultJob : .NET 6.0.0 (6.0.21.45113), X64 RyuJIT
| Method | Mean | Error | StdDev |
|------------------- |------------:|----------:|----------:|
| GlobalLookupTest | 18.68 us | 0.226 us | 0.189 us |
| InstanceLookupTest | 1,271.44 us | 24.573 us | 26.293 us | Here is the code associated with the benchmark, run via BenchmarkDotNet ***(Click to expand)***using BenchmarkDotNet.Attributes;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace lowlevel_benchmark.Benchmarks;
public abstract class GlobalLookup
{
protected static int indexCounter;
public static int Get<T>()
{
return GlobalLookup<T>.Get();
}
}
public class GlobalLookup<T> : GlobalLookup
{
private static int index = Interlocked.Increment(ref indexCounter);
public static int Get()
{
return index;
}
}
public class InstanceLookup
{
private int indexCounter;
private Dictionary<Type, int> storage = new();
public int Get<T>()
{
if (!storage.TryGetValue(typeof(T), out var index))
{
index = indexCounter++;
storage.Add(typeof(T), index);
}
return index;
}
}
public class TypeBased_Lookups
{
private int LOOPCOUNT = 10000;
private InstanceLookup instanceLookup = new();
[Benchmark]
public int GlobalLookupTest()
{
var toReturn = 0;
for (var i = 0; i < LOOPCOUNT; i++)
{
//toReturn += GlobalLookup.Get<int>();
//toReturn += GlobalLookup.Get<float>();
//toReturn += GlobalLookup.Get<bool>();
//toReturn += GlobalLookup.Get<long>();
//toReturn += GlobalLookup.Get<byte>();
//toReturn += GlobalLookup.Get<short>();
//toReturn += GlobalLookup.Get<double>();
//toReturn += GlobalLookup.Get<string>();
//toReturn += GlobalLookup.Get<object>();
toReturn += GlobalLookup.Get<GlobalLookup<int>>();
toReturn += GlobalLookup.Get<GlobalLookup<string>>();
toReturn += GlobalLookup.Get<GlobalLookup<object>>();
toReturn += GlobalLookup.Get<GlobalLookup<TypeBased_Lookups>>();
toReturn += GlobalLookup.Get<InstanceLookup>();
toReturn += GlobalLookup.Get<TypeBased_Lookups>();
toReturn += GlobalLookup.Get<System.Threading.Thread>();
toReturn += GlobalLookup.Get<System.Collections.ArrayList>();
}
return toReturn;
}
[Benchmark]
public int InstanceLookupTest()
{
var toReturn = 0;
for (var i = 0; i < LOOPCOUNT; i++)
{
//toReturn += instanceLookup.Get<int>();
//toReturn += instanceLookup.Get<float>();
//toReturn += instanceLookup.Get<bool>();
//toReturn += instanceLookup.Get<long>();
//toReturn += instanceLookup.Get<byte>();
//toReturn += instanceLookup.Get<short>();
//toReturn += instanceLookup.Get<double>();
//toReturn += instanceLookup.Get<string>();
//toReturn += instanceLookup.Get<object>();
toReturn += instanceLookup.Get<GlobalLookup<int>>();
toReturn += instanceLookup.Get<GlobalLookup<string>>();
toReturn += instanceLookup.Get<GlobalLookup<object>>();
toReturn += instanceLookup.Get<GlobalLookup<TypeBased_Lookups>>();
toReturn += instanceLookup.Get<InstanceLookup>();
toReturn += instanceLookup.Get<TypeBased_Lookups>();
toReturn += instanceLookup.Get<System.Threading.Thread>();
toReturn += instanceLookup.Get<System.Collections.ArrayList>();
}
return toReturn;
}
} They do the same thing, but one is a static, the other is an instance. |
public int Get<T>()
{
if(!storage.TryGetValue(typeof(T), out var index))
{
index = indexCounter++;
storage.Add(typeof(T), index);
}
return index;
} You can optimize this code. Currently, it requires two lookups: |
I updated my benchmark (prior post) to only use reference types. (prior it was mostly the special value types). The results still hold.
I believe that is true for only the first loop through the first benchmark run, all further runs will just do the single lookup. But, in interest of trying, I did what you suggest: ***(Click to expand)***public class InstanceLookup
{
private int indexCounter;
private Dictionary<Type, int> storage = new();
public int Get<T>()
{
ref var toReturn = ref System.Runtime.InteropServices.CollectionsMarshal.GetValueRefOrAddDefault(storage, typeof(T), out var exists);
if (!exists)
{
toReturn = indexCounter++;
}
return toReturn;
//if (!storage.TryGetValue(typeof(T), out var index))
//{
// index = indexCounter++;
// storage.Add(typeof(T), index);
//}
//return index;
}
} and get slightly better results: | Method | Mean | Error | StdDev |
|------------------- |------------:|----------:|----------:|
| GlobalLookupTest | 18.64 us | 0.132 us | 0.110 us |
| InstanceLookupTest | 1,021.10 us | 20.000 us | 28.684 us | |
@jasonswearingen , I found the solution for you. As far as I understand, you trying to emulate the following data type: public struct TypeLocal<TValue>
{
private TValue valueForType1;
private TValue valueForType2;
private TValue valueForType3;
} But the number of fields is not known because actual Here is the solution and the benchmark: [SimpleJob(runStrategy: RunStrategy.Throughput, launchCount: 1)]
[Orderer(SummaryOrderPolicy.FastestToSlowest)]
public class LookupBenchmark
{
public static class StaticTypeLocal<TValue>
{
private static class TypeStore<T>
{
internal static TValue Value;
}
public static TValue Get<T>() => TypeStore<T>.Value;
public static void Set<T>(TValue value) => TypeStore<T>.Value = value;
}
public readonly struct InstanceTypeLocalDictionary<TValue>
{
private readonly Dictionary<Type, TValue> _storage = new(10);
public TValue Get<T>() => _storage[typeof(T)];
public void Set<T>(TValue value) => _storage[typeof(T)] = value;
}
public struct InstanceTypeLocalArray<TValue>
{
private static volatile int TypeIndex = -1;
private static class TypeSlot<T>
{
internal static readonly int Index = Interlocked.Increment(ref TypeIndex);
}
private TValue[] storage;
public InstanceTypeLocalArray()
{
storage = new TValue[Math.Max(1, TypeIndex + 1)];
}
private TValue[] EnsureStorageCapacity<T>()
{
if (TypeSlot<T>.Index >= storage.Length)
Array.Resize(ref storage, TypeSlot<T>.Index + 1);
return storage;
}
public void Set<T>(TValue value)
{
Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(EnsureStorageCapacity<T>()), TypeSlot<T>.Index) = value;
}
public TValue Get<T>()
{
return Unsafe.Add(ref MemoryMarshal.GetArrayDataReference(EnsureStorageCapacity<T>()), TypeSlot<T>.Index);
}
}
private InstanceTypeLocalDictionary<int> typeLocalDictionary = new();
private InstanceTypeLocalArray<int> typeLocalArray = new();
[Benchmark]
public int UsingStaticClass()
{
StaticTypeLocal<int>.Set<string>(10);
return StaticTypeLocal<int>.Get<string>();
}
[Benchmark]
public int UsingDictionary()
{
typeLocalDictionary.Set<string>(10);
return typeLocalDictionary.Get<string>();
}
[Benchmark]
public int UsingArray()
{
typeLocalArray.Set<string>(10);
return typeLocalArray.Get<string>();
}
} My benchmark results:
Additionally, the solution can be optimized for memory consumption. Actually, we can rent the array from the pool, implement There is one drawback of the solution: if you have a large number of Also, I'm not pretty sure that that guys from .NET Team will approve that API because its use case is very specific. If this will not happen, I can suggest to move the implementation of this data structure to .NEXT library which is a part of the .NET Foundation. P.S.: struct in my example is just a way to remove the indirection. Probably, the best choice for |
Thanks for your help @sakno , I implemented a similar workaround, but yours is better, in that you were able to figure out how to encapsulate it into a I do think that a runtime native solution would be greatly beneficial: as, besides the space inefficiencies you mentioned, your benchmark shows that in some circumstances it's still a 35x performance increase using the Static vs the Array solution. |
There is no magic and CLR/C++ level. The solution still requires array-like data structure, no matter in which language it is implemented. IMO, ~ 1.6 ns is a very good result. |
Here is another way. :)
|
@jasonswearingen I know this post is kind of old, but DotNext provides a couple of different type map implementations and their documentation links here. Might be worth you giving it a look. |
Background and motivation
You can store and retrieve Type specific values efficiently using static classes, such as:
thus, for a given type supplied to a Generic method you can set/get a custom value without resorting to dictionary lookups based on the Type.
This is excellent, but there is no way to do this on a per-instance basis. Doing so can greatly improve efficiency of code needing per-type values by removing a dictionary lookup each time a type is supplied.
This is a feature request for performance optimization reasons. A real-world example situation where this is helpful is in a game engine "Entity Component System" where the user (the game developer) arbitrarily wants to pick a component of type
TComponent
, and the game engine needs to somehow look this up in a very efficient manner, due to frame time constraints plus that this kind of lookup may be done many times during a game frame.API Proposal
API Usage
Note this is an updated example because my original example was not clear:
Overall the general usage pattern would be very similar to a normal
Dictionary<Type,Value>
Risks
???
The text was updated successfully, but these errors were encountered: