[Proposal]: First class native integer support #4385

Closed
1 of 4 tasks
tannergooding opened this issue Feb 3, 2021 · 10 comments · Fixed by #6167
@tannergooding
Member

tannergooding commented Feb 3, 2021

First class native integer support

  • Proposed
  • Prototype: Not Started
  • Implementation: Not Started
  • Specification: Not Started

Speclet: https://github.com/dotnet/csharplang/blob/main/proposals/csharp-11.0/numeric-intptr.md

Summary

Native integers should be fully integrated into the language such that they are "first class citizens". They are currently missing out in areas where a native integer is the logical type to use such as in Array creation, Array element access, pointer conversions, pointer element access, and pointer arithmetic.

Motivation

Native integers are implicitly supported in many of these scenarios today because they have an implicit conversion to long or ulong, which is explicitly supported in said scenarios. However, long/ulong are "too large" on 32-bit platforms, so their usage generally involves an explicit overflow check, leading to less efficient code generation, particularly when most underlying platforms eventually convert to nint/nuint anyway, as that is the size required to address memory.

Detailed design

Array Creation expressions

https://github.com/dotnet/csharplang/blob/master/spec/expressions.md#array-creation-expressions should be reworded to:

Each expression in the expression list must be of type nint, nuint, or of a type implicitly convertible to int, uint, long, or ulong.

Following evaluation of each expression, the expression must be of type nint, nuint, or of a type for which an implicit conversion (Implicit conversions) to int, uint, long, or ulong can be performed. When an implicit conversion is performed, the first type in this list for which an implicit conversion exists is chosen.

Array element access

https://github.com/dotnet/csharplang/blob/master/spec/arrays.md#array-element-access should be reworded to:

Array elements are accessed using element_access expressions (Array access) of the form A[I1, I2, ..., In], where A is an expression of an array type and each Ix is an expression of type nint, nuint, or of a type implicitly convertible to int, uint, long, or ulong.

https://github.com/dotnet/csharplang/blob/master/spec/expressions.md#array-access should be reworded to:

The number of expressions in the argument_list must be the same as the rank of the array_type, and each expression must be of type nint, nuint, or of a type implicitly convertible to int, uint, long, or ulong.

Following evaluation of each index expression, the expression must be of type nint, nuint, or of a type for which an implicit conversion (Implicit conversions) to int, uint, long, or ulong can be performed. When an implicit conversion is performed, the first type in this list for which an implicit conversion exists is chosen.
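As an illustration of the array creation and array access wording above, here is a minimal sketch. The class and method names (`NativeIndexDemo`, `Squares`, `ElementAt`) are hypothetical and exist only for this example. It assumes C# 9 or later, where, per the Motivation section, nint/nuint are accepted in these positions through their implicit conversions to long/ulong; under the proposed wording they would be accepted directly.

```csharp
using System;

public static class NativeIndexDemo
{
    // Allocates an array whose length is a native integer and fills it
    // using a native-integer loop counter as the element index.
    public static int[] Squares(nint count)
    {
        var result = new int[count];          // nint used as the array length
        for (nint i = 0; i < count; i++)
        {
            result[i] = (int)(i * i);         // nint used as the index
        }
        return result;
    }

    // Reads an element using an unsigned native-integer index.
    public static int ElementAt(int[] values, nuint index) => values[index];
}
```

The intent of the proposal is that code like this keeps compiling unchanged while the compiler is free to index with the native-sized value directly.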

Pointer types

https://github.com/dotnet/csharplang/blob/master/spec/unsafe-code.md#pointer-types should be reworded to:

In other words, an unmanaged_type is one of the following:

  • sbyte, byte, short, ushort, int, uint, nint, nuint, long, ulong, char, float, double, decimal, or bool.

Pointer conversions

https://github.com/dotnet/csharplang/blob/master/spec/unsafe-code.md#pointer-conversions should be reworded to:

Additionally, in an unsafe context, the set of available explicit conversions (https://github.com/dotnet/csharplang/blob/master/spec/conversions.md#explicit-conversions) is extended to include the following explicit pointer conversions:

  • From any pointer_type to any other pointer_type.
  • From sbyte, byte, short, ushort, int, uint, nint, nuint, long, or ulong to any pointer_type.
  • From any pointer_type to sbyte, byte, short, ushort, int, uint, nint, nuint, long, or ulong.

However, on 32- and 64-bit CPU architectures with a linear address space, conversions of pointers to or from integral types typically behave exactly like conversions of uint, nuint, or ulong values, respectively, to or from those integral types.
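A small sketch of the explicit pointer conversions listed above, round-tripping a pointer through nuint. The names `PointerConversionDemo`, `RoundTrips`, and `Demo` are made up for this example, and it assumes the project is compiled with unsafe code enabled (`AllowUnsafeBlocks`).

```csharp
public static unsafe class PointerConversionDemo
{
    // Converts a pointer to a native-sized integer and back, then reports
    // whether the address survived the round trip unchanged.
    public static bool RoundTrips(int* p)
    {
        nuint address = (nuint)p;    // explicit: pointer_type -> nuint
        int* back = (int*)address;   // explicit: nuint -> pointer_type
        return back == p;
    }

    // Convenience wrapper so callers do not need an unsafe context.
    public static bool Demo()
    {
        int value = 42;
        return RoundTrips(&value);
    }
}
```

Because nuint is exactly pointer-sized, the round trip needs no widening or truncation on either 32-bit or 64-bit targets.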

Pointer element access

https://github.com/dotnet/csharplang/blob/master/spec/unsafe-code.md#pointer-element-access should be reworded to:

In a pointer element access of the form P[E], P must be an expression of a pointer type other than void*, and E must be an expression of type nint, nuint, or of a type that can be implicitly converted to int, uint, long, or ulong.
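To make the P[E] form concrete, the following sketch indexes a pointer with an nint. `PointerElementAccessDemo`, `SumFirst`, and `Demo` are hypothetical names, and unsafe code must be enabled for it to compile.

```csharp
public static unsafe class PointerElementAccessDemo
{
    // Sums the first `count` elements reachable from `p`,
    // using a native integer as the element index E in P[E].
    public static int SumFirst(int* p, nint count)
    {
        int sum = 0;
        for (nint i = 0; i < count; i++)
        {
            sum += p[i];
        }
        return sum;
    }

    // Builds a small stack buffer and sums it.
    public static int Demo()
    {
        int* buffer = stackalloc int[4] { 1, 2, 3, 4 };
        return SumFirst(buffer, 4);
    }
}
```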

Pointer arithmetic

https://github.com/dotnet/csharplang/blob/master/spec/unsafe-code.md#pointer-arithmetic should be reworded to:

Thus, for every pointer type T*, the following operators are implicitly defined:

T* operator +(T* x, int y);
T* operator +(T* x, uint y);
T* operator +(T* x, nint y);
T* operator +(T* x, nuint y);
T* operator +(T* x, long y);
T* operator +(T* x, ulong y);

T* operator +(int x, T* y);
T* operator +(uint x, T* y);
T* operator +(nint x, T* y);
T* operator +(nuint x, T* y);
T* operator +(long x, T* y);
T* operator +(ulong x, T* y);

T* operator -(T* x, int y);
T* operator -(T* x, uint y);
T* operator -(T* x, nint y);
T* operator -(T* x, nuint y);
T* operator -(T* x, long y);
T* operator -(T* x, ulong y);

long operator -(T* x, T* y);

Given an expression P of a pointer type T* and an expression N of type int, uint, nint, nuint, long, or ulong, the expressions P + N and N + P compute the pointer value of type T* that results from adding N * sizeof(T) to the address given by P. Likewise, the expression P - N computes the pointer value of type T* that results from subtracting N * sizeof(T) from the address given by P.
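The scaling rule above (adding N * sizeof(T) to the address) can be checked with a short sketch. The names below are hypothetical, unsafe code must be enabled, and with current compilers the nint operand is assumed to bind through its implicit conversion to long rather than through a dedicated nint overload.

```csharp
public static unsafe class PointerArithmeticDemo
{
    // Returns the distance in bytes between p and p + n.
    // For int* this is n * sizeof(int).
    public static long ByteOffset(int* p, nint n)
    {
        int* moved = p + n;
        return (byte*)moved - (byte*)p;
    }

    // Checks that P + N equals N + P and that subtracting N undoes the addition.
    public static bool Symmetric(int* p, nint n)
    {
        return (p + n) == (n + p) && ((p + n) - n) == p;
    }

    // Stays inside an 8-element stack buffer so no pointer leaves its allocation.
    public static long DemoByteOffset()
    {
        int* buffer = stackalloc int[8];
        return ByteOffset(buffer, 3);
    }

    public static bool DemoSymmetric()
    {
        int* buffer = stackalloc int[8];
        return Symmetric(buffer, 3);
    }
}
```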

Drawbacks

It adds additional complexity to the language that could potentially be handled in another fashion.

Alternatives

The impact of not doing this is that the backing code generated by the compiler is typically worse because long and ulong are typically not valid indexers on the underlying host platform, meaning the C# compiler must account for this and insert additional overflow checks.

However, it could also be handled by updating the emitter to account for nint and nuint conversions to long and ulong. There may also be some additional optimizations possible in the JIT to allow folding away unnecessary overflow conversions.

Unresolved questions

It doesn't appear as though the language spec has been updated to include the nint and nuint additions from C# 9, so it is unclear if a few of the suggestions above are already handled. Likewise, there are some additional locations throughout the spec that should be validated to cover nint and nuint.

In Pointer arithmetic, there exists the following operator: long operator -(T* x, T* y);. Had this been designed from C# v1.0, the logical return type would have been nint, which is the equivalent of ptrdiff_t on C/C++ platforms. The rules on how to add this without breaking compatibility are somewhat unclear to me. It may be possible to just add nint operator -(T* x, T* y), but that is "effectively" overloading based on return type, and while nint is in turn implicitly convertible to long, there may be edge cases I'm unaware of that would prevent this.
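For reference, this sketch shows what pointer subtraction looks like today: the result is typed long, and getting the ptrdiff_t-like nint requires an explicit narrowing cast. `PointerDifferenceDemo` and its members are hypothetical names, and unsafe code must be enabled.

```csharp
public static unsafe class PointerDifferenceDemo
{
    // Today: T* - T* yields a long element count.
    public static long ElementDistance(int* start, int* end) => end - start;

    // Getting a native-sized result currently needs an explicit cast from long.
    public static nint ElementDistanceNative(int* start, int* end) => (nint)(end - start);

    public static long Demo()
    {
        int* buffer = stackalloc int[8];
        return ElementDistance(buffer, buffer + 5);
    }

    public static nint DemoNative()
    {
        int* buffer = stackalloc int[8];
        return ElementDistanceNative(buffer, buffer + 5);
    }
}
```

If an nint-returning operator were added, the cast in `ElementDistanceNative` would become unnecessary, while `ElementDistance` would presumably keep compiling through the implicit nint to long conversion; whether that holds in every case is the open question raised above.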

Design meetings

@tannergooding
Member Author

For multi-dimensional or non-zero based arrays, the spec makes no mention of needing to use Array.GetValue or Array.SetValue to access them and so I did not feel it was important to call out in the above that currently only overloads taking int and long exist.

If overloads taking nint or nuint exist in the future, it should be possible for them to be preferred and otherwise an implicit conversion to the Int64 overload can be used instead which is how they behave today.

@AlekseyTs
Contributor

AlekseyTs commented Feb 4, 2021

The "Pointer member access" section should probably be renamed to "Pointer element access" and refer to https://github.com/dotnet/csharplang/blob/master/spec/unsafe-code.md#pointer-element-access instead.

@tannergooding
Member Author

The "Pointer member access" section should probably be renamed to "Pointer element access" and refer to https://github.com/dotnet/csharplang/blob/master/spec/unsafe-code.md#pointer-element-access instead.

Updated, I must have missed which was the nearest header 😄

333fred added this to the Working Set milestone Feb 11, 2021
@lostmsu

lostmsu commented Mar 16, 2021

How about support for nuint/nint in index and range expressions? An alternative to current Int32-based System.Range and System.Index would be required.

@CyrusNajmabadi
Member

I think this needs to go through the runtime first to see if they actually want a different API here. If they do, then we can consider whether it's something we want first class support for at the language level.

@lostmsu

lostmsu commented Mar 17, 2021

@CyrusNajmabadi does the runtime provide any special support for the Index and Range types? It does not look like it. I used IndexRange from NuGet for older targets with no issues whatsoever, and it has a dead simple implementation.

Performance wise runtime can catch up later.

@HaloFour
Contributor

For language features that require APIs they are only going to target those supplied by the runtime for the supported compiler and runtime version. Yes, you can polyfill older runtimes, but that's no longer considered supported, and it's very unlikely that the language would ship with a language change that would require a polyfill.

@naine

naine commented Mar 17, 2021

@lostmsu It's not so much that the types are tightly coupled to the runtime. The issue is with the APIs and implementations of those types, which are provided by the runtime (where they're built in), or in your case by the nuget package.

In both cases, the API and implementations for Index and Range only support indexes within the positive range of Int32 and can't be constructed from other types. Allowing native ints in index/range expressions would require the types to be updated with API overloads that accept native ints, and that decision is up to the runtime team. (It almost certainly would not happen, since existing consumers of these types can currently assume an Index represents a position within the positive range of Int32, and changing this would break them.)

Notice how the proposal only modifies parts of the language that currently apply to any of int, uint, long, or ulong. This isn't the case for index/range expressions.

@lostmsu

lostmsu commented Mar 17, 2021

@HaloFour

Yes, you can polyfill older runtimes, but that's no longer considered supported

What is the reasoning behind that? I used IndexRange package I mentioned before for netstandard2.0, and it worked marvelously for my purposes, until I hit the 32 bit limit.

Unlike Span<T>, generic Index<T> and Range<T> types need absolutely nothing special from runtime beyond .NET Framework 2.0 (!) level.

@naine and that's why I'd prefer a new Index<T> type. This does not require any existing code modification: existing non-generic members will always have priority over the new generic ones. Existing language spec takes care of that.

@HaloFour
Contributor

@lostmsu

What is the reasoning behind that?

I believe that the language team doesn't want to get into the situation of having to describe which features depend on runtime support and which features can be polyfilled, so the official stance is that the language and runtime will evolve in lockstep and that down-level targeting is unsupported. That stance started with C# 8.0 and .NET Core 3.0 due to Default Interface Methods. For that reason I doubt that the language team would consider adding features that depended on APIs unless the supporting APIs were scheduled to ship in the corresponding runtime release.
