Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert.Try{From/To}HexString #86556

Merged
merged 28 commits into from
Oct 31, 2023
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
94 changes: 94 additions & 0 deletions src/libraries/System.Private.CoreLib/src/System/Convert.cs
Original file line number Diff line number Diff line change
Expand Up @@ -2960,6 +2960,65 @@ public static byte[] FromHexString(ReadOnlySpan<char> chars)
return result;
}

/// <summary>
/// Converts the string, which encodes binary data as hex characters, to an equivalent 8-bit unsigned integer span.
/// </summary>
/// <param name="source">The string to convert.</param>
/// <param name="destination">
/// The span in which to write the converted 8-bit unsigned integers. When this method returns false,
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
/// either the span remains unmodified or contains an incomplete conversion of <paramref name="source"/>,
/// up to the last valid character.
/// </param>
/// <param name="bytesWritten">When this method returns, contains the number of bytes that were written in <paramref name="destination"/>.</param>
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
/// <returns>true if the conversion was successful; otherwise, false.</returns>
/// <exception cref="ArgumentNullException">Passed string <paramref name="source"/> is null.</exception>
public static bool TryFromHexString(string source, Span<byte> destination, out int bytesWritten)
{
ArgumentNullException.ThrowIfNull(source);

return TryFromHexString(source.AsSpan(), destination, out bytesWritten);
}

/// <summary>
/// Converts the span of chars, which encodes binary data as hex characters, to an equivalent 8-bit unsigned integer span.
/// </summary>
/// <param name="source">The span to convert.</param>
/// <param name="destination">
/// The span in which to write the converted 8-bit unsigned integers. When this method returns false,
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
/// either the span remains unmodified or contains an incomplete conversion of <paramref name="source"/>,
/// up to the last valid character.
/// </param>
/// <param name="bytesWritten">When this method returns, contains the number of bytes that were written in <paramref name="destination"/>.</param>
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
/// <returns>true if the conversion was successful; otherwise, false.</returns>
public static bool TryFromHexString(ReadOnlySpan<char> source, Span<byte> destination, out int bytesWritten)
{
var length = source.Length;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var length = source.Length;
int length = source.Length;

Use concrete types here. Same on other places.

Perf-wise this could be nuint, so all the division and modulo operations are treated as (fast) bit-operations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perf-wise this could be nuint, so all the division and modulo operations are treated as (fast) bit-operations.

That's interesting. We rarely use native width types except in interop contexts. @jkotas @EgorBo is it correct that in hot code of this type, it can have perf benefit for simple storage of counters and offsets?

Copy link
Member

@EgorBo EgorBo May 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afair, JIT knows that Span.Length (and Array.Length as well) are never negative so it doesn't need extra efforts from the C# side 🙂 e.g.:

int Foo(Span<int> s) => s.Length % 4;
; Method Foo(System.Span`1[int]):int:this
       mov      eax, dword ptr [rdx+08H]
       and      eax, 3
       ret      
; Total bytes of code: 7

(some of the changes landed in .NET 8.0 so might be not visible on sharplab yet)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right -- I'm not aware of patterns where nuint gives better code.

Copy link
Member

@gfoidl gfoidl May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JIT knows that Span.Length (and Array.Length as well) are never negative

Thanks for the reminder -- I'll forget sometimes about the new JIT-goodness.

We rarely use native width types except in interop contexts

Quite a lot of vectorized code uses the native width types, because memory operations don't need (sign-) extending moves then (e.g. Unsafe.Add(ref ptr, intVariable) vs. Unsafe.Add(ref ptr, nuintVariable) -- https://godbolt.org/z/v8E31Wdsx).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would Math.DivRem be more efficient here? Especially once it becomes an intrinsic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would Math.DivRem be more efficient here? Especially once it becomes an intrinsic.

I doubt it, since the JIT optimizes length % 2 != 0 to (length & 1) != 0.

Copy link
Member

@BrennanConroy BrennanConroy May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That comment wasn't just about the single length check, but also the division done a few lines later. DivRem in theory combines the two operations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the division done a few lines later

The JIT can optimize (uint)length / 2 to (uint)length >> 1. On X86, DivRem would likely be implemented using a div instruction, which is an relatively expensive operation.


if (length % 2 != 0)
goto FalseResult;

if (length == 0)
{
bytesWritten = 0;
return true;
}

var halfLength = length / 2;

if (destination.Length < halfLength)
goto FalseResult;

if (!HexConverter.TryDecodeFromUtf16(source, destination))
goto FalseResult;

bytesWritten = halfLength;
return true;

FalseResult:
bytesWritten = 0;
return false;
MihaZupan marked this conversation as resolved.
Show resolved Hide resolved
}

/// <summary>
/// Converts an array of 8-bit unsigned integers to its equivalent string representation that is encoded with uppercase hex characters.
/// </summary>
Expand Down Expand Up @@ -3011,5 +3070,40 @@ public static string ToHexString(ReadOnlySpan<byte> bytes)

return HexConverter.ToString(bytes, HexConverter.Casing.Upper);
}


/// <summary>
/// Converts a span of 8-bit unsigned integers to its equivalent span representation that is encoded with uppercase hex characters.
/// </summary>
/// <param name="source">A span of 8-bit unsigned integers.</param>
/// <param name="destination">The span representation in hex of the elements in <paramref name="source"/>.</param>
/// <param name="charsWritten">When this method returns, contains the number of chars that were written in <paramref name="destination"/>.</param>
/// <returns>true if the conversion was successful; otherwise, false.</returns>
public static bool TryToHexString(ReadOnlySpan<byte> source, Span<char> destination, out int charsWritten)
{
var length = source.Length;

if (length > int.MaxValue / 2)
goto FalseResult;

if (length == 0)
{
charsWritten = 0;
return true;
}

int twicedLength = length * 2;
hrrrrustic marked this conversation as resolved.
Show resolved Hide resolved

if (destination.Length < twicedLength)
goto FalseResult;

HexConverter.EncodeToUtf16(source, destination);
charsWritten = twicedLength;
return true;

FalseResult:
charsWritten = 0;
return false;
MihaZupan marked this conversation as resolved.
Show resolved Hide resolved
}
} // class Convert
} // namespace
Original file line number Diff line number Diff line change
Expand Up @@ -34,73 +34,50 @@ public static void CompleteValueRange()

private static void TestSequence(byte[] expected, string actual)
{
Assert.Equal(expected, Convert.FromHexString(actual));
}
byte[] fromResult = Convert.FromHexString(actual);
Assert.Equal(expected, fromResult);

[Fact]
public static void InvalidInputString_Null()
{
AssertExtensions.Throws<ArgumentNullException>("s", () => Convert.FromHexString(null));
}
Span<byte> tryResult = stackalloc byte[actual.Length / 2];
adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
Assert.True(Convert.TryFromHexString(actual, tryResult, out int written));
Assert.Equal(fromResult.Length, written);
AssertExtensions.SequenceEqual(expected, tryResult);

adamsitnik marked this conversation as resolved.
Show resolved Hide resolved
[Fact]
public static void InvalidInputString_HalfByte()
{
Assert.Throws<FormatException>(() => Convert.FromHexString("ABC"));
}

[Fact]
public static void InvalidInputString_BadFirstCharacter()
{
Assert.Throws<FormatException>(() => Convert.FromHexString("x0"));
}

[Fact]
public static void InvalidInputString_BadSecondCharacter()
{
Assert.Throws<FormatException>(() => Convert.FromHexString("0x"));
}

[Fact]
public static void InvalidInputString_NonAsciiCharacter()
{
Assert.Throws<FormatException>(() => Convert.FromHexString("0\u0308"));
}

[Fact]
public static void InvalidInputString_ZeroWidthSpace()
{
Assert.Throws<FormatException>(() => Convert.FromHexString("\u200B 000102FDFEFF"));
}

[Fact]
public static void InvalidInputString_LeadingWhiteSpace()
{
Assert.Throws<FormatException>(() => Convert.FromHexString(" 000102FDFEFF"));
}

[Fact]
public static void InvalidInputString_TrailingWhiteSpace()
public static void InvalidInputString_Null()
{
Assert.Throws<FormatException>(() => Convert.FromHexString("000102FDFEFF "));
AssertExtensions.Throws<ArgumentNullException>("s", () => Convert.FromHexString(null));
Assert.False(Convert.TryFromHexString(null, default, out _));
}

[Fact]
public static void InvalidInputString_WhiteSpace()
[Theory]
[InlineData("01-02-FD-FE-FF")]
[InlineData("00 01 02FD FE FF")]
[InlineData("000102FDFEFF ")]
[InlineData(" 000102FDFEFF")]
[InlineData("\u200B 000102FDFEFF")]
[InlineData("0\u0308")]
[InlineData("0x")]
[InlineData("x0")]
[InlineData("ABC")] // HalfByte
public static void InvalidInputString_FormatException_Or_FalseResult(string invalidInput)
{
Assert.Throws<FormatException>(() => Convert.FromHexString("00 01 02FD FE FF"));
}
Assert.Throws<FormatException>(() => Convert.FromHexString(invalidInput));

[Fact]
public static void InvalidInputString_Dash()
{
Assert.Throws<FormatException>(() => Convert.FromHexString("01-02-FD-FE-FF"));
Span<byte> buffer = stackalloc byte[invalidInput.Length / 2];
Assert.False(Convert.TryFromHexString(invalidInput.AsSpan(), buffer, out _));
}

[Fact]
public static void ZeroLength()
{
Assert.Same(Array.Empty<byte>(), Convert.FromHexString(string.Empty));

bool tryResult = Convert.TryFromHexString(string.Empty, Span<byte>.Empty, out int written);

Assert.True(tryResult);
Assert.Equal(0, written);
}

[Fact]
Expand Down