Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Add StringBuilder Span-based APIs #13163

Merged
merged 3 commits into from
Aug 8, 2017

Conversation

kevingosse
Copy link

Contributes to https://github.com/dotnet/corefx/issues/22430

  • Add span-based Append/Insert/CopyTo methods to StringBuilder
  • Rewrote the existing CopyTo method into an overload of the span-based version, to reduce code duplication

@dnfclas
Copy link

dnfclas commented Aug 2, 2017

@KooKiz,
Thanks for your contribution.
To ensure that the project team has proper rights to use your work, please complete the Contribution License Agreement at https://cla2.dotnetfoundation.org.

It will cover your contributions to all .NET Foundation-managed open source projects.
Thanks,
.NET Foundation Pull Request Bot

@@ -2001,6 +2043,22 @@ private static unsafe void ThreadSafeCopy(char* sourcePtr, char[] destination, i
}
}

private static unsafe void ThreadSafeCopy(char* sourcePtr, Span<char> destination, int destinationIndex, int count)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tentative about those overloads, as they significantly increase code duplication. Still, I have yet to come with an elegant way to factorize them (short of making all codepaths going through the Span version, with a performance penalty). Suggestions are much appreciated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know for sure that there is a performance penalty? If so, and Span is the culprit, we'd love to see examples.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My take was that there is necessarily some overhead, since we would add the conversion from array to span on top of the existing code. But it's true that this overhead is probably not significant in this context. I'll run some benchmark to make sure.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After spending more time on it, I realized that the ThreadSafeCopy overload was only used by CopyTo, so I don't actually have to impact the other codepaths

@kevingosse
Copy link
Author

@dotnet-bot test this please

The char[] based one was called only by CopyTo so it wasn't needed anymore. Replaced by a Span based version
{
if (count > 0)
{
if ((uint)sourceIndex <= (uint)source.Length && (sourceIndex + count) <= source.Length)
if ((uint)sourceIndex > (uint)source.Length || (sourceIndex + count) > source.Length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int overflow risk

(sourceIndex + count) > source.Length => count > (source.Length - sourceIndex)

}
else

if ((uint)destinationIndex > (uint)destination.Length || (destinationIndex + count) > destination.Length)
Copy link
Member

@benaadams benaadams Aug 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int overflow risk?

(destinationIndex + count) > destination.Length) => count > (destination.Length - destinationIndex)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the same condition exists on the other ThreadSafeCopy overload (that I didn't touch), line 2034.
Still, I agree on the risk. Gonna do some further testing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source.Length - destinationIndex

@benaadams, copy and paste error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah :)

{
unsafe
{
fixed (char* valueChars = &value.DangerousGetPinnableReference())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, I'd prefer to see this flipped -- that is it would be preferable to have the pointer overload create a span and pass that throughout the class. That realizes part of the purpose of spans -- avoiding memory corruption while still being fast. If you're not comfortable doing that in this change, I do understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calls string.wstrcpy though which wants pointers; would need to use equivalent span function in string to that?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there's not currently interop support for Span (@yizhang82, any plans?), I agree there are places where pointers are still necessary. However, if we minimize those places, it mitigates cases like the int overflows you pointed out.

Copy link
Member

@benaadams benaadams Aug 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add an internal wstrcpy that takes Span and call that instead?

i.e. have the pointers in string rather than elsewhere

/cc @stephentoub

Copy link
Member

@stephentoub stephentoub Aug 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep such things separate. There are two distinct tracks for span in coreclr/corefx:

  • Using it internally to clean up duplicated code, to make existing unsafe code safer, to make it easier to stack allocate in places we're currently heap allocating, etc.
  • Adding new APIs that accept span

I'd like to largely keep those separate. If it's easy to achieve the former while doing the latter, great, or if the best way to implement the API is by changing existing implementation, ok, but doing so also makes it more likely we'll incur performance regressions, especially at this early stage where we haven't shaken out all of the possible perf issues with span. Any such changes need to be measured and validated rigorously.

if (count < 0)
{
throw new ArgumentOutOfRangeException(nameof(count), SR.Arg_NegativeArgCount);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically this a breaking change, as if multiple invalid inputs were provided, it'll change the order in which we check and thus will change what exceptions are thrown. That said, I personally think that's an acceptable level of break, but others should weigh in... @weshaggard?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the change is acceptable as well.

@stephentoub
Copy link
Member

Thanks, @kookiz. Have you run the corefx tests against this change? Assuming yes, LGTM. If not, instructions for doing so are at https://github.com/dotnet/corefx/blob/7059f8b316723f5db64e4b1d28e2b1e8b137ab1e/Documentation/project-docs/developer-guide.md#testing-with-private-coreclr-bits.

@kevingosse
Copy link
Author

kevingosse commented Aug 4, 2017

@stephentoub I did, but it's interesting that you're asking. Isn't the continuous delivery supposed to automatically run the tests when I push a commit?

On an unrelated note, it looks like the cla-required tag wasn't removed even though I signed the agreement (as visible in dotnet/corefx#22852)

@dnfclas
Copy link

dnfclas commented Aug 6, 2017

@kookiz, thanks for signing the contribution license agreement. We will now validate the agreement and then the pull request.

Thanks, .NET Foundation Pull Request Bot

@kevingosse
Copy link
Author

Is there something else I need to do for this PR to be merged?

@jkotas
Copy link
Member

jkotas commented Aug 8, 2017

@dotnet-bot test Windows_NT x64 corefx_baseline

@stephentoub stephentoub merged commit c7a10a5 into dotnet:master Aug 8, 2017
dotnet-bot pushed a commit to dotnet/corert that referenced this pull request Aug 8, 2017
* Add StringBuilder Span-based APIs

* Factorized ThreadSafeCopy methods

The char[] based one was called only by CopyTo so it wasn't needed anymore. Replaced by a Span based version

* Fix possible overflow

Signed-off-by: dotnet-bot <[email protected]>
jkotas pushed a commit to dotnet/corert that referenced this pull request Aug 9, 2017
* Add StringBuilder Span-based APIs

* Factorized ThreadSafeCopy methods

The char[] based one was called only by CopyTo so it wasn't needed anymore. Replaced by a Span based version

* Fix possible overflow

Signed-off-by: dotnet-bot <[email protected]>
@kevingosse kevingosse deleted the stringbuilder_span2 branch August 11, 2017 20:32
@karelz karelz modified the milestone: 2.1.0 Aug 28, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants