Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Add String.Join overloads accepting a char separator #7942

Merged
merged 12 commits into from
Nov 23, 2016

Conversation

jamesqo
Copy link

@jamesqo jamesqo commented Nov 1, 2016

This PR adds string.Join overloads that take a char separator, which has been approved by dotnet/corefx#5552. It supplants #7621.

Please note that while I've verified the code I've wrote compiles, it's completely untested since I don't have a way to run tests on it. That will be done when the api is exposed in corefx.

Another note: I didn't add a Join(char, IEnumerable<string>) overload after seeing #7621 (comment) which referred to @stephentoub's comment at #2945 (comment).

cc @jkotas @stephentoub @AlexGhiondea @weshaggard @bbowyersmyth @ghost @hughbe

@@ -541,6 +541,184 @@ public String Insert(int startIndex, String value)
}
return result;
}

public static string Join(char separator, params string[] value)
Copy link
Author

@jamesqo jamesqo Nov 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the parameter names the same as the string-based overloads, so this is value as opposed to values.

throw new ArgumentNullException("values");
}

if (values.Length == 0)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I avoided duplicating the Join(string, object[]) bug where we return "" if the first item is null here.

Copy link
Author

@jamesqo jamesqo Nov 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinvp Unfortunately, this means that it's not possible to share code with the string overload here for now. Ongoing discussion @ https://github.com/dotnet/coreclr/issues/4506#issuecomment-258033970

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can still share the code - just need a bool argument on what the behavior in this case should be.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas, that seems kind of ugly to me. I think it would be better to simply leave things as-is for now, and when fixing the bug is approved then share the code.

string currentValue = value[i];
if (currentValue != null)
{
checked
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used checked inside the loop, instead of checking for overflow at the end. 3 2B sized strings for example could overflow enough that the total length goes back to a positive value, so the check seems fairly pointless.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will throw OverflowException. Other Join overloads throw OutOfMemory exception in this case...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I can change it to do a totalLength < 0 check instead then.


for (int i = startIndex, end = startIndex + count; i < end; i++)
{
// It's possible that another thread may have mutated the input array
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I happened to see @stephentoub's PR a while ago at #4559, so I mimicked his logic here.

if (i < end - 1)
{
// Fill in the separator.
fixed (char* pResult = &result.m_firstChar)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed may/may not have better performance outside the loop, but I think putting it here is best for readability.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't a simple result[copiedLength] = separator be faster?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas result is a string, so using a set indexer won't work here.

// something changed concurrently to mutate the input array: fall back to
// doing the concatenation again, but this time with a defensive copy. This
// fall back should be extremely rare.
return copiedLength == totalLength ? result : Concat((string[])value.Clone());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic copied from @stephentoub's PR.

@hughbe
Copy link

hughbe commented Nov 1, 2016

Nice! In the issue over in corefx I said I would add some tests and expose the implementation - I don't actually have time to do this anytime soon, so if you want you can take over

{
if (value == null)
{
throw new ArgumentNullException("value");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: nameof (nameof was just deployed throughout mscorlib). Applies throughout.

@justinvp
Copy link

justinvp commented Nov 2, 2016

Any thoughts on sharing code with the string separator methods? #7621 (comment)

@jamesqo
Copy link
Author

jamesqo commented Nov 2, 2016

@hughbe Sure, I would be willing to add tests / expose the impl in corefx.

@jamesqo
Copy link
Author

jamesqo commented Nov 2, 2016

@justinvp Great idea! Esp. considering one of the Join overloads is generic it should help shave down on some jitted code bloat if an app uses both overloads with the same type.

Contract.EndContractBlock();

using (IEnumerator<T> en = values.GetEnumerator())
fixed (char* pSeparator = &separator.m_firstChar)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if separator is null?

Copy link
Author

@jamesqo jamesqo Nov 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinvp Then pSeparator will be initialized to null. This is technically implementation-defined behavior of Roslyn, but it looks like they're planning to update the spec sometime.

On hindsight, the real issue here looks like separator.Length will nullref if it is null (edit: I just realized, m_firstChar as well)... I'll just add a separator = separator ?? string.Empty beforehand. Thanks for pointing this out.

using System.Diagnostics;
using System.Diagnostics.Contracts;

unsafe internal struct UnSafeCharBuffer{
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is no longer being used after the Join char/string impls were unified, so I deleted it.

@jkotas
Copy link
Member

jkotas commented Nov 3, 2016

Explicit check for overflow may be a bit faster, like:

long totalSeparatorsLength = (long)(count - 1) * separatorLength;
if (totalSeparatorsLength > int.MaxValue)
   throw new OutOfMemoryException(); 
int totalLength = (int)separatorsLength;

@jamesqo
Copy link
Author

jamesqo commented Nov 3, 2016

@jkotas Nice idea. Updated the PR to match.

separator = separator ?? string.Empty;
fixed (char* pSeparator = &separator.m_firstChar)
{
// Defer argument validataion to the internal function
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: validataion -> validation

return Join(separator, value, 0, value.Length);
}

public static string Join(char separator, params object[] values)
Copy link

@justinvp justinvp Nov 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I added the new Split overloads, I made them all [ComVisible(false)]. I think we should do the same for these new Join overloads.

Copy link
Author

@jamesqo jamesqo Nov 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok edit: See @danmosemsft's comment

}

[ComVisible(false)]
public static String Join(String separator, IEnumerable<String> values) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you changed String -> string above, maybe go ahead and do the same here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why these overloads are diffing since I didn't change them... but since they are, I guess I might as well format them.

if (values == null)
throw new ArgumentNullException(nameof(values));
Contract.Ensures(Contract.Result<String>() != null);
Contract.EndContractBlock();

using(IEnumerator<String> en = values.GetEnumerator()) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: space between using and (

return StringBuilderCache.GetStringAndRelease(result);
}

public unsafe static string Join<T>(char separator, IEnumerable<T> values)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the public unsafe methods be annotated [SecuritySafeCritical] and private JoinCore annotated [SecurityCritical]?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand @jkotas comment https://github.com/dotnet/corefx/issues/12592#issuecomment-253971384 they are not needed in this assembly as we don't build mscorlib for desktop and the mirror is switched off. @jkotas can confirm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will not add unless someone says otherwise.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand @jkotas comment dotnet/corefx#12592 (comment) they are not needed in this assembly as we don't build mscorlib for desktop and the mirror is switched off.

I realize that's the case for corefx. I asked because I'm not sure about coreclr given it appears the security annotations are still being maintained here (e.g. #7476). @jkotas should confirm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to maintain these anymore.


// Joins an array of strings together as one string with a separator between each original string.
//
[System.Security.SecuritySafeCritical] // auto-generated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove the // auto-generated part of the comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexGhiondea There appear to be lots of those comments throughout the repo; if such a change is made it should probably be done in bulk.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesqo that would work. I was just thinking that since you are adding this method you can clean-up the comment. I am fine with it as-is :).

@jamesqo
Copy link
Author

jamesqo commented Nov 4, 2016

@jkotas, I think this is ready to be merged.

@jkotas
Copy link
Member

jkotas commented Nov 6, 2016

@AlexGhiondea Could you please take care of this?

@jamesqo
Copy link
Author

jamesqo commented Nov 16, 2016

Looks like the string.Join(string,object[]) bug is no more, since #8114 has been merged. In a few days (after the corefx tests for that PR have been merged), I will update this PR again and introduce code sharing between the (char,object[]) & (string,object[]) overloads.

@AlexGhiondea
Copy link

@jamesqo could you rebase and update the PR so that we can merge this?
Thanks for working on it!

@jamesqo jamesqo changed the title Add String.Join overloads accepting a char separator [no merge] Add String.Join overloads accepting a char separator Nov 21, 2016
@jamesqo jamesqo changed the title [no merge] Add String.Join overloads accepting a char separator Add String.Join overloads accepting a char separator Nov 21, 2016
@jamesqo
Copy link
Author

jamesqo commented Nov 21, 2016

@AlexGhiondea Sure. This is ready for merge now, I believe.

Copy link

@AlexGhiondea AlexGhiondea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking care of this!

@danmoseley
Copy link
Member

@stephentoub apparently has a couple comments b efore we merge this.

@jamesqo
Copy link
Author

jamesqo commented Nov 22, 2016

@danmosemsft @stephentoub What are they?

if (totalLength < 0) // Check for overflow
{
throw new OutOfMemoryException();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the original author went out of his or her way to keep this check out of this loop. I assume this doesn't make a measurable impact, but have you checked?

Copy link
Author

@jamesqo jamesqo Nov 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub The original version was incorrect in the first place. jointLength < 0 does not tell us whether anything overflowed b/c the int could have wrapped around to negative and passed 0 again, since we're dealing with a variable amount of strings. jointLength + 1 < 0 is also pointless.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could check for overflow by doing something like this:

'If (totalLength + currentValue.Length < totalLength)'

Copy link
Author

@jamesqo jamesqo Nov 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexGhiondea That would have to be in the loop as well; there is no difference from the version that is currently checked in.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesqo yes, it should still be in the loop but that check would catch overflows even when the value overflows enough to come back into the positive number range.

Copy link
Author

@jamesqo jamesqo Nov 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexGhiondea, a single int + int cannot overflow that much. Worst case scenario, totalLength and currentValue.Length are both int.MaxValue. Their sum is only -2. It's only when we multiply we run into problems.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesqo ah yes -- I was thrown off by your comment above about the previous version of the code 😄

@AlexGhiondea AlexGhiondea merged commit 3f59c07 into dotnet:master Nov 23, 2016
@jamesqo jamesqo deleted the new-string-apis branch November 23, 2016 23:41
sergign60 pushed a commit to sergign60/coreclr that referenced this pull request Dec 2, 2016
* Add String.Join overloads that accept char separators
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…7942)

* Add String.Join overloads that accept char separators

Commit migrated from dotnet/coreclr@3f59c07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants