-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Add String.Join overloads accepting a char separator #7942
Conversation
@@ -541,6 +541,184 @@ public String Insert(int startIndex, String value) | |||
} | |||
return result; | |||
} | |||
|
|||
public static string Join(char separator, params string[] value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept the parameter names the same as the string-based overloads, so this is value
as opposed to values
.
throw new ArgumentNullException("values"); | ||
} | ||
|
||
if (values.Length == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I avoided duplicating the Join(string, object[])
bug where we return "" if the first item is null here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justinvp Unfortunately, this means that it's not possible to share code with the string overload here for now. Ongoing discussion @ https://github.com/dotnet/coreclr/issues/4506#issuecomment-258033970
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can still share the code - just need a bool argument on what the behavior in this case should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas, that seems kind of ugly to me. I think it would be better to simply leave things as-is for now, and when fixing the bug is approved then share the code.
string currentValue = value[i]; | ||
if (currentValue != null) | ||
{ | ||
checked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used checked
inside the loop, instead of checking for overflow at the end. 3 2B sized strings for example could overflow enough that the total length goes back to a positive value, so the check seems fairly pointless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will throw OverflowException. Other Join overloads throw OutOfMemory exception in this case...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I can change it to do a totalLength < 0
check instead then.
|
||
for (int i = startIndex, end = startIndex + count; i < end; i++) | ||
{ | ||
// It's possible that another thread may have mutated the input array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I happened to see @stephentoub's PR a while ago at #4559, so I mimicked his logic here.
if (i < end - 1) | ||
{ | ||
// Fill in the separator. | ||
fixed (char* pResult = &result.m_firstChar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
may/may not have better performance outside the loop, but I think putting it here is best for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a simple result[copiedLength] = separator
be faster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas result
is a string, so using a set indexer won't work here.
// something changed concurrently to mutate the input array: fall back to | ||
// doing the concatenation again, but this time with a defensive copy. This | ||
// fall back should be extremely rare. | ||
return copiedLength == totalLength ? result : Concat((string[])value.Clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic copied from @stephentoub's PR.
Nice! In the issue over in corefx I said I would add some tests and expose the implementation - I don't actually have time to do this anytime soon, so if you want you can take over |
{ | ||
if (value == null) | ||
{ | ||
throw new ArgumentNullException("value"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: nameof
(nameof was just deployed throughout mscorlib). Applies throughout.
Any thoughts on sharing code with the |
@hughbe Sure, I would be willing to add tests / expose the impl in corefx. |
@justinvp Great idea! Esp. considering one of the |
Contract.EndContractBlock(); | ||
|
||
using (IEnumerator<T> en = values.GetEnumerator()) | ||
fixed (char* pSeparator = &separator.m_firstChar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if separator
is null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justinvp Then pSeparator
will be initialized to null. This is technically implementation-defined behavior of Roslyn, but it looks like they're planning to update the spec sometime.
On hindsight, the real issue here looks like separator.Length
will nullref if it is null (edit: I just realized, m_firstChar
as well)... I'll just add a separator = separator ?? string.Empty
beforehand. Thanks for pointing this out.
using System.Diagnostics; | ||
using System.Diagnostics.Contracts; | ||
|
||
unsafe internal struct UnSafeCharBuffer{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is no longer being used after the Join
char/string impls were unified, so I deleted it.
f649e89
to
1abc87f
Compare
Explicit check for overflow may be a bit faster, like:
|
@jkotas Nice idea. Updated the PR to match. |
separator = separator ?? string.Empty; | ||
fixed (char* pSeparator = &separator.m_firstChar) | ||
{ | ||
// Defer argument validataion to the internal function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: validataion -> validation
return Join(separator, value, 0, value.Length); | ||
} | ||
|
||
public static string Join(char separator, params object[] values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I added the new Split
overloads, I made them all [ComVisible(false)]
. I think we should do the same for these new Join
overloads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok edit: See @danmosemsft's comment
} | ||
|
||
[ComVisible(false)] | ||
public static String Join(String separator, IEnumerable<String> values) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you changed String
-> string
above, maybe go ahead and do the same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why these overloads are diffing since I didn't change them... but since they are, I guess I might as well format them.
if (values == null) | ||
throw new ArgumentNullException(nameof(values)); | ||
Contract.Ensures(Contract.Result<String>() != null); | ||
Contract.EndContractBlock(); | ||
|
||
using(IEnumerator<String> en = values.GetEnumerator()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: space between using
and (
return StringBuilderCache.GetStringAndRelease(result); | ||
} | ||
|
||
public unsafe static string Join<T>(char separator, IEnumerable<T> values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the public unsafe methods be annotated [SecuritySafeCritical]
and private JoinCore
annotated [SecurityCritical]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand @jkotas comment https://github.com/dotnet/corefx/issues/12592#issuecomment-253971384 they are not needed in this assembly as we don't build mscorlib for desktop and the mirror is switched off. @jkotas can confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same I think for [ComVisible]. https://github.com/dotnet/corefx/issues/12592#issuecomment-253972687
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will not add unless someone says otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand @jkotas comment dotnet/corefx#12592 (comment) they are not needed in this assembly as we don't build mscorlib for desktop and the mirror is switched off.
I realize that's the case for corefx. I asked because I'm not sure about coreclr given it appears the security annotations are still being maintained here (e.g. #7476). @jkotas should confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not necessary to maintain these anymore.
|
||
// Joins an array of strings together as one string with a separator between each original string. | ||
// | ||
[System.Security.SecuritySafeCritical] // auto-generated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove the // auto-generated part of the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexGhiondea There appear to be lots of those comments throughout the repo; if such a change is made it should probably be done in bulk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jamesqo that would work. I was just thinking that since you are adding this method you can clean-up the comment. I am fine with it as-is :).
@jkotas, I think this is ready to be merged. |
@AlexGhiondea Could you please take care of this? |
Looks like the |
@jamesqo could you rebase and update the PR so that we can merge this? |
8f794cc
to
c60c141
Compare
@AlexGhiondea Sure. This is ready for merge now, I believe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking care of this!
@stephentoub apparently has a couple comments b efore we merge this. |
@danmosemsft @stephentoub What are they? |
if (totalLength < 0) // Check for overflow | ||
{ | ||
throw new OutOfMemoryException(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the original author went out of his or her way to keep this check out of this loop. I assume this doesn't make a measurable impact, but have you checked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub The original version was incorrect in the first place. jointLength < 0
does not tell us whether anything overflowed b/c the int could have wrapped around to negative and passed 0 again, since we're dealing with a variable amount of strings. jointLength + 1 < 0
is also pointless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could check for overflow by doing something like this:
'If (totalLength + currentValue.Length < totalLength)'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexGhiondea That would have to be in the loop as well; there is no difference from the version that is currently checked in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jamesqo yes, it should still be in the loop but that check would catch overflows even when the value overflows enough to come back into the positive number range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexGhiondea, a single int + int cannot overflow that much. Worst case scenario, totalLength
and currentValue.Length
are both int.MaxValue. Their sum is only -2. It's only when we multiply we run into problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jamesqo ah yes -- I was thrown off by your comment above about the previous version of the code 😄
* Add String.Join overloads that accept char separators
…7942) * Add String.Join overloads that accept char separators Commit migrated from dotnet/coreclr@3f59c07
This PR adds
string.Join
overloads that take a char separator, which has been approved by dotnet/corefx#5552. It supplants #7621.Please note that while I've verified the code I've wrote compiles, it's completely untested since I don't have a way to run tests on it. That will be done when the api is exposed in corefx.
Another note: I didn't add a
Join(char, IEnumerable<string>)
overload after seeing #7621 (comment) which referred to @stephentoub's comment at #2945 (comment).cc @jkotas @stephentoub @AlexGhiondea @weshaggard @bbowyersmyth @ghost @hughbe