-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Consolidate much of the duplicated code across Encoding implementations #6114
Conversation
// All of our public Encodings that don't use EncodingNLS must have this (including EncodingNLS) | ||
// So if you fix this, fix the others. Currently those include: | ||
// EncodingNLS, UTF7Encoding, UTF8Encoding, UTF32Encoding, ASCIIEncoding, UnicodeEncoding | ||
// parent method is safe | ||
|
||
[System.Security.SecuritySafeCritical] // auto-generated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SecuritySafeCritical
annotations should be removed here as well. Instead, they should be added to the shared impl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas I'm assuming this applies to SecurityCritical
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SecurityCritical
should stay here. But SecurityCritical
should be also added on the new methods that take pointers.
Could you please run the applicable tests from corefx repo on this, if you have not done it already? |
@jkotas I have run the tests beforehand. There were a few failures, specifically those testing what parameter name was used in the exceptions thrown from certain
return EncodingForwarder.GetString(this, bytes, byteIndex, byteCount, "byteIndex", "byteCount");
edit: Closed the CoreFX pull request, see the comments there. Will fix test failures. |
{ | ||
// Validate parameters | ||
|
||
Contract.Assert(encoding != null); // this parameter should only be affected internally, so just do a debug check here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering whether it would look better if these forwarders were internal instance methods on the Encoding
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas IMHO that class is big enough already (2000+ lines of code), plus we'd have to name the methods something else (e.g. DefaultGetByteCount
, NLSGetByteCount
) since the overloads would conflict with the virtual overloads that were already there. I don't think taking it as a parameter and doing a null check is too big of a deal (that's what all extension methods have to do anyway, basically).
Contract.Assert(encoding != null); | ||
if (s == null) | ||
{ | ||
string paramName = encoding is ASCIIEncoding ? "chars" : "s"; // ASCIIEncoding calls the string chars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to pass the param name to this method instead? this can give flexibility for anyone in the future calling this method will pass the right info. I think we can do that for all methods throw similar exception here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be slowing down the main non-exceptional path...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. thanks @jamesqo |
@tarekgh Would you like to merge this? There is a chance that this change may need follow up on the TFS side ... I am not on corpnet these days, and so I won't be able to take care of it. |
@jkotas I'll merge this one today. |
@jkotas I have submitted the following change in TFS side.
|
@jamesqo Thanks for doing this! @tarekgh Looks like this is already merged, but in the future, please use the option to squash when merging PRs like this which are broken down into many (in this case 22) small incremental commits all working towards the same goal. It helps keep the commit history clean. |
@adityamandaleeka I'll do that. sorry for this one. |
Some time ago, we have ported the UTF-8 encoder to C++ (used in PAL) in a way so that the source code of both the managed and native versions match except for the necessary syntax differences between C++ and C#. This change breaks the match, so we should update the native version to restore it. |
This change have not touched the worker method that we have ported to C++. It just touched the wrappers over the worker method. The wrappers were not ported to C++ because of they are very C# specific. I do not see anything that needs updating. Is there anything particular you have in mind? |
Ah, ok then. I have seen changes in the Utf8Encoding.cs, so I have just assumed it has touched the ported parts. |
Consolidate much of the duplicated code across Encoding implementations Commit migrated from dotnet/coreclr@00eca9a
(Note: This PR is best reviewed on a commit-by-commit basis.)
Many of the unsafe methods of
Encoding
which take pointers were introduced after the overloads which take arrays. So the pointer-based overloads had to be defined in terms of arrays, which means allocating a new array and copying all of the data (since there is no way to reinterpret a pointer as an array). This is undesirable for performance, so all the types in mscorlib which subclassEncoding
switch it around and define the array-based overloads in terms of pointers. (They also override the pointer-based ones, to call the overload withEncoderNLS
/DecoderNLS
.) To do this, 11 methods need to be overridden consistently in each class that subclassesEncoding
.EncodingNLS
subclassesEncoding
and overrides all of these methods for convenience, so any class that inherits from it doesn't have to worry about that. However, the public Encodings (ASCIIEncoding, UTF8Encoding, etc.) can't do that since it is an internal class and public classes can't derive from internal classes, so they all have to re-duplicate the logic.Originally, all of this code (which is +1,000 lines) was just copied-and-pasted around the 6 classes. I've changed it so that they call a static method in a new class,
EncodingForwarder
, passing all of their parameters andthis
to it. The methods inEncodingForwarder
do all the argument validation/checks and call the unsafe overload on theEncoding
.I've also updated/removed many of the comments in the classes that say "if you change something in here, change it in the other 5 classes as well..." since the implementation is now shared among all the classes, and added a note about
EncodingForwarder
to the top of each class to explain the whole thing.cc @tarekgh @jkotas @AlexGhiondea