Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Fixing Span IndexOf vectorization. #17500

Merged
merged 3 commits into from
Mar 25, 2017
Merged

Conversation

ahsonkhan
Copy link

@ahsonkhan ahsonkhan commented Mar 24, 2017

cc @benaadams

Please take a look. When trying to reason about different test scenarios with @shiftylogic, these changes make more sense. Let me know if you notice any performance pitfalls or correctness issues.

Also cc @jkotas

@@ -61,7 +61,7 @@ public static unsafe int IndexOf(ref byte searchSpace, byte value, int length)
unchecked
{
int unaligned = (int)(byte*)Unsafe.AsPointer(ref searchSpace) & (Vector<byte>.Count - 1);
nLength = (IntPtr)(uint)unaligned;
nLength = (IntPtr)(uint)((Vector<byte>.Count - unaligned) % Vector<byte>.Count);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the idiv get elided out in the asm? (otherwise its & the jit const (Vector<byte>.Count - 1) on the line above)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the idiv get elided out in the asm?

What is idiv?

Copy link
Member

@benaadams benaadams Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

integer divide; is the modulus (%) and can take up to 22 cpu cycles (vs bitwise and which is 1 cycle)

Copy link
Member

@benaadams benaadams Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just conscious that its being compared against the byte by byte compare that's a lot of byte compares

Copy link
Author

@ahsonkhan ahsonkhan Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, change:

nLength = (IntPtr)(uint)((Vector<byte>.Count - unaligned) % Vector<byte>.Count); 

To:

nLength = (IntPtr)(uint)((Vector<byte>.Count - unaligned) & (Vector<byte>.Count - 1)); 

Is that correct?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to store Vector.Count in a local?

Copy link
Member

@benaadams benaadams Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, when IsHardwareAccelerated == true it the jit directly replaces it with a const

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes on the & change as Vector<byte>.Count is always a power of 2 so -1 will all the low bits and will be the same as the %

@@ -122,15 +122,16 @@ public static unsafe int IndexOf(ref byte searchSpace, byte value, int length)
{
goto NotFound;
}
nLength = (IntPtr)(uint)(length - Vector<byte>.Count);
nLength = (IntPtr)(uint)(length - (uint)index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be:
nLength = (IntPtr)(uint)((length - (uint)index) & ~(Vector<byte>.Count - 1));

Then you don't need the second calc in the loop? And can compare against index?

{
var vMatches = Vector.Equals(vComparision, Unsafe.ReadUnaligned<Vector<byte>>(ref Unsafe.AddByteOffset(ref searchSpace, index)));
if (Vector<byte>.Zero.Equals(vMatches))
{
index += Vector<byte>.Count;
nLength -= Vector<byte>.Count;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to drop this?

@ahsonkhan
Copy link
Author

No, when IsHardwareAccelerated == true it the jit directly replaces it with a const

What about Vector<byte>.Count - 1?

@benaadams
Copy link
Member

benaadams commented Mar 25, 2017

What about Vector.Count - 1?

Should be const - const => const

@ahsonkhan
Copy link
Author

@benaadams, do the changes make sense now (in line with what you had intended)?

@@ -61,7 +61,7 @@ public static unsafe int IndexOf(ref byte searchSpace, byte value, int length)
unchecked
{
int unaligned = (int)(byte*)Unsafe.AsPointer(ref searchSpace) & (Vector<byte>.Count - 1);
nLength = (IntPtr)(uint)unaligned;
nLength = (IntPtr)(uint)((Vector<byte>.Count - unaligned) & (Vector<byte>.Count - 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, unaligned will always be < Vector<byte>.Count so (from mask on line above) the & part isn't needed?

Copy link
Author

@ahsonkhan ahsonkhan Mar 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if unaligned == 0?
If we don't do the &, nLength would equal Vector<byte>.Count rather than 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True

@benaadams
Copy link
Member

LGTM; thank you for fixing this

@ahsonkhan
Copy link
Author

@dotnet-bot test Innerloop OSX10.12 Release x64 Build and Test

@ahsonkhan ahsonkhan merged commit 49a6fc9 into dotnet:master Mar 25, 2017
@ahsonkhan ahsonkhan deleted the IndexOfVectorImpl branch March 25, 2017 02:16
@karelz karelz modified the milestone: 2.0.0 Mar 25, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants