Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

SqlClient reduce metadata size #34393

Merged
merged 8 commits into from
Feb 13, 2019
Merged

Conversation

Wraith2
Copy link
Contributor

@Wraith2 Wraith2 commented Jan 6, 2019

When using a DataReader any accessed data will cause an _SqlMetaData object to be allocated. As such these are one of the most commonly allocated types and when investigating I found that they they have some unused fields and some occasionally used fields which can be moved to on-demand allocated sub-objects.

These changes touch the _SqlMetaData and it's base class SqlMetaDataPriv. Multiple boolean fields are packed into flags ints with accessor properties replacing the publicly available bools and callers fixed up to use them (case changes from isThing to IsThing) the udt and xml schema properties are moved into new sub-objects which are allocated when needed, There are 3 structural* string fields which are never used and have been removed.

Standard and manual tests passed in native mode, this is above the network layer so managed should not make a difference.

On an x86 system this changes the size of _SqlMetaData from 224 bytes to 144 bytes, this will be better on x64 due to reference size. Perf measurements rom BDN:

BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17763.195 (1809/October2018Update/Redstone5)
Intel Core i5-7600K CPU 3.80GHz (Kaby Lake), 1 CPU, 4 logical and 4 physical cores
.NET Core SDK=3.0.100-preview-009812
  [Host] : .NET Core 3.0.0-preview-27122-01 (CoreCLR 4.6.27121.03, CoreFX 4.7.18.57103), 64bit RyuJIT  [AttachedDebugger]

Job=InProcess  Toolchain=InProcessToolchain  InvocationCount=1  
UnrollFactor=1  
Method Mean Error StdDev Median Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
ReadOrderDetailsBefore 81.23 ms 1.982 ms 5.812 ms 79.09 ms - - - 312.52 KB
ReadOrderDetailsAfter 74.78 ms 1.457 ms 1.619 ms - - - - 273.46 KB

/cc @saurabh500 @afsanehr @keeratsingh

@@ -280,20 +280,32 @@ sealed internal class SqlLoginAck

sealed internal class _SqlMetaData : SqlMetaDataPriv
{
[Flags]
private enum _SqlMetadataFlags : int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main perf improvement is because you packed the many different booleans into a single flag, is that right?
7 booleans are now packed into a Single Int causing an improvement in allocations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For _SqlMetaData yes. sharplab.io showed that they were represented as a byte each so we get back an int which isn't great but the number of allocations makes it more significant than you'd think.

The base type changes remove some reference sized fields and make others optional inside child classes so there's net gain there as well.

Copy link
Contributor

@saurabh500 saurabh500 Jan 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not change this to byte instead of using an int? The flag won't need more than 255 values.

Suggested change
private enum _SqlMetadataFlags : int
private enum _SqlMetadataFlags : byte

Also the size will reduce from 4 to 1 byte.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because if it's a byte every use of it will have to sign extend up to an int when it's masked which costs cycles. in terms of memory it won't save any space because there's no free space, it'll just be a byte in a 32 bit aligned bit of space. I did look at doing it but there was no upside.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because if it's a byte every use of it will have to sign extend up to an int when it's masked which costs cycles.

I don't follow this explanation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The layout of the fields from the base class and main class mean that the runtime layout of 5 bools is 5 bytes in length, if you change that to be bit fields in a byte you take off 4 bytes. The structure in memory is going to be 32 bit aligned though so 5 bytes or 1 means you end up with something smaller but every time you access that last byte the runtime is going to translate it in the runtime to an int to work on it.

Looking at that I thought that there was no point using a byte when it'll consume the same space as an int and make the runtime do extra work to deal with it. I'm not that attached to it so if you prefer a byte that's fine I just think the hidden cost isn't worth it at the moment and if we later need more fields we can change it since it's internal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at this: https://gist.github.com/Wraith2/2cea2a23cb26ff082a7cdb42a82d048a on sharplab.io in release and run mode.

If you scroll to the right hand side you can see that tableNum op operand are packed into an int. flags is put into the next int and there are 3 empty bytes at the end because the runtime wants to align things at 32 bit binaries. If we type the enum as byte it has to unpack it from that int at the end every time it wants to load or store it, if it's typed as a byte it just loads/stores with no extra steps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it . Thanks for explaining. For some reason I am not receiving emails of your comments and updates on this PR and for that matter, I am not getting any emails on being tagged either. Hence the delay.

At this point the changes look good to me. However I am not sure if the metadata changes have a test. Did you get a chance to confirm this? IIRC the _SqlMetadata should be surfaced in the SqlDataReader.GetSchemaTable() API

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no tests which specifically verify the individual fields on the metadata object, however the metadata is fetched and verified for most of them at some point during the testing. It might be something that I could add in another PR sometime (yay, test dev....)

The only item I had any concern about was the readability flags and I confirmed the logic for that in a separate project..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only item I had any concern about was the readability flags and I confirmed the logic for that in a separate project..

This was the reason I had asked the question as well :) Works for me if you have verified it.
Put together a test PR for this area to improve the tests. That would be great.

@saurabh500
Copy link
Contributor

@dotnet-bot Test Linux arm64 Release Build
Test OSX x64 Debug Build
Test corefx-ci

@saurabh500
Copy link
Contributor

@dotnet-bot Test Linux arm64 Release Build

@Wraith2
Copy link
Contributor Author

Wraith2 commented Jan 26, 2019

@dotnet-bot test all

@AfsanehR-zz
Copy link
Contributor

AfsanehR-zz commented Jan 29, 2019

@dotnet-bot test Linux arm64 Release Build
@dotnet-bot test corefx-ci

@Wraith2
Copy link
Contributor Author

Wraith2 commented Jan 29, 2019

The Linux build just doesn't seem to work, it won't seem to start, might need some engineering help

@saurabh500
Copy link
Contributor

@danmosemsft What is the resolution for the failing CI legs?

@danmoseley
Copy link
Member

@safern ?

@safern
Copy link
Member

safern commented Jan 29, 2019

Linux arm64 Release Build leg was removed in favor of newly added corefx-ci, that is the reason why it doesn't start anymore. The reason why it shows is because you need to rebase on top of master and/or close and reopen the PR, but not worth doing it.

corefx-ci is failing because the fixes where made a few days ago, and so you would need to rebase. Since the Jenkins UWP CoreCLR build passed and the failures I'm seeing in corefx-ci are a known and fixed issue, you can feel free to ignore them in this PR.

@AfsanehR-zz
Copy link
Contributor

@Wraith2 Could you get the latest change from master so the CI pass? We would like to merge these prs, just pending the check to be green. Thanks!

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 6, 2019

Done.

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 6, 2019

Jenkins seems to be having a bad day, I've seen that @safern is fighting with it in another commit so i'll leave it alone at the moment and re-trigger when it's settled.

@afsanehr I'm seeing manual test failures but nowhere near my changes. They're all coming from Udt tests and I think the cause may be the change in floating point precision in the 3.0 preview2 runtime, for example:

        Unexpected distance value.
  Expected: 141.42135623731 (System.String)
  Actual: 141.4213562373095 (System.String)
        Expected: True
        Actual:   False
        Stack Trace:
          E:\Programming\csharp7\corefx\src\System.Data.SqlClient\tests\ManualTests\DataCommon\DataTestUtility.cs(155,0): at System.Data.SqlClient.ManualTesting.Tests.DataTestUtility.AssertEqualsWithDescription(Object expectedValue, Object actualValue, String failMessage)
          E:\Programming\csharp7\corefx\src\System.Data.SqlClient\tests\ManualTests\SQL\UdtTest\UdtTest2.cs(266,0): at System.Data.SqlClient.ManualTesting.Tests.UdtTest2.UDTParams_InputOutput()

If you can replicate and agree it's probably worth opening another PR to review those tests and update the expected values to the new precision.

@safern
Copy link
Member

safern commented Feb 6, 2019

cc: @tannergooding for the decimal changes.

I think the as long as the jenkins issues are infrastructure related and corefx-ci is green it is safe to merge. Why? Because corefx-ci covers all the build legs that we do in Jenkins. Jenkins should be gone soon, whenever azure pipelines trigger comments are fixed 😄

@tannergooding
Copy link
Member

cc: @tannergooding for the decimal changes.

These look fine, the test (see https://github.com/dotnet/corefx/blob/master/src/System.Data.SqlClient/tests/ManualTests/SQL/UdtTest/UdtTest2.cs#L266) is just validating the result of Point.Distance().ToString(). The formatting included a fix such that double.ToString() would now return the shortest round-trippable string. The previous result being returned was bits: 0x4061AD7BC01366C9 and it is now returning bits: 0x4061AD7BC01366B8, so the old result was off from the actual result by 17 ULP (units in last place).

If, for any reason, the old behavior is desirable, you can achieve it by explicitly doing ToString("G15") (although I imagine returning the round-trippable string is the more desired result).

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 6, 2019

The tests are used again both CoreCLR and netfx since changes can move upstream i'm told. In light of that i think specifying a specific precision in all cases is best so the results are stable and we don't need special cases.

@tannergooding
Copy link
Member

@Wraith2, the preference (at least as far as I am aware) if you arent special casing the production code is actually to split this into NetFramework and NotNetFramework tests. Similarly to: #35016

I'll let the area owners pushing this through have the final say on what they find desirable, however.

@AfsanehR-zz
Copy link
Contributor

@Wraith2 Yes, good catch on noticing these new failures! I see those udtTests failing as well.
@tannergooding Thanks explaining the changes. Will send a pr to update those values.

@AfsanehR-zz
Copy link
Contributor

@dotnet-bot Test this

@AfsanehR-zz
Copy link
Contributor

@safern the corefx-ci is still red on some others prs like this one #34546

How do you retrigger these? I tried @dotnet-bot test corefx-ci doesn't help.

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 7, 2019

I think all the tests are supposed to underneath core-ci now, I have a look earlier though and I can't see any test output just build.

@safern
Copy link
Member

safern commented Feb 7, 2019

@safern the corefx-ci is still red on some others prs like this one #34546
How do you retrigger these? I tried @dotnet-bot test corefx-ci doesn't help.

Since corefx-ci is from the new system it needs to be retriggered with either through the build UI on the upper right corner, there are three dots (...) you will have two options, rebuild or retry. Rebuild will build the whole thing again, Retry, will build only the failing legs.

Also, you can retrigger it through a comment:
/AzurePipelines run corefx-ci -- however this comments are broken now for forked PRs. They should be fixed by tomorrow or monday according to azure pipelines.
https://github.com/dotnet/corefx/issues/35121

@AfsanehR-zz
Copy link
Contributor

@safern great. Thanks!

@AfsanehR-zz
Copy link
Contributor

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@safern
Copy link
Member

safern commented Feb 12, 2019

@afsanehr note that not specifying the build after "/azp run" will cause all builds to run including outerloop 😄

@AfsanehR-zz
Copy link
Contributor

@safern I see.. I'll wait for these to finish then. And as you said, once all the corefx-ci builds pass, we should be ok with merging.

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 13, 2019

The failures to do with sending to helix. @safern are there any docs you can point me at on the tech we're using in the build system so I can get a better understanding of how it all fits together?

@safern
Copy link
Member

safern commented Feb 13, 2019

The failures with send to helix are because there was a build failure:

C:\Users\VssAdministrator.nuget\packages\microsoft.dotnet.helix.sdk\2.0.0-beta.19112.3\tools\Microsoft.DotNet.Helix.Sdk.MultiQueue.targets(67,5): error : Job '0ec093e8-e9c6-4f2c-a733-daec56ff0b0d' failed 1 out of 585463 tests. [D:\a\1\s\eng\sendtohelix.proj]

These are the results:
https://mc.dot.net/#/user/dotnet-bot/pr~2Fdotnet~2Fcorefx~2Frefs~2Fpull~2F34393~2Fmerge/test~2Ffunctional~2Fcli~2F~2Fouterloop~2F/20190212.3

There aren't any docs yet, but good point, it might be worth adding a doc file explaining all the technology and what can contributors do to diagnose a failure, restart a build and all that new infrastructure related to PR builds. I've opened and self assigned: https://github.com/dotnet/corefx/issues/35288

@Wraith2
Copy link
Contributor Author

Wraith2 commented Feb 13, 2019

So the failures are linq and dns related, nothing in system.data, that's good.

A basic overview of how the moving parts fit together and some pointers on diagnosis would be good to have and save both you and me some time, thanks for opening the issue.

@AfsanehR-zz
Copy link
Contributor

@Wraith2 I can see the udt failures too. Created a PR for it.
This PR can be merged as the changes made here are not related to CI failures.

@AfsanehR-zz AfsanehR-zz merged commit f0f5141 into dotnet:master Feb 13, 2019
@Wraith2 Wraith2 deleted the sqlperf-metadata branch February 13, 2019 23:44
@karelz karelz added this to the 3.0 milestone Mar 18, 2019
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants