-
Notifications
You must be signed in to change notification settings - Fork 292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SqlCommand.EnableOptimizedParameterBinding Feature #1041
Conversation
Hiya @Wraith2, can you make a package available like previously? |
@KristianWedberg You can download from the azure DevOps CI-BuildNPack pipeline here: https://dev.azure.com/sqlclientdrivers-ci/sqlclient/_build/results?buildId=29463&view=artifacts&pathAsName=false&type=publishedArtifacts |
Yup, with
So all good so far! |
Excellent. So, fancy helping write some tests? 😁 |
I've added some simple tests which examine the behaviour of parameters with this setting enabled and they surprised me a bit.
So it seems that on input there is still a name-value map generated but it's populated by position not name match and then the engine proceeds normally requiring names to exist once the query is parsed etc. On output there is no name-value map so you can't use named outputs at all. |
Out of curiosity... Is a new flag on SqlCommand really needed? Is there any reason you can't simply check if the command's parameters have their names set or not (i.e. ParameterName is null)? I'm raising this because I'm planning a similar change in Npgsql. |
Ideally I'd use a new flag in the CommandBehavior enum. That's a breaking change though and it's in /runtime and since I thought this was going to be SqlClient specific it didn't seem to be the right solution to the problem. If you could use the same value in npqsql then perhaps it's worth considering. I don't know if mysql or sqlite (or firebird?) would be able to or want to make use of it. It might be worth a wider discussion. It does have to be at DbCommand level. Even though the parameter values aren't sent with names the RPC request still contains a list of parameter names and types and those get checked. It's peculiar but I'm finding from the tests that the only thing this changes is that the parameters are matched up to names in the order provided, you can still use a named parameter multiple times and you still have to name each parameter correctly and provide values for each named parameter. |
Thanks @Wraith2, so you're saying users have to specify SqlParameter.ParameterName regardless of this option, right? That's peculiar indeed... The concept of positional parameter usually means that placeholders themselves are positional: either via numeric placeholders like in PostgreSQL ($1, $2), or with implicit positioning (?). I doubt a CommandBehavior would make sense across multiple providers here. |
If there is a way to support true placeholder positional style I don't know what it is. If there is a use for the positional option in other drivers the enum flag is a better approach and the limitations of how Sql Server implements it would be something to cover in the documentation or provider. edit:
To answer this. No, users aren't required to enter a parameter name by the SqlClient driver but if you don't then the server is going to reject the parameter with |
We could do this without the formal extension to CommandBehavior by simply oring in a new flag value
but that looks like a horrible hackjob and if I found that as an official answer I'd be suspicious about why the people who own the api felt the need to provide such a weird way to access it when they're capable of changing the api surface. It would need some internal plumbing to make sure validation of command behavior didn't trip up on it but i can handle that. |
... I agree this looks... not ideal :) There's also a risk of conflict in case CommandBehavior ever does get extended. Is there any real downside to introducing a flag on SqlCommand as this PR currently does? |
The only danger I can think of it is that people will see and use it without understanding it. |
If I understand the new functionality and limitations correctly, from the users' perspective this is not positional parameters. Rather it's an Input only (or InputAndReturn only?) optimization for >64 parameters. I think a different property name would reduce any incorrect usage, e.g. Thoughts? |
Not quite, consider this (passing) test: [Fact]
private static void PositionalParameters_ParametersAreUsedByPosition()
{
using (var connection = new SqlConnection(DataTestUtility.TCPConnectionString))
{
connection.Open();
using (var command = new SqlCommand("SELECT @Second, @First", connection))
{
command.UsePositionalParameters = true;
command.Parameters.AddWithValue("@First", 1);
command.Parameters.AddWithValue("@Second", 2);
using (SqlDataReader reader = command.ExecuteReader())
{
reader.Read();
int firstOutput = reader.GetInt32(0);
int secondOutput = reader.GetInt32(1);
Assert.Equal(1, secondOutput);
Assert.Equal(2, firstOutput);
}
}
}
} On input: Names must all be declared. Names can be reused without taking an additional value. Names and values are associated at first encounter of the name by taking the next value from the list of values. |
OK. How about And are Return parameters affected or not (i.e. should they be referenced in the name or not)? |
Capturing all the complexity in the name is virtually impossible unless we use an absurd name. We're going to have to rely on the docs to inform people what it changes and what it doesn't.
Return parameter names aren't really important because nothing in the driver cares about the name. They work as the did before because their names aren't relevant. There is only one return value it has no name and it has type int32, in the absence of explicitly returned value it has default value 0. |
Given what you said about Return values, I'd certainly vote for |
It also affects input parameters as I showed in that test so using the word Output with the attendant implication that it only affects output parameters seems misleading. Perhaps |
@Wraith2 I'm probably missing something, but in this example, would the code work any different if UsePositionalParameters isn't set? It just looks like regular named parameter code to me... |
[edit] |
Ok. It's fun. The server side seems to be using the rpc parameter order to match values to names so as long as the build of the parameter list parameter isn't screwed up in some way then the order will be preserved.
So the server is matching up the index of the parameter name in the parameterList ( index1 in the rpc) text with the following user params and because they're in order it works. If you found a way to make the order difference then it'd fail but since we build and send in a single operation it'll always work if rpc are built this way. In which case calling it |
Tagging @saurabh500 @David-Engel for suggestions too! |
We don't really expose users to the word RPC so I'm not a fan of using that in the name. What would you expect the behaviour to be if the name were |
From my understanding this flag is essentially changing what the RPC is sending to the server from: exec sp_executeSql N'select @','@ int',@=2 to the following: exec sp_executeSql N'select @','@ int',2 So perhaps |
I'll vote for clarity over terseness, seems like a reasonable name to me. |
Don't want to bikeshed too much, but an API name should generally express why a user would use it. If I understand correctly, the point of this flag would be to get better performance, whereas the current naming makes it sound like users would use it in order to DisableOutputParameters, which doesn't really make sense; disabling output parameters is a side-effect of the more efficient transfer technique, no? So my proposal would be to call this something more like UseOptimizedParameterTransfer or similar. If this is incompatible with output parameters, I'd detect that and throw NotSupportedException with a clear message to the user. |
Even documented I think |
cc @David-Engel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it. 👍
That's my first surface area addition to the library. Everything I've done before has been invisible to users. Woo! |
@Wraith2 Congrats! And my first time naming one! Do you have a benchmarking suite, would be interesting to test 1.0 against 3.0? |
I've been relying on the benchmarks that @KristianWedberg posted in the original issue and has kindly been running on the interim builds to verify the changes https://github.com/KristianWedberg/sql-batch-insert-performance |
@Wraith2 I meant general benchmarks that could highlight the general performance improvements made over time, not related to this PR. |
Ah, no. The closest i've got to that is the https://github.com/aspnet/DataAccessPerformance benchmark which recreates the TE Fortunes benchmark. Everything else has been highly situational and micro benched. In general my focus has been memory usage on common paths like opening connections, executing a reader, reading valuetypes etc. |
@ErikEJ looks like 4.0.0 has been released so your new api is public, let loose the blog post! ;) |
Yay! |
@KristianWedberg i also had some followup thoughts on this. Do you re-use SqlParameter classes a lot when inserting data? And if so does the object boxing overhead of valuetypes show up in profiles? |
It'd be nice if there was a way to avoid it, but I doubt it matters too
much compared to IO overheads. I regularly use SqlBulkCopy with large
number of wide rows of mostly doubles and the only thing that tends to show
up on my profiling is `WriteToServerAsync()`
…On Fri, Nov 19, 2021 at 7:55 AM Wraith ***@***.***> wrote:
@KristianWedberg <https://github.com/KristianWedberg> i also had some
followup thoughts on this. Do you re-use SqlParameter classes a lot when
inserting data? And if so does the object boxing overhead of valuetypes
show up in profiles?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#1041 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABUB4J36TJLYA7WGX33UGLUMZCNVANCNFSM43DUEDRQ>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
|
I think there is a way that the boxing could be avoided but it's only profitable with with repeated uses of the same parameter. If you're seeing WriteToServerAsync you're probably running a cpu profile in which case you'd see boxing as GC time. |
@Wraith2 Yes, the SqlParameters get heavily reused. I pre-create a few batch insert statements (e.g. 1, 32, and 1024 rows) with associated SqlParameters, and generate a function to quickly and repeatedly set the parameters with new values for each new batch. As you noted during the investigation, value types will be boxed and generates GC pressure. This didn't affect the performance in any of my benchmarks (I also tested with pre-boxed value types), but if bottlenecked on client CPU cycles this could of course matter to some degree. The amount of boxing is simply one per value type inserted per inserted row, and I'm guessing all generation 0, so the impact should be easy to estimate. I could implement this pre-boxing also in my production code, but that's currently on a fairly low priority. How were you thinking of addressing this from your end? |
I've got an experimental branch which allows SqlParameter to detect and use |
Gotcha. In my scenario I'm guessing I would store the Bit of work, but certainly doable. There is however nothing SqlClient specific with this improvement - it would be really sweet if this was an optional API on |
Other providers wouldn't need it, most of them have a Being able to quietly understand a generic argument container like would be backwards compatible and perform better when opted into. There's a small amount of complexity around nulls but as long as it's properly documented it wouldn't be difficult for anyone who has used direct ADO in the past to deal with. |
Even so, why not design it so that it (optionally) can be reused across providers, even if it means they might have both their own API and the portable API? I might not be able to justify implementing this for a non-portable API, since I'm generating the parameter setting code with expression trees, and doing that differently for different providers could be more complexity than it's worth... |
That would need runtime implementation planning and work and then pushing out to all providers. So far that level of co-operation hasn't been possible. We can see what @roji thinks but I expect it would be a no-go for a general provider feature because as i said other providers can handle it better already. |
FWIW I'm not aware of another provider apart from Npgsql which has
Note that at least in the current Npgsql implementation,
Well, first of all, what would there actually be here to design across providers, in System.Data? SqlClient can simply support writing when SqlParameter.Value is set to Furthermore, if SqlClient does implement this, then it seems like a generic I guess what I'm saying, is that if SqlClient can support writing |
In technical terms there is some difference but it could be worked out. The SqlParameter implementation leans heavily on reflection to identify the type so a lot of internal logic would need to change if we no longer have the field. The hack i've done sort of sidesteps it but still needs the In practical terms getting the interface prototyped, approved, implemented in the BCL, implemented in SqlClient and then available to consumers is a multi-year effort and so far we've been unable to get The batching Api more than halfway through that process in SqlClient despite the other major providers having completed it. Hacks in this one library that don't affect the external interface are much "easier" to get implemented than trying to change api surface. Here it might only take a year. |
The batching API was delivered in .NET 6.0. I wouldn't estimate that a generic If you feel that eliminating parameter boxing is urgent/high-value enough, then you can indeed go ahead and add support for StrongBox. Given that there's an intention to solve this at the System.Data level, I personally don't see a reason to rush this - it's been this way since forever, it's an optimization that probably isn't terribly important in the SqlClient context (because of the other perf issues), and importantly, when the System.Data API is done you'd be stuck maintaining both the standard |
fixes #974
This PR adds a new property to SqlCommand called DisableOutputParameters. When set to true parameters names will not be sent to the sql server when the command is executed. This behaviour has been show to increase performance for commands with very large numbers of parameters which are used in bulk operations.
This is a draft because:
@KristianWedberg would you mind running your perf tests with the artifacts from this PR to ensure that this small change still has the performance changes observed in the investigation we did in your original issue?