-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additional API for DbProviderFactories in .NET Core #22229
Comments
My advice would be: keep the overloads and not just 1 method as having just 1 method will be more cumbersome to use. In general having overloads gives a bit of a maintenance burden, however the class they're part of is pretty much complete in functionality, as in: I don't think this class will ever get more functionality so the overloads won't have a necessity to change over time with new features being added to this class. In that light, the overloads have as much maintenance overhead as a single method, so I think having them is not a big burden but has a positive impact in usability of the class :) |
@divega @saurabh500 @corivera can you please review it on Monday from API experts point of view (make a decision on 1 method vs. more), then add a note and mark it for API review (on Tue morning) by relabeling to api-ready-for-review? Thanks! |
@karelz I have setup something for this week. |
We discussed this with @saurabh500 @corivera @ajcvickers and the @dotnet/fxdc people and came up with the API we want: /// <summary>
/// Programmatically registers the configuration of a <see cref="DbProviderFactory" />.
/// </summary>
/// <param name="invariantName"> A unique name that can be used to refer to the data
/// provider. Required to not be null nor whitespace.
/// </param>
/// <param name="assemblyQualifiedName"> Assembly qualified name of the
/// <see cref="DbProviderFactory" /> class, which contains enough information to
/// instantiate the object. Required to not be null nor whitespace.
/// </param>
/// <param name="name"> Human readable name for the data provider. Can be null in which
/// case it is ignored.
/// </param>
/// <param name="description"> Human readable description of the data provider. Can be null
/// in which case it is ignored.
/// </param>
/// <remarks>
/// If a registration exists for the given <paramref name="invariantName" /> the call to
/// this method will override the configured values.
/// </remarks>
public static void RegisterFactory(
string invariantName,
string assemblyQualifiedName,
string name,
string description) Rationale
@ajcvickers, @saurabh500, @corivera, @terrajobst, @davkean, @FXDC please let me know if you see any error of misrepresentation from what we discussed or if any important details are underspecified. BTW, there is one tweak I took the liberty of making today: I shortened cc @FransBouma |
I'm disappointed in this API. The main reason is that as a person who directly has to explain developers of .NET applications how to use this API, I have to explain more than I should have: it's completely non-intuitive, and as there are no overloads proposed, they always have to call this method. Although you qualify it as a 'low level API', the end developer still has to call it. What I was after was a way to let them make as less mistakes as possible, with an API that's as easy to use as possible: no mistakes. Now, there's an API which actually has just 1 overload which only accepts a string for a typename. I understand why you'd want a string variant, but it's the only variant, so everyone has to call this method (and they have to call it always), so mistakes are easily made and only visible at runtime. People already make mistakes typing the provider invariant name, do you really think they make no mistake in typing a type name, oh sorry, assembly qualified name? I find this API so bad actually, that I refuse to implement it: every ORM will likely implement some extension methods which make this single method completely obsolete anyway. |
To explain things per point:
yes, but that's always the case. The BCL has a tremendous amount of ctors and methods which only call into other methods and are there just to make things easier. We could all get rid of these. But they're there for a reason: they make life easier for the situations when you don't have to pass additional arguments. This is the case here too. I specified with every method I proposed, why it was there. This is completely ignored, which baffles me. I'm in the ORM business now for 14 years, I work with users of ORMs every day and have done so for the past 14-15 years. I know what mistakes they make. It's not as if I don't know what I'm talking about when I say: this api has to be as simple as possible; mistakes shouldn't be possible here. But you came up with the variant which exactly allows that. The problem is that you don't have to explain to my users how to use this api, but I have to. ADO.NET provider writers have to. It was already a complex affair to explain to people how to register a factory in the web.config file when an ADO.NET provider didn't register itself in machine.config, which was the case with the PostgreSql and Firebird ADO.NET providers for some period of time. I had to specify the XML so they could copy/paste it, and even then they made mistakes because they got the version wrong in the strong name in the type specification. That's what I mean with make it as easy as possible. You now propose an API which basically requires a set of extension methods to be easy to use so most people will use it without a problem: users of this method aren't ORM / ADO.NET specialists, they write websites, they create services, and using an ORM/ADO.NET api is a burden and out of their comfort zone. Proposing an API which makes it hard to use it correctly won't help at all.
Yes, it's 'enough' to use the AssemblyQualifiedName, but it's not intuitive. Why not allow an overload which simply accepts a type? What's so utterly bad about that? I already described I don't expect any code changes in the implementation of this API going forward, so there's little to no chance things need refactoring soon, if at all.
The first two are important, but the 2nd argument (assemblyQualifiedName) is actually the only value which is required. You can infer invariantName, like the netfx auto-init code uses too (and which my implementation uses as well). About the 3rd and 4th argument:
I don't know but... passing null values for strings... Come on, why on earth would this be a properly designed API where it accepts nulls for strings, while the default value (so it's ignored) is "", not null. Sorry, but in the current state I won't associate my name to it, so I won't port my implementation in the PR dotnet/corefx#19885 to this API; I'll live on forever as the person who would have implemented it into the BCL and netstandard and I refuse to do that. // cc @karelz |
One small thing that bugs me as well is the following: why has the debate about this API been held in the secret and not right here in this issue? Isn't it better for everyone involved to have transparency and openness so we all know what's going on? Now no-one knows who said what for what reasons, only that 'this is the API' is the result, presented as a Law sent down from the Ivory Tower where the design committee is located. |
Because they meet in person. Can you imagine trying to live-post here while talking?
That is exactly what is happening. They did not say this is final, only that this is the API they like and why, and they even specifically cc'd you for your review of their opinion.
The reasons are all listed in Rationale in https://github.com/dotnet/corefx/issues/20903#issuecomment-310225542. That's plenty to go on, as you showed in your response. |
I don't want to throw oil into a fire, but since you heavily tweeted about this I came here and I don't fully follow your anger.
I don't agree with this. As an outsider I can fully follow the argumentation that was provided alongside the proposed API. Whether you agree with those arguments is up for discussion, but "no one knows what" is certainly not true. And what exactly do you mean by "why has the debate about this API been held in the secret" ? That is just non-sense. Did you document all your customer conversations from the last 14 years somewhere int he public so we all can follow your reasons for your API decisions? If not, why do you keep those as a secret? OSS doesn't mean that people need to keep protocol of everything they discuss in person. What is relevant is that reasons are provided for any decision and that has been done here, what else did you expect? You can still further discuss those reasons here... |
Also I'm really confused at the tone. It stands out of place as a curiosity in contrast to everything else going on here. If history is a guide, antagonism isn't going to accomplish what you want. |
This issue is 1.5 years old, if you don't know. A lot has been said about this issue in that period of time. What surprises me is that they didn't debate this particular api once during that 1.5 years, nor during the month or so since I picked it up. Perhaps it's a perception wrt what to expect regarding transparency, but I really don't see the added value of not debating things here in the open if everything else is open, especially since this issue is already so old and a lot already has been said.
I've already spend a lot of time on this and that's now wasted. I really don't see how I should be cheerful about that. The thing is that what's in the start post is what I proposed and what's already implemented, and I did all that with the best of my intentions and knowledge of the field in question. I now have no idea why any of that is wrong or irrelevant. You apparently do, that's great but I really don't. I simply like to know why exactly even a simple overload is already too much work or an overload which accepts a type. Mind you, the purpose of each method I proposed is derived directly from my own experience with ORM users and their benefits sought. (edit) It's not that I want my API to be accepted and nothing else, on the contrary. What I like to see is a reasonable API based on what I brought forward as knowledge regarding this particular API. That methods I proposed need to change their name, or that there needs to be an overload which accepts a string as a typename, fine by me, really. But now it's that I spent time on this API which was apparently for naught: Microsoft can't possibly think I'd simply say "sure, nice API, let's port my code over and toss all the results of my own thought process and the debates that have already been held overboard!". |
Like I said, what's not open? The debate is ongoing in this thread.
Because it's effective. You want to catch flies, use honey not vinegar. I don't believe for a moment that you truly want to waste time and be ineffective on this.
No, I'm with you on that. I like your proposal better. But the tone is turning me away. Fyi. |
No, it's really not. The API that's acceptable is what they proposed. If they had done the debate here in the open we could have participated in that debate. Now we can only bicker about the result and hope they will change their mind, but we can't be effective on that as we don't know why exactly they chose this over the others.
I don't follow. Everything is already implemented and tested and works. This new API requires a new implementation, new tests and a truckload of documentation to tell everyone how to use the API as there are no convenient overloads. That there has to be a debate about whether there should be overloads at all is odd: like they didn't understand that overloads might be handy here.
Well, to give you an idea: I opened the related issue #45711.5 years ago. A tremendous amount of time was wasted in the issue, due to circumstances within MS. As no-one did anything I picked it up but it was a rough ride: tooling issues which were hard to solve and not a lot of documentation regarding tools to use made this a frustrating effort. I spent nearly a full workweek on this, and I'm sorry to say that as I am not really a slow programmer, but it now looks like it. A full workweek which led (due to release mess it was pretty much chaos in this issue and it led to the API review being proposed after the code) to a PR which isn't mergeable and which has an API which is a 180 from what's proposed as the API to implement. So this leaves me with a tough decision: I either implement this API which MS finds acceptable (and I will hate myself for that forever) which will also take time and effort, and I already spent too much on this, or I don't implement the API and the time I spent is wasted as well. So yeah, that frustrates me a bit. This whole experience was dreadful for me and the outcome has been in line with that. If my tone is then not that friendly, I apologize for that, but it is what it is. |
Let's just give the team a chance to respond to your feedback. |
We generally do not consider overloads as adding to the concept count. That's largely because the concepts are expressed by names and the overloads are just different mechanics to do the same thing. Logically, overloads are often considered being chained, i.e. they provide conveniences and shortcuts. Personally, I don't like having a single method which supports a combination of things, based on which arguments are
I totally see the reason why you'd want to support a
I find that argument not very convincing. Granted, I agree with your sentiment that this API is more cumbersome to use than we probably want but (1) I think we can fix that and (2) if can't or won't I'd solve this problem by delegation, not by igoring the API altogether. What I would do is follow what ASP.NET Core has done with their In other words, I'd provide a method that I'd instruct customers to call from their ImmosFantasticOrm.Register(); Now, I'm not saying that's what you should do; but if I had to solve usability issues, I'd rather do this than not using a provided extension point many things plug into. |
It's actually rather simple :) Let's say ORM L relies on DbProviderFactory instances internally (most do, except EF and NH, the latter uses reflection). For a user to use L, she needs to register the factory for the DB she wants to use. This is overhead which for some users already might be confusing: shouldn't the ORM take care of that? (it should, but it can't as it doesn't know the ADO.NET provider, but that's a complex story for the user as well) They need to register factories to make L work at all. In no way I want to imply users of ORMs aren't able to understand this, it's simply not in their field of expertise, so that they need to register a factory is new info for a lot of them. If anything goes wrong with that registration, L won't work, will throw errors or worse: it might not throw errors but .NET might throw an error that some factory isn't known... So for the developers of L (and likewise tools) it's key to make sure the user of L doesn't get these errors. For netfx we can rely on installers for ado.net providers in most cases or e.g. that SqlClient is already registered. For .net core this has to be done by the user of L. So for me, to avoid any issues for my users, I'd provide methods which makes it much harder to fail at registering a factory than with the api proposed. And as my fellow ORM devs are much smarter than me, I am pretty sure they will too ;)
That's an idea, though we can't as one crucial element is missing: the type of the ado.net provider's factory. So we can't simply let the user call a method without arguments, sadly: which ADO.NET provider dll is to be used when that method is registered? So alternatively, with the idea you brought forward, you could perhaps make ADO.NET provider writers call the api instead. So instead do: System.Data.SqlClient.SqlClientFactory.Register(); Writing the code for this issue I pondered a bit about what could have been done differently (as the API in netfx how it is setup isn't stellar) and one thing I realized was that this whole registration business should be with the ADO.NET provider itself. (with one caveat, it should be possible to override a type for a given invariantName, so you can wrap a factory with a profiler or logger factory) ps: This kind of debate is what I meant with: keep it in the open. Thanks @terrajobst :) |
@FransBouma @terrajobst trying to catch up with the long thread. I am not going to try to answer on every point, but hopefully I can address some important ones. From @terrajobst
I agree the effects of multiple overloads is different from multiple methods. However I don't agree that having multiple overloads doesn't add any complexity. Don't call it concept count, but it adds to the decision points. I agree with adding options that prove useful. From @terrajobst
I agree. Our intention was actually to get the basic building block right, then iteratively look at how to improve the usability for the actual 80% scenario that emerges. Perhaps the problems is that this iterative improvements process doesn't work well with the nature of BCL and its schedule, and we should instead have been more proactive about having a more complete solution from the start. From @FransBouma
Have you heard about YAGNI? 😄 Try not to take things personally, but I would challenge any framework developer, even myself, who proposes to add every possible overload of an API, no matter how much experience they have, to build the API from the ground up and argue for real scenarios. I suspect we don't agree on what shapes of this API would actually be useful in real scenarios, but that would be fine, as long as we can rely on actual consumers of the API to provide feedback. From @FransBouma
I can completely agree with that. In fact you have convinced me that with just the one basic method isn't enough to achieve that. BTW, all this chat about extension methods is making me think that having something like this would be nice: SqlClientFactory.Instance.RegisterFactory(...); This would look like one of @FransBouma's previous proposals in the PR with generics, but with a I am not sure about this but since Thoughts? Also, @stephentoub has brought up the fact that maintaining an internal |
I do agree with the fact that overloads increase complexity. But in many cases the complexity is a function of usage patterns. For instance, if all I need is this: DbProviderFactories.RegisterFactory(typeof(ImmosFancyDbProvider)); But by making me write this: DbProviderFactories.RegisterFactory(
nameof(ImmosFancyDbProvider),
typeof(ImmosFancyDbProvider).AssemblyQualifiedName,
nameof(ImmosFancyDbProvider),
"Immo's Fancy Provider"
); You also increase complexity by what is known as cognitive dissonance, i.e. what I have to write in code doesn't match with the operation I have in my head. I'd argue that the complexity in choosing the overloads is the same set of choices as deciding what values to put in my code, however, IntelliSense helps me by showing possible patterns and thus provides some guidance. Personally, I think the net complexity is slightly lower with the overloads than without. Now, I do buy the argument that maybe this isn't the API where usability is important, i.e. that it is a low-level building block and that real usability improvements are elsewhere, e.g. in providing an extension method. In that case, I think it's fair to say "look, I'd rather invest the cost of adding APIs there than by making the low-level stuff more usable".
I'd be fine with using |
@terrajobst I don't disagree with any of these things. I value finding the sweet spot between a spare and a intuitive API. I believe what we need to explore more is what usage scenarios are common and important.
Can you please elaborate a bit more on the tax for future reference? |
Create a console app. Make sure to import In other words, the tax is that extension methods will often show up, even if not always desired. Sometimes a static method taking an instance is better suited precisely because it is less in your face. |
😄 I always run a script to get every possible argument mutation ;). But seriously, not every possible overload is required of course, but there are multiple scenarios here, 2 in particular: one where only invariantname / type is specified and one where everything is specified (your version). The latter likely will never be used or very rarely. The former will the one used by most, if not all people, if just 2 methods were provided. As you can further slim this down as the invariantName can be inferred (as it's the namespace, see the internal logic of NetFX's DbProviderFactories), a convenience variant of the first method is desired. (I had to chuckle about the close-to-bikeshedding regarding whether method overloads were useful btw. Didn't a former colleague of yours write a book about this? 😉 )
Good :)
When I ported the code over I realized that the registration / key/value stuff isn't well designed in netfx wrt DbProviderFactory instances: the key, the invariant name, isn't provided by the type it represents, but in practice, the only source which/who can define the invariant name is the developer who creates the ADO.NET provider, so the invariant name should be produced by something in the ado.net provider. In Netfx they can get away with it as the types are registered in either app/web.config or in machine.config, and the latter is the general way to do this (using an installer, likely provided by the ADO.NET provider developer) but in CoreFX this isn't the case. In CoreFX the ADO.NET provider is referenced through NuGet in a lot of cases anyway (as there's no Gac), so the types are known at the app level. In cases where this isn't, a string based approach could be used. DbProviderFactories are not for the code at app level but for the code in the ORM used, to avoid coupling there, hence it's still useful to have this machinery. So I realized the registration of the factory in DbProviderFactories should be logic inside the ADO.NET provider. The only problem is: how to enforce that. In .NET land we have companies like Sybase and Oracle who are Feet Dragging World Champions and are not likely to add something that 'should be there but isn't enforced through e.g. an abstract method which has to be implemented'. It's now by convention that there's a static 'Instance' method on the DbProviderFactory type, but it's not enforced, only by the DbProviderFactories internal implementation. Adding similar machinery (so based on 'good faith') for registration is probably going to be problematic as the feet draggers highly likely overlook this. It would be nice, absolutely, but unless there's an enforcement implemented, it's fragile.
I first had a design around ConcurrentDictionary, as indeed that's all it takes. I decided against it after I realized the following.
The API (and code) I came up with was the result of a lot of thinking about use cases and compromises that have to be taken wrt Netfx compatibility and also what is to be expected what 3rd party ADO.NET provider developers will do: it's not the API one would come up with if everything is started over, but I think it's the API that works best, knowing usage patterns of the netfx' variant of DbProviderFactories and what people do wrong there. |
@terrajobst Ah, I thought you could be talking about something more subtle. In this case I think an extension method on |
@divega The reason I want a DbConnection oriented registration variant is that the type is likely known by the user. I mean: they likely know the existence of 'SqlConnection', but I doubt many know the existence of SqlClientFactory and have to look it up. (Additionally, there's also a GetFactory(DbConnection) variant in the Netfx API of DbProviderFactories) |
@divega any update? It would be nice to push the API forward. Or do you want to wait for @saurabh500 to get back from his leave? |
@karelz this is still sitting on my plate, I have just been busy with other things. Planning to come back to it next week. |
@divega any progress? |
@divega ping? |
@karelz, @FransBouma, and everyone else interested: I am sorry it is taking me much longer than I had hoped to come back to this issue. And I am sorry it is blocked on me. 2.0 has kept me quite busy until the last minute, and today I am starting a family vacation. I will be back before the end of the month, and then I will make this one of my top priorities. Thanks @karelz for your pacient insistence 😄 |
I have been discussing (initially only with @ajcvickers) the details of a new proposal to capture what we have learned so far, as well as the specific concerns we had with some of the API that have been previously proposed. I found that for all the time we all spent on this issue, we are still far from reaching consensus, and that there are still many lingering concerns that need to be addressed. So I just wanted to update the thread with information on where we are. Consider this an early draft, and please respond with feedback. I hope this can help us converge into something that can be soon added for 2.1. ScenariosWe are currently contemplating two main scenarios:
Call to action: Help us improve these descriptions and identify additional important scenarios that could be missing in this list. ScopeThis proposal centers on adding new members to the static These APIs additions should be adequate to be ported back to .NET Framework in the future, at which point they can also be incorporated into a future version of .NET Standard. Existing APIThese are the current members on (I have added some information on API usage along the signatures, which I obtained from https://apisof.net) public static DbProviderFactory GetFactory(string providerInvariantName) {...} // api port 1 %, nuget.org 1.3 %
public static DbProviderFactory GetFactory(DataRow providerRow) {...} // api port 0.3 %, nuget.org 0.3 %
public static DbProviderFactory GetFactory(DbConnection connection) {...} // api port 0.2 %, nuget.org 0.2 %
public static DataTable GetFactoryClasses() {...} // api port 0.5 %, nuget.org 0.4 % Concerns with previous proposalsUsability of overload that that takes provider factory assembly-qualified type nameWe acknowledge feedback that although this overload can be the fundamental building block of the API, it can be more difficult to use and error prone that other alternatives. Name and description argumentsWe have not reached consensus on whether there is any real value in being able to associate a For the purpose of this draft, we are going to leave this as an independent decision point, i.e. we won't include signatures that include these two arguments in the description of the methods below, but if we agree there is enough value, we can add appropriate overloads. Instance vs. type based registrationProvider factories have been part of ADO.NET since the times of .NET Framework 2.0. There has since then been a prescribed pattern to make them effectively singletons: a private parameterless constructor and a public static read-only field called Connection instance vs. factory as the starting pointThe original proposal includes method overloads that register a factory based on a connection instance. We have discussed this and re-read all the feedback in this thread about why this could be important. It seems that the main point is that many customers that may need to register providers (for either scenario 1 or 2) will not be familiar with the factory types but could be familiar with the connection types. This seems to be a legitimate concern. However it is a bit strange when you think about how the user would need to create a connection instance before they can register the factory that the application later is supposed to get the connection from. In the case of scenario 2 (provider replacement), the type of the connection instance used initially will not even match the one that would be obtained later. We are still wondering if it is reasonable for consumers of this API to be informed about provider factory types at the same time that they are informed about the need to call these APIs. For the purpose of this draft, we won't include these overloads, but we would like to understand this a bit more before making a call. Most of the concerns about this were raised by @ajcvickers, so I will let him expand more on these if necessary. Inferring the invariant name from the type's namespaceThe original proposal includes registration overloads that assume the invariant name to be used is the namespace of the provider's factory class. This seems to in fact match the actual default provider invariant names chosen by provider writers, with the notorious exception of the SQL Server CE providers, which append version numbers to their invariant names. We haven't been able to reach consensus on whether these should be included or whether they could become a pit of failure. One of the possible issues is that the overload would register the wrong invariant name 100% of the time for the provider replacement scenario. Another possible issue is that this convention is not enforced by any means and can be incorrect for existing providers or future providers. For the purpose of this drafts, we are not including overloads that infer the invariant name, but we can add them if we reach consensus that they are more value than the possible pitfalls. Again, most of the concerns about this were raised by @ajcvickers, so I will let him expand more on these if necessary. High confidence additionsHere a list of APIs we are convinced will enable the scenarios described above and that we believe present a very low risk of becoming pits of failure. Register extension method overload that takes the factory instancepublic static void Register(this DbProviderFactory factory, string invariantName) Example usage: // as extension method
SqlClientFactory.Instance.Register("System.Data.SqlClient");
// as static method
DbProviderFactories.Register(SqlClientFactory.Instance, "System.Data.SqlClient"); Pros:
Cons:
Neutral (i.e. can be seen as both pros and cons):
Register overload that takes assembly-qualified namepublic static void Register(string factoryAssemblyQualifiedName, string invariantName) Example usage: DbProviderFactories.Register(
@"System.Data.SqlClient.SqlClientFactory,
System.Data,
Version=2.0.0.0,
Culture=neutral,
PublicKeyToken=b77a5c561934e089",
"System.Data.SqlClient"); Pros:
Cons:
Optional additionsThese are APIs that could be added and for which we don't see high risks, but for which the value added isn't clear. Register overload that takes a factory typepublic static void Register(Type factoryType, string invariantName) Example usage: DbProviderFactories.Register(typeof(SqlClientFactory), "System.Data.SqlClient"); Cons:
TryGetFactory method that doesn't throw if the provider is not registeredWe believe this would a nice addition, so that higher level libraries can easily probe if a provider is registered without relying on public static bool TryGetFactory<TFactory>(string invariantName, out TFactory factory) where TFactory : DbProviderFactory Example usage: // with var out
if (DbProviderFactories.TryGetFactory<SqlClientFactory>("System.Data.SqlClient", var out factory))
{
return factory.CreateConnection();
}
else
{
return new SqlConnection();
}
// Un-registration methodsThese are mostly for completeness. They don't belong in any fully articulated scenario we have identified but given that an application will be able to programmatically register provider factories it seems a bit strange that they cannot be unregistered. public static bool Unregister(string invariantName) {...}
public static bool Unregister(this DbProviderFactory factory) {...}
public static bool Unregister(Type factoryType) {...}
public static void Clear() {...} |
Great! Reassigned to you @FransBouma. If you don't have time, or change your mind, just let us know ;). @FransBouma based on your reply I assume you are ok with the final shape of the API and you don't have any must-have feedback we should discuss prior to implementation. Is that correct assumption? |
FYI: The API review discussion was recorded - see https://youtu.be/HnKmLJcEM74?t=4047 (48 min duration) |
correct :) Question: I have a PR with the original code / API I wrote for this, that's flagged as not merge etc. Is it better to commit to that PR or delete that PR and create a new one? |
@terrajobst @divega /// <summary>
/// Returns the invariant names for all the factories registered
/// </summary>
public static IEnumerable<string> GetProviderInvariantNames(); the problem is, the data is stored in a shared resource, so locking (readwritelockslim) is used for its access. This gives a problem with this method, as the enumerator traversing the stream would require a full lock (it can't deal with a write mid-enumeration). Internally projecting the names to a list and returning that is also a bit lame, as it's actually 1) running into the same problem and 2) it mitigates the whole purpose of returning IEnumerable. I'll revert the internal implementation to ConcurrentDictionary and create a DataTable construction function for GetFactories() (which currently creates a copy, like in NetFX), as ConcurrentDictionary offers enumeration with writes happening. We do lose some aspects which might not have been considered because of this though:
Keeping both internally is really not great. So... what to do? (edit) If I cache the created datatable which is copied for GetFactories(), and re-create it when a new factory is registered, point 1 is solved. Point 2 is of little relevance, so I think we're all good. |
@FransBouma it's up to you what you think is easier for you and PR feedback. New PR might start fresh with new comments and new rebase, so it may be easier overall. But if current PR is very close to final code, I can imagine it will be easier for you to reuse it ... |
@karelz I think it's better to use a new PR regardless as the history of the old PR isn't really relevant for this new code. I'll refer to it in the new PR. I lean towards going for the ConcurrentDictionary approach (see previous message) so the previous PR's code isn't close to the final code. I'll likely make a mistake with rebasing / history squashing as I haven't done that before (once I think) as I mainly use subversion, but we'll address that when the PR is posted ;) edit: old PR closed, will refer to it in new PR. I expect all code to pass the tests later this week. |
@FransBouma I also think it is ok to switch the implementation to use a
cc @terrajobst @ajcvickers in case they have other insights on this. |
@divega In the current implementation I don't fill the columns name and description indeed, the returned datatable from Regarding the overwriting of the internal datatable... I've always found that a dirty hack to be honest :), but it was the only way to intercept the factories. Now that this API allows to register a different factory under an invariant name which is already used with a registered factory, it's easier to wrap the factories now: first call GetProviderInvariantNames, loop over that and for each name obtain the factory and wrap it with your wrapping factory and re-register it under the invariant name (so replacing the real one). The profiler using the hack could stop working if the inner workings of NetFX's implementation change from datatable to concurrentdictionary of course, but as it's a hack relying on an internal implementation which is nowhere documented, it's not weird that it perhaps breaks at some point. As someone who potentially is affected by such breakage, I don't mind this breaking change. That said, the current code I have now is still datatable based, it depends on the It's not code that'll appear on any hot path however, most of this code is called once and at startup, and I doubt multiple threads will be either registering and reading info about factories at the same time so if some thread is blocked for a short moment I don't think it's that big a deal to upscale the lock to a Write lock, do the DataTable enumeration and copy the names into a structure and return that in some form, but just to be sure I'd like your final opinion what to do :)
That sounds like a reasonable implementation, the AsReadOnly() call. Personally what I always find about IEnumerable is that if it's an enumerator over a set anyway, it's a bit of a waste: e.g. if you want to do a However, if the implementation is meant to be potentially streaming the names, From the API description it's unclear whether GetProviderInvariantNames is a streaming method or a method which returns a projected set, hence my question :) PS: I don't want to re-do the API review all over again, I just ran into this when implementing it and thought I should mention it :) (edited: clarity) |
@FransBouma ok, let me summarize what I believe you are saying, just to confirm:
I don't have a strong opinion on (3). I would like @terrajobst to weigh in for that. I know for your example of |
@terrajobst @divega The method
What factory should be returned? FReal or FProfiler? On NetFX it will return FReal, as that's the type in the DataRow, and the code assumes that it can't differ from the internal store as there's no way to change the internal store (unless you use reflection). However on CoreFX we do have re-registration of an invariant name, so on CoreFX, it's not clear what to do: return the factory type stored in the row, or return the type currently registered under the invariant name in the DataRow, ignoring the type in the DataRow? If I look at the documentation of the current NetFX API (GetFactory(DataRow)), it says:
Thing is: if I call GetFactoryClasses() right before GetFactory(DataRow) I get a datatable with FProfiler as registered factory type. I know, nitpicking perhaps, but better safe than sorry. :) |
@divega @terrajobst @karelz If the replies to my (small) questions above are in line with what I have implemented, I'll post a PR from the code, otherwise will change the code to make it match what's decided and then push the PR (after squashing some commits I recon). I saw in this list of changes it does add a test for SqlClient.SqlConnection.ProviderFactory property (done back in May) as it's required for this work to function (GetFactory(DbConnection)) (edit) to elaborate what's implemented wrt the small points above:
|
@FransBouma, trying to answer your questions:
That is interesting indeed. It sounds like the current implementation on .NET Framework relies on the fact that the configuration cannot change at runtime to implement some shortcuts. At first glance, here is how I think about it: When we add the
When we discussed the overload that takes the assembly-qualified name in API review, we said the only reason to have it in the API was that it allows delaying the actual loading of the assemblies and instantiation of the providers, which could be useful if the application configures many providers, but ends up using only a few of them (which was the common case with .NET Framework). |
So in short:
? Sounds good.
I completely overlooked that part. Hmm, then there's another problem tho: how to check whether the specified typename is valid? I can't simply accept is as valid without loading the type, but loading the type does bring in the assembly into the appdomain I think, mitigating the purpose of the method... Storing the type name as string as-is will cause a problem when the factory is obtained later on and the type doesn't resolve to a factory at all (e.g. a typo was made, wrong type was specified), which is odd as the registration was successful; other RegisterFactory methods fail when the wrong info is specified (e.g. the wrong Type is specified). That's inconsistent (and unexpected, as the error will pop up with GetFactory, handling code for a failure that a type is registered wrongly won't be at that spot, but will be at the spot where the RegisterFactory method is called) (edit) It's also not enough to reflect the type, a valid factory type is a type which implements DbProviderFactory, has a static Instance property and that property returns an instance of the type. All those checks are done with the other RegisterFactory methods, but the string based variant then has to postpone all those checks to a later point, which makes this implementation rather odd. I get the theoretical use case of the method. I think however it's a bit of a theoretical exercise rather than a practical situation anyone will run into, as, when it's the case they run into memory problems due to the registration of too many provider factories, they can also do the filtering of what to use themselves. Running into memory problems however due to too many assemblies being registered (I don't know what other problems there might be that this is needed) when you're talking to databases is... curious, to say the least ;) |
What happens today in .NET Framework when you call
Certainly this overload and the one that takes the type are IMO much less important than the one that takes the instance. We could re-discuss (without resetting the whole thing 😄). The argument was always non-functional: avoid the waste of cycles and memory in that scenario.
Thanks! Put this way it is actually clearer to me that ignoring either the type on the row or the type currently registered could be equally unexpected. This overload is messed up. Perhaps throw if the types diverge? Or keep the current behavior but deprecate the overload (but only if we can deprecate it on .NET Framework too)? cc @terrajobst |
It actually happens twice. First the validation occurs when the config is parsed and the datatable is created internally, see: That said, the RegisterFactory(invariant name, typename) overload has as comment: /// <summary>
/// Registers a provider factory using the assembly qualified name of the factory and an
/// invariant name
/// </summary>
public static void RegisterFactory(string providerInvariantName, string factoryTypeAssemblyQualifiedName); which tells me it will register the factory using the specified name. It doesn't register the type name, it registers the provider factory, and to do that, it uses the name specified (like it would also use the type specified in the other overload). At least that's how I interpret that documentation/spec. Looking at the method, I'd expect it to throw an exception if I call it with "Foo" as typename specified, and not succeed (as tests are happening later on in the case where the type name is stored and not examined, to prevent cycles/memory (but how much is really saved here?)).
Heh 😄 yeah I scratched my head a few times too why that method was there. It's the central method for all other GetFactory methods to obtain the real factory (so it needs to instantiate the type specified in the row it receives, see above), but apparently the initial developer overlooked the case where you can pass in a datarow which contains a completely different type that's not registered at all. No idea if there are design docs for this particular method tho. How I see it, I think it shouldn't be a public facing method, or at least test whether the type in the row is in the internal table as well, if the containing table isn't the same as the internal table, but I have no idea if there is code actively using this (I guess there is, there always is ;)). So perhaps a solution here is to document it properly what will happen with this method: either it will return the instance of the type specified in the row (the netfx implementation) or it will return the instance registered with the invariant name in the row (which for netfx currently likely is the same), whatever the decision is, so the behavior isn't totally unexpected: the documentation states what will be done. The current documentation of this method isn't helpful in this (https://msdn.microsoft.com/en-us/library/s90hx22e(v=vs.110).aspx). Such a little API and so much complexity... :) |
That code you pointed to special cases the providers defined by .NET Framework itself (e.g. in System.Data.dll, assembly that you would have been already loaded) and in particular, it instantiates Microsoft's Oracle provider. Unless I am missing something, I believe if you call FWIW, there was once a perf bug reported internally about the fact that the Oracle provider is loaded into memory unnecessarily here, but IIRC, the behavior had to be preserved for backwards compatibility.
Yeah, I happen to be the person that wrote that comment 😄, and I am sure that is not what I meant. From my point of view "registering a provider" is establishing a mapping between the invariant name and the provider, and establishing that mapping does not imply that the actual type has to be loaded or instantiated, only identified in the mapping. Any feedback on how to convey that better without making it overly complicated?
Yes, that is what I thought, but I am still on the fence. We need to be ok with porting that behavior to .NET Framework as well. Perhaps for this API it is ok to just keep returning whatever the dataRow says, as long as the behavior is noted in documentation as you suggested. |
You're not missing something, you're right. It registers type names and defers the checks to the GetFactory() method.
Ok :) I have struggled with coming up with a use case for the method, but last night I thought of one and it's close to home as well: an ORM can pre-register all supported ADO.NET providers with DbProviderFactories using this method, so the user doesn't have to :) (they then have to make sure they dll(s) they want to use are in the fusion probing path and everything works out fine). So the RegisterFactory(string, string) overload defers registering the actual type till it's requested the first time. Now that I have found a legitimate use case it's more clear to me when the method will be used and what the expectations are. IMHO they're not the same as for the other RegisterFactory() methods. So, to me, there are 3 options:
There's no question anymore whether the string variant needs deferred registration, I'll add that to the code, the API spec how it is now is IMHO however vague in its communication to the user, which I think needs more clarity. // cc @terrajobst for an opinion on this? (edit). Calling RegisterFactory(string, string) after a RegisterFactory(string, type) has been performed will replace the already registered factory instance again with a deferred one. This is obvious perhaps, but just in case it's not ;) (edit). TryGetFactory(name), should that return false in case of a deferred registration which causes an error? Or should it throw the exception (e.g. type isn't found, type is badly formed etc.), as the registration is there, but it's not correct ? |
@divega Current implementation defers the registration of RegisterFactory(string, string) till GetFactory() is called (any overload). Will also throw when TryGetFactory() is called with an invariant name which matches a registered, but deferred, factory, and the assemblyqualifiedname is wrong/lot loadable. This is debatable tho, see previous post, and easily corrected. RegisterFactory(string, string) has a comment regarding the deferred registration but not an xml comment, so it's likely going to be missed by documentation writers if they don't peek into the code. I think I've covered all the bases for a PR now in the code (IMHO it's 100% complete), but the points still open needs addressing before PR submit. |
@FransBouma glad to hear you arrived to that scenario for the assembly-qualified name overload. I think it is basically the same thing we had in mind, but thinking of O/RMs doing this is a very concrete example in which it would make sense. Feel free to propose improvements to the XML doc comments of the methods. The comments were not really the focus of the API discussion and should not be considered part of the spec.
Last-one has to wins always, otherwise I can't think of a way to explain the API. I think at this point it would be good to just finalize any small details on comments on the PR. Thanks a lot for your patience! |
PR is up :) dotnet/corefx#25410. I've tried to word the decisions made here as accurate as possible, please correct me if I made a mistake there. I also tried to rebase stuff but as I'm a git newbie I didn't get very far so I hope the PR is alright as it is. All tests pass. Almost there! |
* Implements #19826: DbConnection doesn't offer an internal property ProviderFactory * Added test for #19826. Due to project misconfiguration it fails locally. * Removed writeline * Added comment to illustrate the purpose of the property which currently is dead code. * Implementation of DbProviderFactories, ported from NetFx and adjusted to CoreFx See #4571, #19825 and #19826 * Update for project files to get a netcoreapp test set * Changes to project system to get the stuff compiled. Failed. * Various changes: - Updated config/csproj files for netcoreapp for tests. - Properly moved DbProviderFactories file to right location - Added DbProviderFactories type to ref project, split across 2 files (one for netcoreapp specific methods) - Changed SetFactory to ConfigureFactory - Changed generic type argument to normal method parameter - Removed default parameter values and added overloads * Added tests for DbProviderFactories. * Better subclass testing added and more tests added * Moved all hard-coded strings to SR/Strings.resx. Added test for factory retrieval using a connection * Removal of now redundant comment. See: #19885 (review) * Comment correction for bad English * Refactored API a bit, see: dotnet/standard#356 (comment) * Added tests for reworked API. Added tests for wrong type passing * Reverting change in sln file: See #19885 (review) * Almost done implementation using DataTable internal storage. Refactoring now to concurrentdictionary * Final work for #20903 * Corrections of the implementation-> - registrations of assembly type name are now deferred - storage is now a struct, as the typename has to be registrated as well. - corrected a wrong nameof() call. - added tests for the change behavior in RegisterFactory(string, string) * Final implementation * Corrections made to code by request of @saurabh500 * Github for windows didn't commit this file in last commit... :/ * Again correcting component elements. These are automatically inserted so if they're back again, I can't remove them * ... annnnd another component element. Last one I hope * @divega requested changes - Corrected sln file netcoreapp->netstandard - Corrected wording in exception messages. - Component elements are back, I'll leave them now, they get back regardless. - Renamed column name constants to have the 'ColumnName' suffix for more clarity - Made Instance constant be initialized with nameof(Instance) - Added using for System.Reflection so less verbose code * More @divega requested changes - factored out a clone into its own method (GetProviderTypeFromTypeName) - Reverted 'nameof(Instance)' to string, and renamed field - Removed redundant null/length check in ctor of ProviderRegistration ctor * Last nit removal for @divega
Closing the issue with PR dotnet/corefx#25410 |
Adding netfx-port-consider label for the new APIs. Note for implementer: For the .NET Core implementation of the new APIs we decided to ignore aspects of a provider registration like the name and the description. In .NET Framework the new API will probably don't have these either, but the underlying implementation will have to set them to something (null?) in the DataTable used for storage. All this API is static, so it needs to be thread-safe. The .NET Core implementation simply uses a ConcurrentDictionary, but for the DataTable-based implementation on .NET Framework you will probably need locks on access. |
* Add missing documentation for DbProviderFactories APIs Introduced in https://github.com/dotnet/corefx/issues/20903 * minor fixes
Latest Proposal
Copied from latest API approval: https://github.com/dotnet/corefx/issues/20903#issuecomment-342605350
Original proposal
Issue #15732 is about porting the existing surface of
DbProviderFactories
into .NET Core. This new issue is specifically about new API that needs to be added in .NET Core that does not (yet) exist in .NET Framework.DbProviderFactories
on .NET Framework can only be initialized from .config files, and in order to make the API usable without .config files the new API is needed.The proposal by @FransBouma can be found in dotnet/standard#356 (comment) and is repeated below:
I've refactored the code a bit, the new API now looks like:
Two new overloads are added. They'll use the fallback code for the providerInvariantName, as it is also present in netfx' auto-init code: it will use the namespace of the type. I've added this to avoid people making a typo in the name as for most factories I know (I don't really know of a dbproviderfactory where this isn't the case) the invariant name is equal to the namespace.
Method usage / purpose
Description: This method will register the factory instance contained in the specified type, under invariant name equal to the namespace of the specified type. It will leave name and description empty.
Purpose: This method is meant to be the easiest way to register a factory. Most (if not all) ADO.NET providers use as invariant name the namespace of the factory type.
Description: This method will register the factory instance contained in the specified type, under invariant name specified in providerInvariantName. It will leave name and description empty.
Purpose:: This method can be used to register a factory for the ADO.NET providers which don't use the namespace as the invariant name, and it can also be used to register a different factory type under a well-known invariant name, e.g. in the case of a wrapping factory for ADO.NET profiling.
Description: This method will register the factory instance contained in the specified type, under invariant name specified in providerInvariantName and will fill in the name and description values for the factory.
Purpose:: This method is equal to
ConfigureFactory(type, string)
and can be used to fill in the additional two columns in the factory table if a user requires that.Description: This method will register the factory instance obtained from the specified connection, under invariant name equal to the namespace of the factory instance's type. It will leave name and description empty.
Purpose: This method is meant to be the easiest way to register a factory if the user doesn't know the factory type but does know the connection type. As DbProviderFactory registration was mainly hidden for most users it can very well be they're not familiar with the factory types, so this method and its overloads make it easier for them to register a factory. Most (if not all) ADO.NET providers use as invariant name the namespace of the factory type.
Description: This method will register the factory instance obtained from the specified connection, under invariant name specified. It will leave name and description empty.
Purpose:: This method can be used to register a factory for the ADO.NET providers which don't use the namespace as the invariant name, and it can also be used to register a different factory type under a well-known invariant name, e.g. in the case of a wrapping factory for ADO.NET profiling.
Description: This method will register the factory instance obtained from the specified connection, under invariant name specified in providerInvariantName and will fill in the name and description values for the factory.
Purpose:: This method is equal to
ConfigureFactory(DbConnection, string)
and can be used to fill in the additional two columns in the factory table if a user requires that.The text was updated successfully, but these errors were encountered: