Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

API for exposing the Schema Column from DbDataReader #5609

Closed
wants to merge 1 commit into from

Conversation

saurabh500
Copy link
Contributor

Adding a DbSchemaFactory and a method to retrieve the schema from the DbDataReader

A new class was added because of the following:

  1. A new method added to the existing DbDataReader would not allow type
    forwarding which is needed so that the libraries written on .Net Core can
    interop with the .Net Framework
  2. A new class can be added to the facade of the .Net core so that the
    implementation can be bridged with the .Net framework.

DbDataReader

A new class was exposed because of the following:
1. A new method added to the existing DbDataReader would not allow type
forwarding which is needed so that the libraries written on .Net Core can
interop with the .Net Framework

2. A new class can be added to the facade of the .Net core so that the
implementation can be bridged with the .Net framework.
@saurabh500
Copy link
Contributor Author

@NickCraver @YoungGah Please take a look.

This is the API that I am adding to get the schema from the DbDataReader.
Once this PR goes through, I will change the contracts and add the SqlClient implementation.

@saurabh500
Copy link
Contributor Author

cc @corivera

@NickCraver
Copy link
Member

Could this be added as an extension method instead? It seems like that'd be much more intuitive to use, e.g. .GetSchema() on DbDataReader

@saurabh500
Copy link
Contributor Author

@NickCraver I considered it and that could have been the best solution
However extension methods cannot be created as Virtual to allow the clients to override.

{
public abstract class DbSchemaFactory
{
abstract public List<DbColumn> GetColumnSchema(DbDataReader dataReader);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the parameter be named simply reader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should this be List<T>? Has this been API reviewed?

The framework design guidelines say not to use List<T> in public APIs... Usually the recommendation is Collection<T> or ReadOnlyCollection<T> (or a subclass of one of those). In this case, it seems like you want some kind of read-only collection at the very least, because it doesn't seem like a caller of GetColumnSchema should be modifying the collection.

(Though, I am torn about this guideline as List<T> is more efficient than Collection<T> and ReadOnlyCollection<T>. Perhaps the guidelines should be reconsidered with an eye towards potentially allowing the use of immutable collections in APIs? Edit: Indeed Roslyn and System.Reflection.Metadata components expose ImmutableArray<T> in public APIs, so there is some precedent.)

cc: @terrajobst @KrzysztofCwalina

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of type (I agree with ReadOnlyCollection<T> here): abstract public => public abstract for consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.
@justinvp and @NickCraver Thanks for the feedback.

@justinvp I have not done a formal API review yet. I do intend to send out a request when submitting a Contracts change. I did want to expose the idea/direction to @NickCraver using code to get any feedback from him and answer questions if any.

I will look forward for any comments from @terrajobst @KrzysztofCwalina
I will send a review based on the documentation at https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/api-review-process.md

@NickCraver
Copy link
Member

Semi-related, I think you'll want to open up the Dictionary<string, object> on DbColumn, possibly make it protected? From an efficiency standpoint downstream inheritors are going to want to iterate, etc. The indexer being the only point of access internally will be a problem there.

@saurabh500
Copy link
Contributor Author

@NickCraver
Can you open another issue with your comment, so that I can send a separate PR with this change.
Also, tag me in the issue.

"Semi-related, I think you'll want to open up the Dictionary<string, object> on DbColumn, possibly make it protected? From an efficiency standpoint downstream inheritors are going to want to iterate, etc. The indexer being the only point of access internally will be a problem there."

@saurabh500 saurabh500 self-assigned this Jan 22, 2016
@NickCraver
Copy link
Member

Absolutely, I'll open one shortly. I'm thinking about other approaches on the schema method because it's a bit odd. If the bits we needed to render the schema were exposed from DataReader itself in the first place, we wouldn't need the method. So the bits we really need are internal to the reader. It feels very weird for the implementation to live to disconnected from the reader...and I'm not sure it even can, unless it's using internal exposed bits which means no providers can really implement it.

What's the thinking here on how that method actually works?

@NickCraver
Copy link
Member

DbColumn issue created: #5611

@NickCraver
Copy link
Member

However extension methods cannot be created as Virtual to allow the clients to override.

Hmm, good point. I'm curious about point 1 though: wouldn't this scenario prevent ever adding anything to DbDataReader?

@saurabh500
Copy link
Contributor Author

@NickCraver
Based on a conversation I had with @ericstj point 1 exists because of the limitations of type forwarding. Type forwarding doesn't allow granular forwarding at the level of the class members. It would only work for the type.

I'm curious about point 1 though: wouldn't this scenario prevent ever adding anything to DbDataReader?
Short answer to your question is, not without some side effects.

A little longer explanation
If the changes needed in DbDataReader are present in .Net Framework, then we can add it to DbDataReader. This means that the .Net framework would need to be ahead.
This will have a longer turn around than desirable.

Alternatively if we add to DbDataReader before changing the .Net framework, then the libraries will not be completely portable.
This wouldn't be desirable.

@NickCraver
Copy link
Member

@saurabh500 Okay gotcha, so this will be lock step with a new generation for base class additions. Thanks for that.

Do you have an example of (as an example) the SQL Server provider implementation here? I'm trying to picture the actual end result. Are you envisioning the SQL DbDataReader would have internal-visible members and the factory would be accessing/using those to get at the TDS result schema and translate into the DbColumn collection (or a derivative)? If not, I'm curious as to how this is pictured to extract the needed data that only the reader has access to.

@terrajobst terrajobst added the port-to-core Issue tracking port of .NET Framework code to .NET Core label Jan 22, 2016
@saurabh500
Copy link
Contributor Author

@NickCraver I will put together an example or post a link to a dev branch to show code.
You are right, the idea is to have the internals of implemented DbDataReader exposed to accomplish this.

The SqlDataReader has metadata which was used to expose the schema earlier. This metadata will be used to populate the Collection of DbColumn as well.

@mgravell
Copy link
Member

Can I ask: in this proposal, how would a consumer obtain the factory? (or do they never need to?) What is the intended API for the consumer?

Context: primary object: dataReader; other context objects; command, connection. What does my code look like if I want to investigate the columns?

@NickCraver
Copy link
Member

Kind of ignoring the access issue Marc notes above (which I'm also very curious about), do we actually need another class for this? What if FromDataReader() instead lived on DbTable? It could return a list of columns but without the need for another type you're hunting for. We don't have type forwarding rules to contend with and it's also a bit closer to what you'd expect in a port. The possible win I see with a DbTable (or downstream inheritor as the base) return instead of a ReadOnlyCollection<DbColumn> would be the ability to add additional properties about the result set with the DbTable indexer itself.

It's totally possible that's a really bad idea, just throwing it out there for discussion since it adds the ability to replicate some other potential uses DataTable served before.

@saurabh500
Copy link
Contributor Author

@NickCraver The DbTable and related structures have not been finalized yet. There is still discussion in progress on that front. This work is focused on getting the SchemaTable for the DbDataReader

An example of how this would look:

class SqlSchemaFactory : DbSchemaFactory { 
    public ReadOnlyCollection<DbColumn> GetColumnSchema(DbDataReader reader){
           ... 
          return columnCollection;
  }
}


// Calls to API 
       DbDataReader reader = ... //
       DbSchemaFactory schemaFactory = new SqlDbSchemaFactory();
       ReadOnlyCollection<DbColumn> columnSchema = schemaFactory.GetColumnSchema(reader);

@NickCraver
Copy link
Member

@saurabh500 Okay yeah that's what I was afraid of - that API approach just won't work. It would require a) knowing the provider and b) taking a dependency on it. Taking a dependency on every (or any) provider is a non-starter for any library.

@saurabh500
Copy link
Contributor Author

@NickCraver How about extending DbDataReader to provide a SchemaFactory registration API which can only be initialized by the DbDataReader subclass and the factory is exposed by an extension method in the DbDataReader. The Getter for factory will not be virtual though.

Once the reader is created, the factory will be assigned by the Provider and then the factory can be retrieved using the DbDataReader.GetSchemaFactory()
This way the factory can be obtained from the DbDataReader. The libraries will not need to know about the providers and no dependency will be needed.

Need to think a little more on this extension.

@NickCraver
Copy link
Member

@saurabh500 If that method isn't virtual - aren't we back to issue with type forwarding that drove this direction originally?

@ericstj any thoughts here? I'm still not sure I fully understand the rules for the final .Net/.Net Core forwarding and what we're allowed to do in maintaining interop here.

@mgravell
Copy link
Member

eugh; yeah, that's what I was afraid of, and was why I was so keen to get a caller's API perspective. That is not viable. That is simply not a compatible API - the calling surface is completely different, and requires knowledge of completely different things.

That would screw libraries so badly. Frankly, at that point, you'd do better to just have:

public abstract class DbDataReader {
    protected virtual DbWhatever GetSchemaWhatever() { }
}

with the base-class implementation doing any of

  • throw new NotSupportedException()
  • building something convincing-looking using GetSchemaTable() (on full .NET only, obviously)
  • building something convincing-looking using the reader API otherwise

with derived readers then able to override this method in the intuitive place (the same place they have historically added GetSchemaTable()) if they want to do something more specific.

(it is Saturday and I'm tired, so if I'm getting the wrong end of the stick, feel free to educate me; but I just don't see the point of all that plumbing)

@FransBouma
Copy link

Lots and lots of friction and ceremony all to prevent the implementation of something useful like a DataTable. If MS had ported that code this all would have been done by now, not only that, but ADO.NET providers would have an easier time too (as they can just port code over for getschematable as-is, as that already works with datatable).

sigh

@mgravell
Copy link
Member

He makes a valid point...

On Sat, 23 Jan 2016 17:50 Frans Bouma [email protected] wrote:

Lots and lots of friction and ceremony all to prevent the implementation
of something useful like a DataTable. If MS had ported that code this all
would have been done by now, not only that, but ADO.NET providers would
have an easier time too (as they can just port code over for getschematable
as-is, as that already works with datatable).

sigh


Reply to this email directly or view it on GitHub
#5609 (comment).

@terrajobst
Copy link

@FransBouma @mgravell

I understand the frustration. However, try to put yourself in the shoes of the ADO.NET owners: DataTable and DataSet aren't free of issues and have a pretty large footprint. If a core API in the base layer has to take a dependency on it, there is little hope that these technologies can ever be considered legacy and thus would be hard to replace.

Also, from an AOT perspective it's probably beneficial to be able to omit a dependency on DataTable/DataSet.

@NickCraver
Copy link
Member

@terrajobst that's side-stepping the core issue here though - this API just doesn't work. You can't use it in a library without taking a dependency on every provider you'd ever support. Given that defeats the point of things here, we still have to figure out an API.

Can we hop on a hangout a figure out a solution much more quickly? The community is eager to help here, the delays are what's frustrating and honestly scary since RC2 is about to ship. I don't think outside of Microsoft we're fully aware of the type forwarding constraints and desires mentioned here - so we can't actually design a full solution due to a lack of known constraints.

Please, can we help? We're not complaining to complain - I'm trying to be constructive and willing to invest some hours for the good of everyone here.

@FransBouma
Copy link

I understand the frustration.

I honestly don't believe you for a second. The main reason I have for this is that you haven't expressed a single time that you actually understand our problems, because if you do you'd stop this madness and work more closely with us to get things sorted. This stupidity regarding DataTable is already in full swing since May 2015. Even a slow developer would have solved it by now 10 times.

But no-one at Microsoft really gives a sh.t, so at the last minute, already in RC time, things have to be added and sorted, with an API no-one wants and which results only in more work for us, outsiders. We have to live with this API for the rest of the life of this framework, and thus it has to be good. Now, as there's already an API, you either use that, or fix things properly. Not just this single aspect. If you keep the existing API, you (that is, Microsoft) has to do some work (port datatable/set over) and make a compromise: accept that there's something called DataTable and it's here to stay. Because that would offload the burden from us, your users, to you, the creator of the API/framework.

I simply can't match that with 'I understand your frustration'. At all, sorry.

However, try to put yourself in the shoes of the ADO.NET owners: DataTable and DataSet aren't free of issues and have a pretty large footprint. If a core API in the base layer has to take a dependency on it, there is little hope that these technologies can ever be considered legacy and thus would be hard to replace.

First of all, who are they, these ADO.NET owners? Is that @saurabh500 ? Who makes these decisions regarding ADO.NET? Including DbProviderFactory class but no design at all how to use it?

Secondly: why is there so much time wasted on getting rid of a datastructure which solves a myriad of problems for you? .NET Core isn't about re-implementing things properly this time, it's a porting effort, clear and simple.

Anyway, after all this time we're still where we started: no-where, in discussion with a software vendor who has no clue what we really need and above all, doesn't care that the problems they create have far reaching consequences for us. We have to deal with the consequences of your decisions, not you, and I don't think you fully grasp that.

@terrajobst
Copy link

@NickCraver

@terrajobst that's side-stepping the core issue here though - this API just doesn't work. You can't use it in a library without taking a dependency on every provider you'd ever support. Given that defeats the point of things here, we still have to figure out an API.

That's a fair point. I was simply responding to the "let's just add DataTable/DataSet and be done with it" proposal.

@FransBouma

I appreciate your technical insight and your patience in engaging with us on this issue and reviewing the design. But I don't appreciate the hostility that comes from your replies. This isn't the first thread where this happened. Keep in mind that there are caring humans on the other end. The hostility doesn't help either of us.

@FransBouma
Copy link

@terrajobst

I appreciate your technical insight and your patience in engaging with us on this issue and reviewing the design. But I don't appreciate the hostility that comes from your replies. This isn't the first thread where this happened. Keep in mind that that there are caring humans on the other end. The hostility doesn't help either of us.

I know, but let's be honest here: when I started the issue about where's DataTable/Set in core back in February 2015 (#1039), I didn't expect to be still debating the same issue in January 2016, would you? It's nothing personal towards any Microsoft individual, it's just frustration about the lack of serious commitment from Microsoft about this issue to get it solved. We're almost 11 months (!) further into it now and not much has happened: there's a serious lack of communication from Microsoft about what's going on (as in: sometimes weeks pass without any reply), and if a reply comes it's either ignoring feedback related to serious holes in a design, or, sorry to say it, a reply from someone from Microsoft telling us that we, paraphrasing here, 'just have to deal with it'.

It's a cycle, really :) After that correction from the (I guess higher upper) Microsoft employee, we'll see new work from another Microsoft employee after a few weeks, to which the community then responds and Microsoft again falls into silence, sometimes things happen, but if enough negative feedback is accumulated, again a senior Microsoft employee steps in and tells us to basically 'this is what it is'. Now, for you the worst that can happen is that this project might fail and you'll move on to another team. That's painful but you'll manage, your job is still there.

For me, things are a little different. If I can't port my product to .NET core, it might be it ends, which might be my company dies, which means I don't have a job. That's all doomsday scenario material, (and I don't expect it to get that far! ;) ) but just to illustrate the different future scenarios you and I face, which are completely different. This is also why I'm a little frustrated that there's little to no progress. As I've said earlier to you, I'm not a newbie developer, so seeing Microsoft dabble with this API for such a long time (almost a year!) while it would take us community perhaps a weekend, is a tad frustrating.

I'm dealing with ADO.NET since its inception, and having to work around issues is something one gets used to (as with all libraries). I had the idea that this time however we could make a head-start so known problems from the past could be avoided. That's also why I invested serious time in this. I care about the end result. Or better: I did. You might think you should lecture me on how 'software works' as you did in another thread, but you haven't learned an important aspect of communities: people who take the time to write a reply, any reply, care. People who don't care anymore, don't write replies and don't give you feedback.

I don't want to be frustrated in these kind of threads, as it doesn't get one anywhere and I agree with you, my tone hasn't been the best. Maybe it's the endless dealing with 'competing with a Microsoft default technology', I don't know. Fact is, I don't need this frustration any more than you do. So, yesterday I've decided to step away from this and other related issues and just see what the end result will be: if it sucks, so be it, if it is great: great. If Microsoft still needs my advice to make this API better, then they know where to find me.

@saurabh500
Copy link
Contributor Author

This pull request will not be merged. I will restart a new PR for the API discussion tracked in #5915

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
port-to-core Issue tracking port of .NET Framework code to .NET Core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants