Why is DataTable/View/Set absent? #14302
@terrajobst When you have some free time perhaps you could give an overview about how we came up with what we think the surface area for .NET Core v1.0 is? |
Whatever may be the reason, having no real story for how to fetch tabular data using stored procs, which is today mostly done using DbDataAdapters storing data in DataSet/Tables, is likely going to stop a lot of people migrating their code to .NET Core for the simple reason that a lot of people have to work with stored procs and the tabular data they return every day. While it might be that a datareader opened on a proc call's resultset and materializing the rows into objects is 'easy', fact is that on databases which return cursors, this isn't that easy, while DbDataAdapters solve that problem today with a few lines of code. I do understand DataTable/View/Set and DbDataAdapter come with a fair set of interfaces you likely don't want to port now (not sure if they are, haven't checked), e.g. IBindingList, IListSource and the like, but IMHO it's a cornerstone for Data access on .NET, despite what some hip people think. |
I want to extend the question: I understand that .Net Core is not the full .Net Framework and some features are cut out. Nevertheless, many of us rely on some features and want to migrate to .Net Core. But there are two groups: The impossible ones (e.g. WebForms / WPF) and the possible and even reasonable packages like e.g. DataTable/View, XmlSchema, SerialPort, System.Activities, etc. Some may have technical constraints, not be hip, or legally restrained but to know the status and strategy of them would be a great help for understanding this new .Net. For these cut-out features will there be
I have not heard of a general strategy for these gaps, maybe I lost it somewhere. |
I ran into a use case today where I needed DataTable in order to return dynamic data to a service client. JSON.NET can easily serialize data tables for example. There are no easy solutions for materializing DYNAMIC data in .NET and DataTables make it easy when you don't know what the heck you're going to get back and you need to iterate over it. It's very useful for a number of things... I can see why DataSet/DataTable is falling out of favor. Personally I rarely use them, but for dynamic data scenarios they are often the best choice available in .NET. So I hope we get this at some point or else an alternative that makes it easier to get dynamic data that can actually be serialized. |
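For the dynamic-data scenario described above, here is a minimal sketch of what is possible with only the `DbDataReader` API: rows of unknown shape are read into dictionaries, which most serializers can handle. `DynamicRows.ReadAll` is a hypothetical helper name, not part of ADO.NET, and this covers none of DataTable's other features.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Common;

public static class DynamicRows
{
    // Hypothetical helper: reads every remaining row of the current result set
    // into a dictionary keyed by column name. DBNull values are mapped to null.
    // If two columns share a name, the later one overwrites the earlier one.
    public static List<Dictionary<string, object>> ReadAll(DbDataReader reader)
    {
        var rows = new List<Dictionary<string, object>>();
        while (reader.Read())
        {
            var row = new Dictionary<string, object>(
                reader.FieldCount, StringComparer.OrdinalIgnoreCase);
            for (int i = 0; i < reader.FieldCount; i++)
            {
                row[reader.GetName(i)] = reader.IsDBNull(i) ? null : reader.GetValue(i);
            }
            rows.Add(row);
        }
        return rows;
    }
}
```

A caller would pass the reader returned by `DbCommand.ExecuteReader()`. For a procedure that returns several result sets, the call can be repeated after each `reader.NextResult()`.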
There's another problem: DbConnection.GetSchema(). It's a way to obtain meta-data from the connected database and all ADO.NET providers have support for it, often with detailed schemas about metadata which can be used by data-access code to e.g. generate code on the fly for types or parameters. A lot of work has been put into this method by ADO.NET providers to make this method work on the database the provider is for: often they use detailed metadata queries so the user of the method doesn't have to. GetSchema() returns a DataTable, as it's unknown what the types are of the rows returned as that depends on the schema requested and the ado.net provider used. This method can't be used in .NET core if DataTable isn't there. |
We generally consider So why are these types missing then? Simply because we're focusing on non-legacy components first and so far |
@terrajobst Answered also my question. A list of "legacy" .Net Framework technologies would be awesome (maybe including a state like "never", "maybe", "later"). I need to understand, in which technologies I can trust. |
@terrajobst and what about the scenarios I brought up? You can state something is 'legacy' in your eyes but that doesn't necessarily mean use cases a lot of devs are faced with every day are simply 'gone' or have valuable alternatives (as there are no alternatives at this point). Sure, someone can port the code from referencesource and that might look OK, but it's not, as those types aren't the same types exposed and used in the ADO.NET providers. I don't have the feeling Microsoft really understands the seriousness of this problem. |
I wish there were an alternative to SqlDbReader GetSchema before DataTable, DataSet, etc. are declared legacy and discontinued. My point is that DataTable is needed for metadata access and for some development work right now. |
@FransBouma It is a serious problem for you, because you maintain database products that depend on DataTable functionality. I bet the number of people like you vs the rest of .NET developers is like a fraction of 1%. And a small one, at that. When rewriting a framework, some stuff gets left behind, especially anything in the edge cases. All of the cases you made are edge cases. You may spend most of your life in them, and many others might because they use your products. But they are not Microsoft's focus moving forward. DataTables are a 15-year-old technology, and there has to be a better way to do some of the things you described. There are reasons EF7 is only usable for simple next-gen ASP.NET apps, and some of the scenarios you mentioned are among those reasons. For anyone building things that are beyond simple apps, ASP.NET v6 on .NET 4.6 is going to be the way to go. DotNet Core will not be usable for many people in v1, and that is OK. Not every single feature can come along for the ride, otherwise if everything is legacy then nothing is. Frans, you can't tell me you couldn't come up with a better way to write DbConnection.GetSchema() that could work in your own apps for .NET Core if you really thought about it. Write it as an extension method on DbConnection and make it work with older versions of the framework by using your own objects. Can't be that hard. |
My tools don't depend on datatable in their core, they use datatables for meta-data retrieval in some situations (as metadata is only exposed through datatables in these situations)
Please don't make up numbers and pass them on as real. Almost every application calling stored procedures uses them, and of those applications there are many.
There's always a better way, sadly there's not one available at the moment.
I think you're mistaken. Plus it's beside the point.
Sometimes GetSchema() is necessary, but it's just an example. You also have GetSchemaTable on a datareader, which is used in a lot more cases (as not everyone reads metadata from a schema), as it tells you what the layout of the resultset is. Essential information for creating datastructures to store resultsets of unknown layout. And yes, that's not in less than 1% of the applications out there. Try most enterprise applications which have to talk to databases with many stored procedures for example. Good luck with writing solid server side code without datatable. Sure, like I said there's always a way, but the thing is: people will want to port code they already have to .NET core and e.g. run it on Linux. Now they can't do that. |
Out of curiosity, is there any public information you are using to decide what is important to port or not? Mono used to use Moma to scan and publish a database of "what people really need" to help drive the implementation. Is there any metric like that being used to help decisions, or is it based more on the direction the team want things to go? |
@FransBouma +1 for the answer. The 1% is totally unrealistic considering that EF showed up in 2011 (two thirds of .NET's lifetime came before that) and that DataTable was the dominant technology for data retrieval in all earlier examples. I would rather guess that 20% to 30% of all enterprise platforms have dependencies on DataTable (these numbers are random as well). @nvivo +1. I totally agree. It is a bit ... opaque currently (especially for an open source project which expects contributions). @terrajobst As stated above ... having a forecast of what might be in (DataTable, XmlSchema, etc.) and what definitely never will be (remember the endless WPF/WinForms/WebForms discussions) would really be helpful. |
Entity Framework came out of Object Spaces which puts it in the 2008-2009 timeframe. And the 1% I referred to was forward-looking development. How many of those 30% Enterprise projects aren't even on .NET 4.0 yet? I'd venture to say a lot. NETCORE is not concerned with people who will never migrate anyways. It is concerned with building the next generation of apps. Not maintaining the last one. The Frameworks that have already shipped can handle that just fine. |
@advancedrei .NET core on Linux could spark new interest in .NET from dev teams who now won't look at .NET as it's windows only but do have lots of databases with stored procs. There is a tremendous number of database-driven applications out there which have to use stored procs (I use stored procs as an example, as it's one of the use cases for a datatable as the resultsets are often of unknown layout: as you can't determine the layout .... without a datatable ;)) and they can't pick .NET if there's not a solid scenario available for their situation. There's little known about the ADO.NET story for .NET core, no code in that area is opened yet (to my knowledge) and what the provider model will look like is completely unknown. For server-side software that's an important point, and it now looks like v1.0 of .NET core is completely useless which IMHO wastes a tremendous opportunity to win people back to .NET. But alas... I'm starting to repeat myself. :) |
I'm sorry. I know it sounded like I was questioning the decision, but I was not. =) I was genuinely curious how the process of selecting the API was/is done. I have been using pure ADO.NET for years without EF, and I have not used DataTable for anything new in ages. I think this can be provided later by a package, but doesn't need to be in the core right now. We need to remember .NET 4.6 will still be released and all this stuff will be there. |
I'm with others here who point out that we should be looking forward not backward, but... I think we need to be careful here that we don't kill functionality that may be crucial to building applications. The two things that have jumped out at me are lack of a good story for creating new providers so that third parties have an easy time to create new providers. This (or some other mechanism like standardized DI) is a MUST and currently not supported/documented. Right now you're stuck with SQL Server and third parties can build their own providers but have no effective way to swap providers at runtime (maybe there's something there that we can't see but this was what the provider model was for previously). Personally I don't see DataTable/DataSet as needed tech, but the features it provides - an easy way to retrieve list based dynamic query DB data is a solid use case for building solutions on top of raw ADO.NET and for third parties that build on top of it. I think something is needed to provide this functionality out of box. It could be as simple as a dictionary, but it should be there natively IMHO because it is a common scenario if you use ADO.NET raw. So to me the story isn't about backwards compatibility but making sure the use cases that are common are addressed. If that means new interfaces that's fine, as long as some fundamental scenarios that third party tools and frameworks can build on are addressed in some way. Honestly I think this is all of utmost importance. After all most applications use data. And the story for data in vNext as presented by Microsoft so far looks like a bungling mess. I'm not sure if it actually is or not, but the message has been muddled and confused. From the outside it looks like EF with SQL Server is the only viable solution that's in box and that really sucks if that's how it ends up. Especially in light of .NET Core running on other platforms where SQL Server usage is not very likely. 
If this is just a messaging thing - then that needs to be corrected with some official commentary. If it's a feature thing then I think there should be discussion on what needs to be there. |
I agree with @RickStrahl that EF + SqlServer is bad, and agree with @FransBouma that there is no way to do some things like just selecting untyped data without dealing with DataReader. But I don't see anything else filling this gap DataTables left. What is really to be missed? All the select + order by is done much better by linq directly on objects. For other things, there are alternatives to EF. SimpleData is a famous one, Dapper from StackExchange is also very handy, and even I maintain a pretty useful one =). I bet new alternatives will appear even more with .NET Core, and eventually a winner will come up. But we need to give it some time. |
I agree with @RickStrahl here. It's the concept / featureset provided by Datatable and friends that's essential, and if I may add, essential for v1.0 to be there. Datatable is an obvious choice to provide that featureset as a truckload of code is already written out there supporting it, but it's not essential to have it as the type to provide that featureset, however why write something else to replace something you already have? |
Which part exactly do you miss from DataTable? It seems the only thing that you miss is a type that represents "an array of object arrays" for untyped data, but that is provided out of the box. Is that it or am I missing something? |
As a note, I saw the comments about the GetSchema. The point is that those things can be provided as separate modules, they are not exactly required by anything other than supporting the IDE around DataTable/Set/View stuff. And even this could be exposed today as real objects instead of datatables. If all that is needed is a way to retrieve untyped tabular data, we could strip away 90% of DataTable and have a single new type that has no change detection, no select capabilities, etc. It should only hold an array of "rows", so "an array of object arrays". |
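The stripped-down container proposed above could look roughly like this. It is a sketch under assumptions: `ResultTable` is a hypothetical type, not an existing API, and it deliberately has no views, change tracking, or typed column storage. It keeps only column names, the column types reported by the reader, and the rows as object arrays.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Common;

// Hypothetical minimal container for untyped tabular data.
public sealed class ResultTable
{
    public IReadOnlyList<string> ColumnNames { get; }
    public IReadOnlyList<Type> ColumnTypes { get; }
    public IReadOnlyList<object[]> Rows { get; }

    private ResultTable(string[] names, Type[] types, List<object[]> rows)
    {
        ColumnNames = names;
        ColumnTypes = types;
        Rows = rows;
    }

    // Captures the layout of the current result set, then buffers all its rows.
    public static ResultTable Load(DbDataReader reader)
    {
        int count = reader.FieldCount;
        var names = new string[count];
        var types = new Type[count];
        for (int i = 0; i < count; i++)
        {
            names[i] = reader.GetName(i);
            types[i] = reader.GetFieldType(i);
        }

        var rows = new List<object[]>();
        while (reader.Read())
        {
            var values = new object[count];
            reader.GetValues(values); // database NULLs arrive as DBNull.Value
            rows.Add(values);
        }
        return new ResultTable(names, types, rows);
    }
}
```

The names and types stored next to the rows are what lets a consumer tell which element is which. Whether that is enough to replace DataTable is exactly the point disputed in the comments that follow.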
I agree with @nvivo, a DataTable is just an array of object arrays. Not sure why the functionality can't be replicated with some custom code and without the "sky is falling" mentality. |
DataTable isn't a construct with arrays of object arrays, it's a construct with typed columns (data isn't stored internally in object arrays, but vertically in typed columns), which has the ability to define custom views on top of it you can sort/filter without touching the original datatable, which can be filled with a DbDataAdapter which can create the typed columns for you during fetch based on metadata obtained from the resultset, which is also stored in a datatable. Like I said, if something is replacing it with equal functionality, fine, but that's not in the cards. This means that ado.net providers can't provide the same functionality they do today. GetSchema is one thing, but as I said, DbDataReader.GetSchemaTable is another, and that one IS used a lot by user code (directly or indirectly through data-access code or dbdataadapters). @advancedrei please stay on point with this discussion. I understand you don't give a shit about datatable and friends and that's fine, but just because it's irrelevant to you doesn't mean it's irrelevant to many thousands of developers out there. |
I'm obviously not saying how DataTable is implemented, I'm saying this is the only part that is missing.
Don't get me wrong here. I have a lot of code that depends on GetSchemaTable and DataTables in general out there. I have data migration tools that automatically compare and move data between different providers using this functionality. But that doesn't matter because we are not discussing removing this functionality from .NET. We are discussing if this functionality should be added to a new framework that has a different target. You say that you want "equal" functionality, but most of what DataTable/View/Set provides can be achieved with POCO and LINQ in a much cleaner way. You can create views, group by, count, select different rows and columns, etc. And in most cases, that will be achieved with LINQ using less memory. I believe the only way DataTable should be added to CoreFx would be with a simpler design that integrates better with LINQ and removes things like .Select, OrderBy, Views, etc. It should be just a container for tabular data, nothing more. As someone that works a lot with DbProviderFactory, I liked how the design went. Providers are clean and do only what they need to allow database access using SQL. Everything else can be built on top of it later, and that's how it should be. Version 1.0 is not even out yet, and, as @advancedrei already said, this version won't have a lot of stuff for a lot of people. But it's better to start clean and add pieces as it goes than start bloated. |
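To illustrate the "POCO and LINQ" claim above, here is a small self-contained sketch. The `Order` class and its sample values are made up for illustration. It shows a filtered, sorted "view" and a group-by with counts over plain objects, both leaving the underlying list untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical POCO used only for this illustration.
public sealed class Order
{
    public string Customer { get; set; }
    public decimal Total { get; set; }
}

public static class LinqViews
{
    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { Customer = "Contoso", Total = 250m },
            new Order { Customer = "Fabrikam", Total = 40m },
            new Order { Customer = "Contoso", Total = 120m },
        };

        // A filtered and sorted "view": evaluated lazily, the list is not modified.
        IEnumerable<Order> bigOrders = orders
            .Where(o => o.Total >= 100m)
            .OrderByDescending(o => o.Total);

        // Group by customer with a row count and a sum per group.
        var perCustomer = orders
            .GroupBy(o => o.Customer)
            .Select(g => new { Customer = g.Key, Count = g.Count(), Sum = g.Sum(o => o.Total) });

        foreach (var order in bigOrders)
            Console.WriteLine($"{order.Customer}: {order.Total}");
        foreach (var group in perCustomer)
            Console.WriteLine($"{group.Customer}: {group.Count} orders, {group.Sum} total");
    }
}
```

This only works once a typed class exists for the rows. It does not address result sets whose layout is unknown at compile time, which is the objection raised in the next comment.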
You can't create POCOs if you don't have meta-data of the fields of the resultset. You can't determine the schema of a resultset of a stored procedure unless GetSchemaTable() is implemented. Again, if that's implemented but it gives me a typed object, no problem. However if it returns an array of object arrays, what am I going to do with that? What element is what and of what type? I'm not talking about the POCO/typed element route which is created from metadata obtained in some way, I'm talking about calling a stored procedure which returns a resultset and you now have to do something with that resultset. Anyway I have given enough examples, @RickStrahl has given additional examples and background of why this feature is very important. I simply don't get why one would want to be limited by excluding functionality which is already implemented and has worked for over a decade. |
Everything you said is perfectly achievable if you change your mental model.
Why is that? Any database nowadays implements INFORMATION_SCHEMA to provide this kind of information in a standard way. If you need something very specific, you can just run any query against the database.
So, you are saying that if you ignore all the other ways to do it, there is no way to do it? DataTable is a huge implementation that is intended to do a lot of things that are not that common anymore. I have been building commercial apps for ages with plain ADO.NET and didn't use DataTables for a long time, I see them as legacy as well. Most frameworks and controls today focus on objects with observables and other ways to track changes. I do believe it has some features that are still valuable and have no replacement, but we need to find what those features are and try to come up with an alternative for those only. Trying to bring this huge implementation full of duplicated stuff that is easily achievable with LINQ is not a good option. |
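The INFORMATION_SCHEMA suggestion above can be sketched as follows. It assumes a database that exposes `INFORMATION_SCHEMA.COLUMNS` (SQL Server, PostgreSQL and MySQL do; Oracle and SQLite do not) and a provider that accepts `@name`-style parameter markers. `ColumnInfo` and `SchemaQueries` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Data.Common;

// Hypothetical plain object replacing a DataTable row of column metadata.
public sealed class ColumnInfo
{
    public string Name { get; set; }
    public string DataType { get; set; }
    public bool IsNullable { get; set; }
}

public static class SchemaQueries
{
    // Lists the columns of a table by querying INFORMATION_SCHEMA directly.
    // The connection is expected to be open already.
    public static List<ColumnInfo> GetColumns(DbConnection connection, string tableName)
    {
        using (DbCommand command = connection.CreateCommand())
        {
            command.CommandText =
                "SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE " +
                "FROM INFORMATION_SCHEMA.COLUMNS " +
                "WHERE TABLE_NAME = @tableName " +
                "ORDER BY ORDINAL_POSITION";

            // Assumption: the provider understands the "@tableName" marker.
            DbParameter parameter = command.CreateParameter();
            parameter.ParameterName = "@tableName";
            parameter.Value = tableName;
            command.Parameters.Add(parameter);

            var columns = new List<ColumnInfo>();
            using (DbDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    columns.Add(new ColumnInfo
                    {
                        Name = reader.GetString(0),
                        DataType = reader.GetString(1),
                        IsNullable = reader.GetString(2) == "YES" // standard values are YES / NO
                    });
                }
            }
            return columns;
        }
    }
}
```

This describes tables and views only. It does not give the layout of a stored procedure's result set, which is the case the GetSchemaTable discussion is about.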
It's not for me, it's for all those devs out there who can't proceed to .NET core because someone thought 'Datatable' is legacy while not considering the consequences of that decision.
The DbConnection.GetSchema() call is an example, not the use case. There are many more examples to give where DataTables are used today and are very useful. I have given a couple, others have given other examples.
Common in your world != common in everyone's world. I write ORMs for a living, I know DataTables are not the only way to do data-access. I also know that everything that even looks like a stored procedure is automatically useless if datatables are not in the framework anymore.
Why come up with alternatives for code that has already been proven to work for over a decade and which is used by a lot of people, which is supported today in ADO.NET providers (so when these are ported to .NET core, this code will work automatically).
You're not focusing on the real problem. I'm not talking about creating a projection of a resultset to a typed object, I'm talking about being able to store untyped resultsets in a datastructure which makes it possible to use that untyped resultset without knowing what the layout is (as e.g. that might differ based on input, some stored procedures are written that way). Anyway, I've done my best. If MS wants to shoot themselves in the foot by limiting the Data-access story of their .NET core framework to 'whatever EF7 can do', so be it. |
After reading through this thread I tried to find out what of ADO.NET is being brought over to CoreFX (or at least would be supported on top of it) and I couldn't really find anything. Are connections, commands and readers also getting the axe (or at least deferred until after the first release)? Doesn't EF7 depend on these things to function? A server core framework for web applications that doesn't support DB access doesn't sound very useful to me. I can't honestly say that I've used DataTables and their kin recently but I do use data readers frequently when I need both firehose performance and/or need to avoid the inherent limitations of ORMs. I also have a metric ton of extension methods around DbConnection, DbCommand and DbDataReader to vastly simplify their use which I'd love to propose for potential inclusion/discussion. Update: My apologies, apparently I can't read. I do see the progress page which includes the types in |
@HaloFour, check for System.Data.Common under https://github.com/dotnet/corefx-progress/tree/master/src-diff/README.md. Specifically, check the definition of DbProviderFactory that defines what a Data Provider is. Basically it was reduced to connection, command, transaction and datareader. It seems all that is needed to me. |
Mono? Binding Bits? Look! Here is the code for DataView. It's all there. |
Which (I think?) brings us back to my question of how can MSFT only support Core RTM apps on Azure when not everything has been ported? |
Of course it's doable. Most people here only need certain bits and pieces to get their code compiling. Why doesn't someone who's got some time get their bit compiling and then stick it up in a repo somewhere? Then, others can contribute as they get their bits compiling. This is better than sitting around and complaining.
Some will compile, some won't. In the majority of cases it will be a matter of finding the equivalent API call in .NET Core.
It's exposed. The method is there in the public repo: virtual public DataTable GetSchemaTable() That method needs to be overridden by the inheriting class. |
@MelbourneDeveloper I'd suggest you read this whole thread, you're trying to rehash things which have already been discussed at length here and in other threads. The TL;DR: it's not as simple as you seem to think, not by far. Don't you think we would otherwise already have implemented a workaround? |
@MelbourneDeveloper, @redowl3 DataSet/DataTable/DataRow implementations are over-complicated and it is good that they are not ported to .NET Core "as is". The structure is too heavy; it implements too many functions and overall this "monolith" component is not the best choice in every concrete case. Instead of just asking for the old DataTable in .NET Core, could anyone list the functionality/usage scenarios that are still needed and not covered by existing .NET Core-compatible libraries? |
Oh, no, please, not again. |
@MelbourneDeveloper |
@VitaliyMF please read this entire thread (as painful as that might be). Note that since this was debated quite a bit of time has gone by, so here's a quick summary of the current state of affairs as I see it. .NET Core RTM has been released and it does include an alternate resultset metadata API which does not rely on DataTable/DataSet (#5915), addressing one of the main concerns originally raised. A database/table metadata API (as opposed to a resultset metadata API) is still missing (#5024), but it could be argued that users that really need this can query schema information manually (e.g. INFORMATION_SCHEMA or some database-specific structures), even if that's not database-independent and generally sucks. IMHO this means that at this point there doesn't seem to be any missing functionality depending on DataTable/DataSet - you can do (almost?) anything without DataTable/DataSet (e.g. access dynamic results via DbDataReader APIs). So the remaining reason for DataTable/DataSet is for porting across code that already uses them, or a need for an untyped data access API that also includes the DataTable/DataSet extras (e.g. optimistic concurrency). These are of course valid requests/requirements. Note that Microsoft seems to have also shifted their general strategy, and intend to make .NET Core more backwards-compatible with .NET Framework, to help porting (see this article). This means that DataTable/DataSet may come back, or they may not. Either way, IMHO it's great to have them as a totally separate package without any core aspects of ADO.NET depending on them (i.e. metadata). |
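As a rough illustration of a DataTable-free resultset metadata API, the sketch below assumes it is the `GetColumnSchema()` extension on `DbDataReader` (backed by `IDbColumnSchemaGenerator`, which the reply below mentions). `ResultSetSchema.Describe` is a hypothetical helper, and the fallback branch only reports what every reader exposes.

```csharp
using System.Collections.Generic;
using System.Data.Common;

public static class ResultSetSchema
{
    // Hypothetical helper: describes the columns of the current result set
    // without going through GetSchemaTable() and DataTable.
    public static List<string> Describe(DbDataReader reader)
    {
        var lines = new List<string>();

        if (reader.CanGetColumnSchema())
        {
            // Rich metadata: one DbColumn per column of the result set.
            foreach (DbColumn column in reader.GetColumnSchema())
            {
                lines.Add($"{column.ColumnOrdinal}: {column.ColumnName} " +
                          $"({column.DataType?.Name ?? column.DataTypeName}), " +
                          $"nullable: {column.AllowDBNull}");
            }
        }
        else
        {
            // Fallback for providers without column schema support:
            // only name and CLR type are available.
            for (int i = 0; i < reader.FieldCount; i++)
            {
                lines.Add($"{i}: {reader.GetName(i)} ({reader.GetFieldType(i).Name})");
            }
        }
        return lines;
    }
}
```

How much of `DbColumn` is populated depends on the provider, so properties such as `AllowDBNull` may be null.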
@roji Thank you for the detailed explanation. As far as I know, IDbColumnSchemaGenerator was introduced specifically to cover the schema aspect (not very helpful in practice - for example, Microsoft.Data.Sqlite.SqliteDataReader doesn't implement this interface). Do I understand correctly that the only reason to have DataTable in .NET Core is backward compatibility?.. Well, not everything can be easily ported to .NET Core 1.0 and a lot of other incompatibilities will prevent old .NET projects from migrating anyway (you don't say "System.Web" and WebForms should come back, right?). But new .NET Core projects will not use/depend on the old DataTable! Regarding an untyped data access API: when needed, it may be provided by 3rd party libraries (I already mentioned my NReco.Data lib that implements command builder/data adapter and RecordSet with an API similar to DataTable/DataRow). For me, it is still unclear why the old DataTable should come back into the NETStandard ADO.NET API. |
I thought this thread was closed :) Could someone from Microsoft simply clarify if it's on the roadmap or not? |
Are you sure about that? It's documented on SqliteDataReader. The Npgsql provider also implements it.
Well, it's true that originally Microsoft seemed to want to remove legacy/unwanted APIs in .NET Core. But at some point (again, see this blog) the decision seems to have been made to really prioritize making .NET Core as compatible as possible with .NET Framework, to help people port. Whatever we think of DataTable/DataSet, there are definitely tons of programs out there relying on it. Especially with .NET Core being broken down into nugets, there doesn't seem to be any harm in making DataTable/DataSet available in .NET Core, as a separate package (as long as it isn't needed for any core operation, e.g. metadata). I'm guessing it might be somewhat low on the internal priority list though, with everything else going on. |
I already did some work around this in Silverlight by simply looking at the metadata of the System.Data namespace in Visual Studio (ILDasm like functionality) - not with any use of ILSpy, or with any code from the reference source. It's not as hard as people make out. Here's an example of production code Fill method for a DataAdapter:
Originally, @benaadams mentioned that this code was using the MIT license
But, after looking at the license agreement, it only allows us to use the code as a "reference": The source is actually released under the MS-RSL license. Which according to Wikipedia "is the most restrictive of the Microsoft Shared Source licenses" (https://en.wikipedia.org/wiki/Shared_source#Microsoft_Reference_Source_License_.28Ms-RSL.29). Are we able to get clarification on this? Are we allowed to take the System.Data code, slightly modify it for the purpose of compilation and use on .NET Core, and then stick that code up in a GitHub repo? The code may, or may not be used by Microsoft at a later point as a basis for building more functionality in to the System.Data namespace in .NET Core. Just to clarify, I have no intention of breaking the license agreements put forward by Microsoft, or breaking the spirit of the agreements. I am merely raising this as a point that if Microsoft were to allow us to do it, we could recreate this library ourselves. If 10 programmers spent a small amount of time working on just the area of the functionality they need, and those areas were different, we'd be able to knock this over in a very short time span. But, if Microsoft doesn't want us to do this, then I withdraw any comments about attempting this, and will be happy to wait for the team to go ahead with this. |
This is what I have done in the past, but it's not ideal. You will need to write a layer over the top if you are implementing for multiple database platforms.
Exactly. And, why not let the community build it? |
On a basic level you are right. These classes are bloated and unnecessary. People should rewrite their code so as to avoid them. That doesn't change the fact that legacy code does exist, and in order to embrace the new .NET Core technology, legacy code must be made to compile and work with the new technology. With the flick of a switch, large swathes of legacy code could be made to compile, and since this library does not need to be deployed with every instance of the .NET Core architecture, it's a moot point to say that it's too heavy. Any library could be too heavy. But, just taking some existing code and compiling it for .NET Core won't slow .NET Core down. Every developer still needs to make conscious decisions about performance no matter what. |
@MelbourneDeveloper The source code at http://referencesource.microsoft.com is released under the MS-RSL license. But the code at https://github.com/Microsoft/referencesource/ is released under the MIT license. |
Boom! Off we go! |
Can anyone suggest the recommended approach for passing data (a user-defined table type or parameters) to a stored procedure in ASP.NET Core when using ADO.NET? |
@bharatmangal04 Are you referring to SQL CLR UDTs on Sql Server? |
Tracking.... need DataTable/DataSet badly :( |
So it is in daily build. How can I use it?
So does it mean I can use DataTable/DataSet?
So should I be able to just use them?! |
Lack of DataTable blocks pretty much anyone using EnterpriseLibrary.Data UpdateDataSet() for higher performance batch insertions, unless the requirement is to rewrite the entire DAL to build the SQL statements manually. We don't want to be putting DataReader dependencies in services, either. This also prevents using TableValuedParameters. EF has never been performant enough, and is too easy to misuse. |
Hi Roji, since you are a member of Microsoft .NET data team now, do you have the answer about "somewhat low on the internal priority list"? I'm also curious about this. |
I looked here: https://github.com/dotnet/corefx-progress and saw DataTable/View/Set isn't ported to .NET Core. IMHO this is a mistake, as DataTable and friends are a convenient way to handle untyped data at runtime and are often used when fetching data through stored procedures, since it's easy to fetch a dataset with the result of a proc which returns 3 cursors on e.g. Oracle.
I fail to see why this cornerstone of many people's data-access code is absent in a framework that targets servers.