Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option for migrations not to create foreign key constraints #21512

Closed
John0King opened this issue Jul 4, 2020 · 17 comments
Closed

Option for migrations not to create foreign key constraints #21512

John0King opened this issue Jul 4, 2020 · 17 comments

Comments

@John0King
Copy link

from https://docs.microsoft.com/en-us/ef/core/saving/cascade-delete#cascading-to-untracked-entities

If only the principal is loaded--for example, when a query is made for a blog without an Include(b => b.Posts) to also include posts--then SaveChanges will only generate SQL to delete the principal/parent:

as an Object Relational Mapping Framework, and we spend much time to create the model, I think EF should
care about the relation in Model instead of use something-else ( today use database's ability) to perform CascadeDelete

if parent entity is delete, then it's children should be delete to when configure to DeleteBehavior.Cascade , and this can be done by a simple DELETE FROM <Table> WHERE <ForeignKey> = @ParentPrimaryKey, and EF should scan the model to perform multiple level of CascadeDelete in correct order. and for DeleteBehavior.SetNull is the same

and in many big table (millions of data) , people often not use ForeignKey , instead they only use index, EF Core do not support this neither. (skip foreight key) and in this case , the pagination query is also has a problem when you use None-Nullable Type as Foreign Key ,(caused by skip the JOIN) , For example:

var query = db.Blog.Inlcude(b=>b.Posts);

var total = query.Count();  // SELECT * FROM [Blog]

var items = query.Skipt(PageSize * PageIndex).Take(PageSize).ToList()  
// SELECT * FROM [Blog] as b INNER JOIN [Post] as p on b.Id = p.BlogId  OFFSET  <Skipt>  ROWS FETCH <Take> ROWS ONLY

because without the inner join the total can be bigger than it should

@roji
Copy link
Member

roji commented Jul 5, 2020

Duplicate of #18960

@roji roji marked this as a duplicate of #18960 Jul 5, 2020
@roji
Copy link
Member

roji commented Jul 5, 2020

@John0King note that if you're using DeleteBehavior.Cascade (as opposed to DeleteBehavior.ClientCascade), all child entities should already be automatically deleted for you by the database. The issue is usually when people can't use DeleteBehavior.Cascade because of SQL Server limitations (the cycles or multiple cascade paths issue).

in many big table (millions of data) , people often not use ForeignKey , instead they only use index

Can you provide more detail on this, i.e. why someone would avoid defining a foreign key but still define an index? In any case, EF Core certainly needs to have a relationship defined between tables in order to do any cascading deletes, and that indeed requires a foreign key.

@John0King
Copy link
Author

@roji

if you're using DeleteBehavior.Cascade (as opposed to DeleteBehavior.ClientCascade), all child entities should already be automatically deleted for you by the database

that's not true, the database automatically delete is did by that you add FOREIGN KEY and ON DELETE CASCADE when you create the table , if not it doesn't work (EF Core will not force people to use the ef migration right ?)

Can you provide more detail on this, i.e. why someone would avoid defining a foreign key

stackoverflow https://stackoverflow.com/questions/599159/should-i-use-foreign-keys is the discussion

and this is the Alibaba 's Database Design Rule

https://developer.aliyun.com/article/709387

6.【强制】不得使用外键与级联,一切外键概念必须在应用层解决。 说明:以学生和成绩的关系为例,学生表中的 student_id 是主键,那么成绩表中的 student_id 则为外键。如果更新学生表中的 student_id,同时触发成绩表中的 student_id 更新,即为 级联更新。外键与级联更新适用于单机低并发,不适合分布式、高并发集群;级联更新是强阻 塞,存在数据库更新风暴的风险;外键影响数据库的插入速度。

translate :

  1. [mandatory] foreign key and cascade are not allowed. All foreign key concepts must be solved in the application layer. Note: take the relationship between students and grades as an example,the student_ id in the student table is the primary key, then the student_ id in the grade table is the foreign key. If you update the student_id in the student table, and the student_id in the grade table will be triggered at the same time eg. update is cascade update. Foreign key and cascade update are suitable for single machine with low concurrency, and are not suitable for distributed and high concurrency clusters; cascaded update is a strong blocking, and there is a risk of database update storm; foreign key affects the speed of database insertion.

EF is a business layer technology , it is an ORM, it should manage all the relationship between our mapped Object , but it current is a database design tool , it create the relation that make the database to managed it , EF should force on manage relation/ handle relation and translate sql more flexible, for example look at the community is doing

Task UpdateAsync<TEntity>(IQuerable<TEntity> condition, Expression<Func<TEntity,object>> updateExpression)
Task UpdateAsync<TEntity>(IQuerable<TEntity> condition, object updateConstantObject)

Task RemoveAsync<TEntity>(IQuerable<TEntity> condition);
Task RemoveAsync<TEntity>(Expression<Func<TEntity,bool>> condition);

// for changed entity  we may add a new parameter  "bool updateTrackedEntity = false" 
// and translate the  updateExpression and delete expression  again for  `db.ChangeTracker.Entries<TEntity>()`
```

@roji
Copy link
Member

roji commented Jul 7, 2020

@John0King I wasn't trying to say you're wrong in any way. For users which are OK with the database implementing the cascade delete (via foreign keys), the DeleteBehavior.Cascade does that - this seems to be the majority case, except for some limitations in SQL Server related to circularity.

For users who don't want to use DeleteBehavior.Cascade - either because of the SQL Server limitation, or because their database policy precludes foreign keys or database-side cascade - the DeleteBehavior.ClientCascade exists precisely to perform the cascade on the client, which is what you are asking for. However, at this point ClientCascade only deletes tracked entities, and #18960 tracks doing a bulk delete in order to cascade. Therefore this issue seems to be a duplicate of #18960, or have I misunderstood your request?

@John0King
Copy link
Author

@roji yes the main issue maybe a duplicate ,
and I mention 3 problem

  1. the CASCADE and options to ignore FOREIGN KEY on Table Create (EF Core Migration )
  2. because 1, the .Include().Count() must not reduce the INNER JOIN for Non-Nullable ForeignKey
  3. Introduce Bulk Update/Delete as non-tracked batch operation

@roji
Copy link
Member

roji commented Jul 7, 2020

  1. the CASCADE and options to ignore FOREIGN KEY on Table Create (EF Core Migration )

@John0King if I'm understanding you correctly, you want to use EF Core migrations, but don't want EF Core to create foreign constraints. If so, this isn't really related to to cascading - it's how EF Core generates migrations for relationships. Can you please confirm this is what you're looking for?

  1. because 1, the .Include().Count() must not reduce the INNER JOIN for Non-Nullable ForeignKey

Can you please provide more detail on what this means, including a code sample?

  1. Introduce Bulk Update/Delete as non-tracked batch operation

Already tracked by #18960 as discussed above.

@John0King
Copy link
Author

@roji

  1. 🤔, first , it is cascade delete problem, then with this behavior, we want the migration to do this for us too
  2. see .Include() .Count() not using inner join for count #8201
  3. it seems not the "independent" featuring request, but a feature used by cascade delete(if I understand correctly), I want the bulk update/delete to be a public method on DbContext

@roji
Copy link
Member

roji commented Jul 10, 2020

thinking, first , it is cascade delete problem, then with this behavior, we want the migration to do this for us too

I'm confused... What is it exactly that you'd like migrations to do for you? Above you mentioned you don't want foreign key constraints, which is what migrations introduce for relationships. Any sort of cascade solution based on bulk delete would be purely client-side, as EF would automatically send DELETE statements to cascade - no need for any migrations...

Can you please explain what it is you're looking for?

see #8201

I think @smitpatel addressed everything in that issue.

it seems not the "independent" featuring request, but a feature used by cascade delete(if I understand correctly), I want the bulk update/delete to be a public method on DbContext

General support for bulk delete/update is tracked by #795 - this would include a public, user-facing API for doing arbitrary bulk operations. The implicit use of bulk delete/update to implement client-managed cascading deletes is tracked by #18960.

Is there anything not already tracked by other issues that requires keeping this open?

@John0King
Copy link
Author

@roji
let me explain more:

entity:

public class Blog
{
public int Id{get;set;}

public List<Post> Posts {get;set;}
}

public class Post
{
public int Id{get;set;}
public int BlogId {get;set;}
public Blog Blog{get;set;}

}

we do not want Post to have a FOREIGN KEY in Database, but without a new fulent API like EntityBuilder<Post>().IgnoreForeightKey(x=>x.BlogId) or EntityBuilder<Post>().IgnoreForeightKeys(),
whendb.Database.CreateAsync() called , it will add FOREIGN KEY to the db too.


if this fluent-API is been noticed , then this issue can be close now

@roji
Copy link
Member

roji commented Jul 10, 2020

To be clear, what you want is for migrations to not create the constraint in the database, right (the foreign key column must still be created, otherwise there is no way to represent that the entities are related)? If this is what you're looking for, then as a workaround you can edit your migrations after adding them and manually remove the foreign key constraints.

I must say I have serious doubts about the advice to not use foreign key constraints for the general case. Sure, there may be some extreme edge cases where foreign keys may cause performance issues, and you may pay a very small price at insertion time for the checks, but that's likely to be negligible for all but very specific/high-perf applications. I would highly recommend using foreign key constraints unless careful benchmarking empirically shows them to be problematic in your specific case.

@roji roji changed the title Feature request: EFCore as an ORM should manage the relation for us Option for migrations not to create foreign key constraints Jul 10, 2020
@ajcvickers
Copy link
Contributor

Related: #13146

@smitpatel
Copy link
Contributor

For the query part of the issue,

  • The queries posted in this issue are not correct. .Include() .Count() not using inner join for count #8201 and any join optimizations are related to including a reference navigation. We don't generate any join while doing Count if Include is on collection navigation. It is simply incorrect result. And Count of parent does not require accounting for dependent (till required dependents are a thing)
  • If you want to write Include(a => a.Navigation) in your query then you must have foreign key constraint defined in EF Core model. You can edit migration or not use migration and not have it on database (as long as your data remain consistent as if the constraint existed). But if you don't want to create a FK in EF Core model then you cannot use navigations. You must manually generate joins in your query as you want.

@John0King
Copy link
Author

John0King commented Jul 10, 2020

@smitpatel
the FK is exists in ef model (application layer), and the .IgnoreForeignKey is just tell migrations to not create FK in db.

and join optimizations without database constraint will never be correct, if there no include,there will no john , perform a join optimizations will not be need, so remove the join optimizations seems a right direction

@smitpatel
Copy link
Contributor

@John0King - Your statements seem to be contradicting.

  • There is no join optimization. If you apply count on parent, we don't need to join to dependent table.
  • If you project out Parent with Include of children then EF Core need to generate join to get related data.
    That has nothing to do with where FK exist. It is simply matter of expanding the navigation property defined in EF Core model.

If your ask is that user will define FK in EF Core model but it does not exist in database then there is no query issue as above will continue to happen since model contains the FK. Regardless of constraint not defined in database, if EF Core has it then your data need to be consistent graph. There is just no enforcement of it. But it does not allow you to have a dependent without a parent.

If you have issue with join then make sure your FK definition in EF Core model is correct and consistent with data. If you don't want to have FK in your EF Core model then you can have data in database whichever way you want but you cannot use navigations in the queries. You must use join operator yourself to join tables.

@ajcvickers
Copy link
Contributor

We discussed this again and concluded that we don't want to allow normal (i.e. not the kind of constraint described in #13146) FK constraints to be excluded from migrations as a first-class option. Instead, the constraint can be removed from the migration code generated so that it is not created in the database. This will not change EF behavior with regard to the FK.

@micahosborne
Copy link

So sharding a table isn't a use case? Can't believe noone has mentioned this. While in code, it makes sense to see the database as if there is a foreign key. But from a database level having a foreign key can create as many problems as it solves when scaling a database (sharding). I personally believe that it is much better to handle deletes in code rather than leaving it to the database. It forces the developer to really think how to handle deletion in all cases. Saying this shouldn't be a first class option just blows my mind.

@amzhang
Copy link

amzhang commented Sep 25, 2022

Another use case is using partitioned tables like the one that TimescaleDB provide. Where their "hypertables" are tables that partition on the timestamp. These hypertables do not support foreign key constraints but work just fine using joins.

@ajcvickers ajcvickers reopened this Oct 16, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants