Batch split queries #10878

roji · 2018-02-05T21:34:36Z

In some scenarios, EF Core internally generates more than one read in order to execute a single query. This is notably the case when executing (some?) joins when using a non-MARS provider. Performance could be (greatly!) improved by batching those multiple reads into a single batch internally. Unless I'm mistaken, when multiple queries are generated, the results of the earlier queries have to be buffered anyway, so the buffering implied by batching adds no extra penalty.

Note: this issue isn't about exposing a batching read API to the user (that's what #10879 is about).

NOTE: batching becomes impossible if we decide to go in the direction of #12776 (fetching dependents by foreign keys instead of reevaluating the principal query).
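
As a minimal sketch of the consumption side of the batching described above: several statements sent in one command come back as several result sets on a single reader, which are drained one after another with `NextResult()`. This is not EF Core's implementation. `BatchReadSketch`, `ReadAllResultSets` and `Demo` are hypothetical names, and a `DataTableReader` over two in-memory tables stands in for the reader that a real provider would return from a batched command, so the sketch runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.Common;

// Hypothetical helper, for illustration only; not part of EF Core.
public static class BatchReadSketch
{
    // Drains every result set exposed by the reader, buffering the first
    // column of each row (assumed to be an int) in memory.
    public static List<List<int>> ReadAllResultSets(DbDataReader reader)
    {
        var resultSets = new List<List<int>>();
        do
        {
            var rows = new List<int>();
            while (reader.Read())
                rows.Add(reader.GetInt32(0));
            resultSets.Add(rows);
        } while (reader.NextResult()); // advance to the next statement's results
        return resultSets;
    }

    // Stand-in for a batched command: two in-memory tables play the role of
    // the principal and dependent result sets of a split query.
    public static List<List<int>> Demo()
    {
        var principals = new DataTable("principals");
        principals.Columns.Add("id", typeof(int));
        principals.Rows.Add(1);
        principals.Rows.Add(2);

        var dependents = new DataTable("dependents");
        dependents.Columns.Add("principal_id", typeof(int));
        dependents.Rows.Add(1);
        dependents.Rows.Add(1);
        dependents.Rows.Add(2);

        using var reader = new DataTableReader(new[] { principals, dependents });
        return ReadAllResultSets(reader);
    }
}
```

Calling `BatchReadSketch.Demo()` yields two buffered lists, one per result set. Each result set is fully read before the next one is touched, which matches the point above that the earlier results have to be buffered anyway.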

ajcvickers · 2018-02-08T01:44:47Z

Note: see also #10465, which is very similar but comes from a slightly different angle.

divega · 2018-02-08T07:38:16Z

See also #5952

roji · 2019-09-25T17:09:54Z

Unless I'm mistaken, this is no longer relevant as we generate one SQL query per LINQ query since 3.0. Closing.

roji · 2020-04-21T10:21:04Z

Reopening and assigning to @smitpatel at his request; this could be done as part of the workaround for #18022.

roji · 2020-05-01T20:33:19Z

I've checked what command batching does to isolation/consistency in SQL Server and PostgreSQL, and the results are negative: merely batching statements into the same DbCommand doesn't have any impact on isolation. In other words, different database state can be observed by different statements in the same DbCommand, if not within an explicit transaction.

It's still possible to have consistency by wrapping the batch in a serializable transaction (or snapshot in SQL Server) - see test code. We probably shouldn't do this implicitly, since it implies error or deadlocking behavior that doesn't occur otherwise.

Test code

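using System;
using System.Data;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient; // assumption: use System.Data.SqlClient instead if that is the driver in use
using Npgsql;
using NUnit.Framework;

[NonParallelizable]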
public class BatchTransactionIsolationTest
{
    const string PostgresConnectionString = ...;
    const string SqlServerConnectionString = ...;

    [Test]
    public async Task SqlServer([Values(true, false)] bool wrapInSerializableTx)
    {
        using var conn1 = new SqlConnection(SqlServerConnectionString);
        using var conn2 = new SqlConnection(SqlServerConnectionString);
        conn1.Open();
        conn2.Open();

        var createSql = @"
IF OBJECT_ID('dbo.data', 'U') IS NOT NULL 
DROP TABLE data; 
CREATE TABLE data (id INT);
INSERT INTO data (id) VALUES (1)";
        
        using (var createCmd = new SqlCommand(createSql, conn1))
            createCmd.ExecuteNonQuery();

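        // Despite the parameter name, SQL Server uses snapshot isolation here;
        // this requires ALLOW_SNAPSHOT_ISOLATION to be enabled on the database.
        var tx = wrapInSerializableTx
            ? conn1.BeginTransaction(IsolationLevel.Snapshot)
            : null;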

        using var batchCommand = new SqlCommand(@"
SELECT * FROM data;
WAITFOR DELAY '00:00:05';
SELECT * FROM data", conn1, tx);
        
        using var modifyCommand = new SqlCommand(@"
UPDATE data SET id=2 WHERE id=1; 
INSERT INTO data (id) VALUES (10)", conn2);

        var t = batchCommand.ExecuteReaderAsync();
        Thread.Sleep(1000);
        modifyCommand.ExecuteNonQuery();

        var reader = await t;
        Console.WriteLine("Before wait:");
        while (reader.Read())
            Console.WriteLine(reader.GetInt32(0));

        Assert.True(reader.NextResult());
        
        Console.WriteLine("After wait:");
        while (reader.Read())
            Console.WriteLine(reader.GetInt32(0));
    }
    
    [Test]
    public async Task Postgres([Values(true, false)]bool wrapInSerializableTx)
    {
        using var conn1 = new NpgsqlConnection(PostgresConnectionString);
        using var conn2 = new NpgsqlConnection(PostgresConnectionString);
        conn1.Open();
        conn2.Open();

        var createSql = @"
DROP TABLE IF EXISTS data;
CREATE TABLE data (id INT);
INSERT INTO data (id) VALUES (1)";
        
        using (var createCmd = new NpgsqlCommand(createSql, conn1))
            createCmd.ExecuteNonQuery();

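        // Npgsql runs every command on the connection inside its open transaction,
        // so the transaction object does not need to be passed to the command.
        if (wrapInSerializableTx)
            conn1.BeginTransaction(IsolationLevel.Serializable);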
        
        using var batchCommand = new NpgsqlCommand(@"
SELECT * FROM data;
SELECT pg_sleep(5);
SELECT * FROM data", conn1);
        
        using var modifyCommand = new NpgsqlCommand(@"
UPDATE data SET id=2 WHERE id=1; 
INSERT INTO data (id) VALUES (10)", conn2);

        var t = batchCommand.ExecuteReaderAsync();
        Thread.Sleep(1000);
        modifyCommand.ExecuteNonQuery();

        var reader = await t;
        Console.WriteLine("Before wait:");
        while (reader.Read())
            Console.WriteLine(reader.GetInt32(0));

        Assert.True(reader.NextResult());
        Assert.True(reader.NextResult());  // pg_sleep returns a void resultset
        
        Console.WriteLine("After wait:");
        while (reader.Read())
            Console.WriteLine(reader.GetInt32(0));
    }
}

ajcvickers · 2020-05-02T15:14:50Z

@roji Thanks for checking this. It's unfortunate, but I think it reinforces that we need to retain both collection-include behaviors for now.

roji · 2020-05-02T15:17:16Z

I completely agree. I also think it points towards retaining single-query as the default behavior, but I think we all agree we need some way to opt into split query.

roji · 2024-03-04T10:11:02Z

Note: batching becomes impossible if we decide to go in the direction of #12776 (fetching dependents by foreign keys instead of reevaluating the principal query).

roji mentioned this issue Feb 5, 2018

Execute multiple LINQ queries in a single round-trip (aka Expose batching read API to users) #10879

Open

ajcvickers added the type-enhancement label Feb 7, 2018

ajcvickers added this to the Backlog milestone Feb 7, 2018

ajcvickers added the area-perf label Feb 7, 2018

ajcvickers mentioned this issue Feb 8, 2018

Perf: provide a way to batch the generated commands for a complex query #10465

Closed

divega added the consider-for-current-release label Apr 17, 2018

ajcvickers removed the consider-for-current-release label May 21, 2018

divega added the size~4-weeks label May 22, 2018

ajcvickers removed the size~4-weeks label Nov 7, 2018

roji mentioned this issue Nov 21, 2018

Use single SQL query instead of split queries #12098

Closed

roji closed this as completed Sep 25, 2019

roji added the closed-not-needed label Sep 25, 2019

roji mentioned this issue Sep 25, 2019

20x slowdown in gigantic query after updating to EF Core 3 compared to 2.2 #18017

Closed

AndriySvyryd removed this from the Backlog milestone Nov 12, 2019

roji reopened this Apr 21, 2020

roji removed the closed-not-needed label Apr 21, 2020

roji assigned smitpatel Apr 21, 2020

ajcvickers added the consider-for-current-release label Apr 24, 2020

ajcvickers modified the milestones: 5.0.0, Backlog Apr 24, 2020

smitpatel removed the consider-for-current-release label Jul 28, 2020

AndriySvyryd added the area-query label Aug 13, 2020

smitpatel removed their assignment Aug 27, 2020

ajcvickers mentioned this issue Nov 5, 2020

ToQueryString on a split query only shows the first query #22080

Open

This was referenced Feb 27, 2021

Recycling relational and ADO.NET objects in query pipeline #24207

Merged

How to load a related entity after call AddAsync without making another roundtrip to the database #24297

Closed

roji mentioned this issue Mar 31, 2021

AsSplitQuery() extension should use multiple result sets to reduce database roundtrips to 1 #24550

Closed

roji changed the title ~~Batch multiple internally-generated reads~~ Batch split queries Oct 20, 2021

roji added the consider-for-next-release label Oct 20, 2021

roji mentioned this issue Oct 21, 2021

How can I measure a performance of queries executed with SplitQuery() #26388

Closed

ajcvickers removed the consider-for-next-release label Oct 22, 2021

roji mentioned this issue Feb 10, 2022

Idea for EF Core 7: Fetch aggregate entity instance and related tables in single database trip by combining AsSplitQuery and AsSQLBatch #27384

Closed

roji mentioned this issue Aug 18, 2022

Reduce repeating parameters #28768

Closed

roji added the consider-for-next-release label Sep 17, 2022

This was referenced Sep 20, 2022

piggyback split queries dotnet/EntityFramework.Docs#4045

Closed

Consider doing split query on a reference navigation if the other side is a collection #29174

Open

ajcvickers added consider-for-current-release and removed consider-for-next-release labels Oct 20, 2022

ajcvickers assigned AndriySvyryd Oct 21, 2022

roji mentioned this issue Dec 5, 2022

Multiple Include() warning: false positive for ThenInclude()? #29665

Open

roji mentioned this issue Feb 1, 2023

How to deal with array_agg(column1, column2) in Entity Framework and Postgres? npgsql/efcore.pg#2631

Closed

roji mentioned this issue Apr 5, 2023

Performance issue with split queries npgsql/efcore.pg#2713

Closed

roji mentioned this issue Feb 20, 2024

Multiple Result Sets for SplitQuery #33124

Closed

roji mentioned this issue Mar 4, 2024

Change split query implementation to fetch dependents by keys, without reevaluating principal query #12776

Open
