Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ability to execute multiple SQL statements at once. #4

Closed
wants to merge 1 commit into from

Conversation

peter-ss
Copy link
Contributor

Added the ability to execute multiple SQL statements in a single sqlite call. The existing UpdateStatement did not allow this, because the underlying call to sqlite3_prepare and sqlite3_step only handle single SQL statements.

Being able to run multiple statements at once is very useful. For example, several of my migrations have many SQL statements, and I do not want to register each statement as a migration.

I would be happy to write up an example use case for the framework documentation. Just let me know.

This is done by the new MultipleStatement class, which calls
sqlite3_exec under the hood.
@groue
Copy link
Owner

groue commented Aug 17, 2015

Hello @peter-ss,

Thanks for the PR, because I now understand your need.

Yet, when your write:

For example, several of my migrations have many SQL statements, and I do not want to register each statement as a migration

An answer could be to simply run several statements in your migration blocks:

// Database V1
migrator.registerMigration("createTables") { db in
    try db.execute("CREATE TABLE table1 (...)")
    try db.execute("CREATE TABLE table2 (...)")
    try db.execute("CREATE TABLE table3 (...)")
}

OK, now the gist of your commit.

I can't accept it as is it, because the MultipleStatement class you introduce is not a valid subclass of its ancestor Statement. It inherits properties from its super class which it does not support:

  • public internal(set) var arguments: QueryArguments? (which your class does not support/use).
  • let sqliteStatement = SQLiteStatement() (initialized in init with sqlite3_prepare_v2, despite you will to get rid of this function call).

Do you think you could fix this?

groue added a commit that referenced this pull request Aug 17, 2015
… contains multiple SQL statements. Test that trailing semicolon is both accepted and optional.
@groue
Copy link
Owner

groue commented Aug 17, 2015

Hold on @peter-ss. You made me come up with a better idea.

I won't expose yet any "reusable" object since 1. it is not a requested feature, and 2. users who know well sqlite may feel dizzy in front of such an unexpected feature: they would raise an eyebrow if the documentation would talk about "Prepared Multiple Statements", and this is how you lose user's confidence.

Instead, I'll have Database.execute() accept both single and multiple SQL statements. Arguments will just not be supported for multiple statements. I think it's a simpler option. Both simple (API does not grow), and discoverable by power users.

@peter-ss
Copy link
Contributor Author

Hi there,

I agree with your assessment of MultipleStatement. I also agree that it
would be handy just to have an extra function on Database that just
executes the SQL without requiring an entire class. Are you thinking that
we should just move the code from MultipleStatement into a function in
Database (one that's called from execute)? How will we determine if the SQL
passed to execute is a single statement or multiple statements?

I'd be happy to adjust the code however you think best.

Peter

On Mon, Aug 17, 2015 at 7:45 AM, Gwendal Roué [email protected]
wrote:

Hold on @peter-ss https://github.com/peter-ss. You made me come up with
a better idea.

I won't expose yet any "reusable" object since 1. it is not a requested
feature, and 2. users who know well sqlite may feel dizzy in front of such
an unexpected feature: they would raise an eyebrow if the documentation
would talk about "Prepared Multiple Statements", and this is how you lose
user's confidence.

Instead, I'll have Database.execute() accept both single and multiple SQL
statements. Arguments will just not be supported for multiple statements. I
think it's a simpler option. Both simple (API does not grow), and
discoverable by power users.


Reply to this email directly or view it on GitHub
#4 (comment).

groue added a commit that referenced this pull request Aug 17, 2015
groue added a commit that referenced this pull request Aug 17, 2015
groue added a commit that referenced this pull request Aug 17, 2015
…ent and no argument successfully returns the inserted Row ID.
@groue
Copy link
Owner

groue commented Aug 17, 2015

Hi again @peter-ss. I did just add a few commits:

  • One that makes sure UpdateStatement crashes if provided a multiple-statements SQL string. Before this commit, it would ignore statements after the first semi-colon.
  • Some failing tests.

Could you have a look at the failing tests (1, 2 and 3), and tell me if they look sensible to you?

If so, I can let you make them pass, if you like. Tell me: I don't want to short-circuit you.

groue added a commit that referenced this pull request Aug 17, 2015
@groue
Copy link
Owner

groue commented Aug 17, 2015

How will we determine if the SQL passed to execute is a single statement or multiple statements?

Those lines can tell you if a SQL string contains a single or several statements (precisely, the if consumedCharactersCount != sqlCodeUnits.count test). I guess this code would need to be refactored out. We can not just rely on the absence of arguments to trigger sqlite3_exec, and this is explicitly tested here.

@peter-ss
Copy link
Contributor Author

The tests look reasonable, and well thought-out.

So should I build the code that determines if we have a multiple statement
into the Database.execute call? Then if it finds a multiple statement, it
will call a function in Database that performs the sqlite3_exec. We don't
want to add the multiple statement handling into UpdateStatement, correct?

On Mon, Aug 17, 2015 at 8:06 AM, Gwendal Roué [email protected]
wrote:

Hi again @peter-ss https://github.com/peter-ss. I did just add a few
commits:

  • One that makes sure UpdateStatement crashes if provided a
    multiple-statements SQL string. Before this commit, it would ignore
    statements after the first semi-colon.
  • Some failing tests.

Could you have a look at the failing tests (1
244629c,
2
acce232
and 3
a83aad9),
and tell me if they look sensible to you?

If so, I can let you make them pass, if you like. Tell me: I don't want to
short-circuit you.


Reply to this email directly or view it on GitHub
#4 (comment).

@groue
Copy link
Owner

groue commented Aug 17, 2015

@peter-ss You're right, we don't want to add the multiple statement handling in UpdateStatement, which remains focused on single, actually reusable, statements.

So:

  1. The UpdateStatement initializer needs to crash if given multiple statements (done in f2c4af9).
  2. Database.execute must distinguish between single and multiple statements (and violently crash with a nice apology if given multiple statements and arguments).
  3. We don't want the code that distinguishes between single and multiple statements to be duplicated.

This should give us kind of a roadmap, what do you think?

@peter-ss
Copy link
Contributor Author

Yes - sounds good. Let me think on it a bit. I will create a local branch
here and send you a PR to look at when I have something. It will probably
be a few hours - I've got some other job-related things to work on as well.
:)

On Mon, Aug 17, 2015 at 8:41 AM, Gwendal Roué [email protected]
wrote:

@peter-ss https://github.com/peter-ss You're right, we don't want to
add the multiple statement handling in UpdateStatement, which remains
focused on single, actually reusable, statements.

So:

  1. The UpdateStatement initializer needs to crash if given multiple
    statements (done in #f2c4af9).
  2. Database.execute must distinguish between single and multiple
    statements (and violently crash with a nice apology if given multiple
    statements and arguments).
  3. We don't want the code that distinguishes between single and
    multiple statements to be duplicated.

This should give us kind of a roadmap, what do you think?


Reply to this email directly or view it on GitHub
#4 (comment).

@groue
Copy link
Owner

groue commented Aug 17, 2015

Take your time, Peter 😄

groue added a commit that referenced this pull request Aug 17, 2015
…levant if the executed statement is not an INSERT statement).
@peter-ss
Copy link
Contributor Author

Ah - one additional wrinkle: the execute() method returns an
UpdateStatement.Changes struct. What should we do in the case of multiple
statements? How would we have it conform to the existing method signature?

Also, it occurs to me that if we refactor the logic that determines if we
have multiple statements so that it's part of Database, we would be
compiling the SQL when we execute sqlite3_prepare_v2 as part of the
determination. If we find that we DO have a single statement, we would not
want to run prepare on the SQL again (for performance reasons), so we would
want some way to pass UpdateStatement() an already-prepared
SQLiteStatement, which would force a change to the init() signature for
Statement and UpdateStatement. If we find that we DO NOT have a single
statement, then we throw away the results of the call to prepare(), since
we can't use them for multiple statement execution, which is again a
performance hit.

I'm wondering if perhaps adding a second method to Database is a better way
to go. If we added a method named executeMultiple(), it would introduce a
new call into the API, but it would also have the following benefits:

  1. We would avoid changing the sigs for Statement/UpdateStatement to
    account for passing an already-prepared SQLiteStatement.
  2. We would avoid the overhead/performance hit of running
    sqlite3_prepare_v2 on the multiple statement SQL.
  3. We would be able to return a value/struct appropriate for multiple
    statements (either a count of rows affected, or something else)
  4. We would force the developer to explicitly call this method to perform
    execution of multiple statements, to indicate that they do indeed want to
    execute multiple statements at once, which is not a highly-used use case.

If we decide to go this route, I would handle the sqlite3_exec() call right
there in the Database class, instead of creating another class. I would
adjust the error message thrown by Statement when it gets handed a
multiple-statement string to indicate to the developer that they should use
the executeMultiple API instead. We could then adjust the test case to make
sure that the correct error is thrown in this case.

What do you think?

Peter

On Mon, Aug 17, 2015 at 8:41 AM, Gwendal Roué [email protected]
wrote:

@peter-ss https://github.com/peter-ss You're right, we don't want to
add the multiple statement handling in UpdateStatement, which remains
focused on single, actually reusable, statements.

So:

  1. The UpdateStatement initializer needs to crash if given multiple
    statements (done in #f2c4af9).
  2. Database.execute must distinguish between single and multiple
    statements (and violently crash with a nice apology if given multiple
    statements and arguments).
  3. We don't want the code that distinguishes between single and
    multiple statements to be duplicated.

This should give us kind of a roadmap, what do you think?


Reply to this email directly or view it on GitHub
#4 (comment).

@groue
Copy link
Owner

groue commented Aug 17, 2015

Ah - one additional wrinkle: the execute() method returns an
UpdateStatement.Changes struct. What should we do in the case of multiple
statements? How would we have it conform to the existing method signature?

Hm. What about extracting UpdateStatement.Changes to a general DatabaseChanges struct, and leaving the insertedRowID nil in case of multiple statements?

Also, it occurs to me that if we refactor the logic that determines if we
have multiple statements so that it's part of Database, we would be
compiling the SQL when we execute sqlite3_prepare_v2 as part of the
determination. If we find that we DO have a single statement, we would not
want to run prepare on the SQL again (for performance reasons), so we would
want some way to pass UpdateStatement() an already-prepared
SQLiteStatement, which would force a change to the init() signature for
Statement and UpdateStatement.

Correct. Yet signatures of internal APIs are not sacred: I would not care if they would have to change.

If we find that we DO NOT have a single
statement, then we throw away the results of the call to prepare(), since
we can't use them for multiple statement execution, which is again a
performance hit.

This is indeed a problem: we would need to parse the initial statement twice: once with prepare(), and a second time with exec(). Now you have something.

I'm wondering if perhaps adding a second method to Database is a better way
to go.

I'm thinking the same way, now :-)

If we added a method named executeMultiple(), it would introduce a
new call into the API, but it would also have the following benefits:

  1. We would avoid changing the sigs for Statement/UpdateStatement to
    account for passing an already-prepared SQLiteStatement.
  2. We would avoid the overhead/performance hit of running
    sqlite3_prepare_v2 on the multiple statement SQL.
  3. We would be able to return a value/struct appropriate for multiple
    statements (either a count of rows affected, or something else)
  4. We would force the developer to explicitly call this method to perform
    execution of multiple statements, to indicate that they do indeed want to
    execute multiple statements at once, which is not a highly-used use case.

If we decide to go this route, I would handle the sqlite3_exec() call right
there in the Database class, instead of creating another class. I would
adjust the error message thrown by Statement when it gets handed a
multiple-statement string to indicate to the developer that they should use
the executeMultiple API instead. We could then adjust the test case to make
sure that the correct error is thrown in this case.

You have my full acknowledgment, Peter! Thanks for your dedication :-)

I'm still in favor of sharing the result of this new method with the one returned by UpdateStatement.execute() and Database.execute(sql, arguments). You may have a different opinion than mine on leaving nil its insertedRowID.

@groue
Copy link
Owner

groue commented Aug 17, 2015

I'm still in favor of sharing the result of this new method with the one returned by UpdateStatement.execute() and Database.execute(sql, arguments). You may have a different opinion than mine on leaving nil its insertedRowID.

GRDB tries to hide the statefulness of sqlite behind an Object-Oriented API. Sure it leaks in some places, but I have tried hard to close most doors.

The "last inserted Row ID" is a tough one. There is no Database.lastInsertedRowID read-only property, and a Changes.insertedRowID instead (documented along with the sentence "Relevant if and only if the statement is an INSERT statement" in order to limit the leaking...).

If we would return an insertedRowID from executeMultiple(), we would bring back a stateful API, which I don't like. So I'm really in favor of returning a DatabaseChanges (global class extracted from UpdateStatement.Changes), with a nil insertedRowID (documented as such) returned from Database.executeMultiple().

@peter-ss
Copy link
Contributor Author

I'm fine with returning a value consistent with the Database.execute method
(UpdateStatement.Changes). I will do that. Look for a PR shortly.

I will adjust the unit tests accordingly as well. One thing I did notice,
with testChangesReturnedByDatabaseExecute() - the XCTAssertTrue test for
insertedRowID == nil always fails, because the value returned from
sqlite3_last_insert_rowid after the DELETE statement is in fact 2. After
looking at the sqlite documentation, I'm thinking that the rowid is not
cleared by sqlite when the DELETE is executed, by design. It still reflects
the id of the last row inserted, even if a delete has happened since then.
I'm going to remove the assertion from the test.

Peter

On Mon, Aug 17, 2015 at 10:27 AM, Gwendal Roué [email protected]
wrote:

Ah - one additional wrinkle: the execute() method returns an
UpdateStatement.Changes struct. What should we do in the case of multiple
statements? How would we have it conform to the existing method signature?

Hm. What about extracting UpdateStatement.Changes to a general
DatabaseChanges struct, and leaving the insertedRowID nil in case of
multiple statements?

Also, it occurs to me that if we refactor the logic that determines if we
have multiple statements so that it's part of Database, we would be
compiling the SQL when we execute sqlite3_prepare_v2 as part of the
determination. If we find that we DO have a single statement, we would not
want to run prepare on the SQL again (for performance reasons), so we would
want some way to pass UpdateStatement() an already-prepared
SQLiteStatement, which would force a change to the init() signature for
Statement and UpdateStatement.

Correct. Yet signatures of internal APIs are not sacred: I would not care
if they would have to change.

If we find that we DO NOT have a single
statement, then we throw away the results of the call to prepare(), since
we can't use them for multiple statement execution, which is again a
performance hit.

This is indeed a problem: we would need to parse the initial statement
twice: once with prepare(), and a second time with exec(). Now you have
something.

I'm wondering if perhaps adding a second method to Database is a better way
to go.

I'm thinking the same way, now :-)

If we added a method named executeMultiple(), it would introduce a
new call into the API, but it would also have the following benefits:

  1. We would avoid changing the sigs for Statement/UpdateStatement to
    account for passing an already-prepared SQLiteStatement.
  2. We would avoid the overhead/performance hit of running
    sqlite3_prepare_v2 on the multiple statement SQL.
  3. We would be able to return a value/struct appropriate for multiple
    statements (either a count of rows affected, or something else)
  4. We would force the developer to explicitly call this method to
    perform execution of multiple statements, to indicate that they do indeed
    want to execute multiple statements at once, which is not a highly-used use
    case.

If we decide to go this route, I would handle the sqlite3_exec() call right
there in the Database class, instead of creating another class. I would
adjust the error message thrown by Statement when it gets handed a
multiple-statement string to indicate to the developer that they should use
the executeMultiple API instead. We could then adjust the test case to make
sure that the correct error is thrown in this case.

You have my full acknowledgment, Peter! Thanks for your dedication :-)

I'm still in favor of sharing the result of this new method with the one
returned by UpdateStatement.execute() and Database.execute(sql, arguments).
You may have a different opinion than mine on leaving nil its insertedRowID.


Reply to this email directly or view it on GitHub
#4 (comment).

@groue
Copy link
Owner

groue commented Aug 17, 2015

That's so cool :-)

I will adjust the unit tests accordingly as well. One thing I did notice,
with testChangesReturnedByDatabaseExecute() - the XCTAssertTrue test for
insertedRowID == nil always fails, because the value returned from
sqlite3_last_insert_rowid after the DELETE statement is in fact 2.

Yes, I did update that test in 53671c0 :-)

@groue
Copy link
Owner

groue commented Aug 17, 2015

Closed in favor of #6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants