Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix occasional "database is locked" errors while loading sample data #4648

Merged

Conversation

elia
Copy link
Member

@elia elia commented Sep 28, 2022

Summary

SQLite doesn't support full-concurrency and often fails with "SQLite3::BusyException: database is locked" when ActiveJob uses the default threaded :async adapter. ActiveJob is used by ActiveStorage to analyze uploaded images for metadata.

A side effect of this is that SQLite timeouts are no longed triggered and the sample data loads sequentially in way less time (e.g. during sandbox generation).

From https://stackoverflow.com/a/5767295:

SQLite locks the entire database during a write operation (i.e. when a write is happening on any table, no other write, to any table anywhere can happen at the same time). Some databases provide concurrent writes via table-level locks, or sometimes row-level locks. To contrast this to SQLite's implementation, a table-level lock basically means that when you're writing data to a given table, no other thread can write to any record in that table, at the same time (however, writes to other tables can occur simultaneously, in some circumstances). Similarly, row-level locks take it even further, and allow only the necessary rows involved to be locked, allowing concurrent writes to the same table to happen from multiple threads. The idea here is to minimize the amount of data you need to lock for a write operation, which effectively increases the amount of concurrent writes possible across the database, and depending on your implementation/how you use your database, this can significantly increase throughput.

Now, back to your question...

The fact that SQLite is threadsafe doesn't mean that multiple threads can concurrently write to it - it means that it has a way of handling access from multiple threads - which is to (a) allow timeouts/retries, and (b) to return a useful error (SQLITE:Busy) when a lock is currently held on the database. That is, threadsafe means nothing more than, "Multiple threads can access this data in a way that won't result in data corruption due to simultaneous access."

Basically, somewhere in the code, one thread is trying to do its update before another thread has released its lock on the database. This is a common hurdle with SQLite, because the authors/documentation will tell you that SQLite can handle concurrency like a champ. The reality is that what SQLite considers "concurrency support" amounts to trying to be very fast so that locks on the database are only held for a very short time, and therefore locks on the database are released before timeouts are hit. In a lot of cases, this works just fine and never gets in your way. However, having very short-lived locks is not the same as actually allowing concurrent writes from multiple threads.

Think of it like the way that iOS does multitasking (at least as of iOS 5, when I'm writing this) - really what it's doing is putting other apps on pause, and coming back to them. This has the effect that (a) battery life is much better due to lower CPU utilization, and (b) you don't have to start an app from scratch every time you launch it. This is great, but the actual word "multitasking" as used in iOS doesn't technically mean the same thing as "multitasking" in other environments (even Mac OS X).

SQLite is the same way. Do they have "concurrency" support? Well sort of, but the way they define the word "concurrency" isn't the way the rest of the DB world defines "concurrency".

No one is really wrong, but in cases like these, it adds to implementation confusion.

Checklist

Check out our PR guidelines for more details.

The following are mandatory for all PRs:

  • I have written a thorough PR description.
  • I have kept my commits small and atomic.
  • I have used clear, explanatory commit messages.

The following are not always needed (cross them out if they are not):

  • I have added automated tests to cover my changes.
  • I have attached screenshots to demo visual changes.
  • I have opened a PR to update the guides.
  • I have updated the readme to account for my changes.

SQLite doesn't support full-concurrency and often fails with
"SQLite3::BusyException: database is locked" when ActiveJob uses the default
threaded `:async` adapter. ActiveJob is used by ActiveStorage to analyze uploaded
images for metadata.

A side effect of this is that SQLite timeouts are no longed triggered
and the sample data loads sequentially in way less time (e.g. during
sandbox generation).

From https://stackoverflow.com/a/5767295:

SQLite locks the entire database during a write operation (i.e. when a
write is happening on any table, no other write, to any table anywhere
can happen at the same time). Some databases provide concurrent writes
via table-level locks, or sometimes row-level locks. To contrast this
to SQLite's implementation, a table-level lock basically means that
when you're writing data to a given table, no other thread can write
to any record in that table, at the same time (however, writes to
other tables can occur simultaneously, in some circumstances).
Similarly, row-level locks take it even further, and allow only the
necessary rows involved to be locked, allowing concurrent writes to
the same table to happen from multiple threads. The idea here is to
minimize the amount of data you need to lock for a write operation,
which effectively increases the amount of concurrent writes possible
across the database, and depending on your implementation/how you use
your database, this can significantly increase throughput.

Now, back to your question...

The fact that SQLite is threadsafe doesn't mean that multiple threads
can concurrently write to it - it means that it has a way of handling
access from multiple threads - which is to (a) allow timeouts/retries,
and (b) to return a useful error (SQLITE:Busy) when a lock is
currently held on the database. That is, threadsafe means nothing more
than, "Multiple threads can access this data in a way that won't
result in data corruption due to simultaneous access."

Basically, somewhere in the code, one thread is trying to do its
update before another thread has released its lock on the database.
This is a common hurdle with SQLite, because the authors/documentation
will tell you that SQLite can handle concurrency like a champ. The
reality is that what SQLite considers "concurrency support" amounts to
trying to be very fast so that locks on the database are only held for
a very short time, and therefore locks on the database are released
before timeouts are hit. In a lot of cases, this works just fine and
never gets in your way. However, having very short-lived locks is not
the same as actually allowing concurrent writes from multiple threads.

Think of it like the way that iOS does multitasking (at least as of
iOS 5, when I'm writing this) - really what it's doing is putting
other apps on pause, and coming back to them. This has the effect that
(a) battery life is much better due to lower CPU utilization, and (b)
you don't have to start an app from scratch every time you launch it.
This is great, but the actual word "multitasking" as used in iOS
doesn't technically mean the same thing as "multitasking" in other
environments (even Mac OS X).

SQLite is the same way. Do they have "concurrency" support? Well sort
of, but the way they define the word "concurrency" isn't the way the
rest of the DB world defines "concurrency".

No one is really wrong, but in cases like these, it adds to
implementation confusion.

Co-Authored-By: Alberto Vena <[email protected]>
@elia elia force-pushed the elia+kennyadsl/fix-sqlite-busy-error branch from fc47821 to e6763c0 Compare September 28, 2022 13:17
@elia elia marked this pull request as ready for review September 28, 2022 15:12
Copy link
Contributor

@waiting-for-dev waiting-for-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @elia!

@waiting-for-dev waiting-for-dev added the type:enhancement Proposed or newly added feature label Sep 29, 2022
@kennyadsl kennyadsl merged commit 29da36c into solidusio:master Sep 30, 2022
@kennyadsl kennyadsl deleted the elia+kennyadsl/fix-sqlite-busy-error branch September 30, 2022 08:33
@waiting-for-dev waiting-for-dev added the changelog:solidus_sample Changes to the solidus_sample gem label Dec 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog:solidus_sample Changes to the solidus_sample gem type:enhancement Proposed or newly added feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants