Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database Rollback failed on Deadlock #637

Closed
maxstrebel opened this issue Aug 15, 2022 · 8 comments
Closed

Database Rollback failed on Deadlock #637

maxstrebel opened this issue Aug 15, 2022 · 8 comments

Comments

@maxstrebel
Copy link

maxstrebel commented Aug 15, 2022

Description

We experience a lot of problems with our page builder right now. With backend users reporting scrambled blocks and lost blocks. You might want to have a closer look. But in general, since our Upgrade to Craft 4 the blocks will scramble (show up at other places on the page) or vanish.

Right now we remove "Typed Link Field" from our content builder, in the hope we can get rid of some of the problems. When we tested, we found a Deadlock (SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock;), that luckily was investigated by our webhost. I will quote quickly:

It looks as though there was a request processing which included an insert into lenz_linkfield. At the same time another process made a change to structureelements which resulted in the rollback.

I don't have too much experience of how NEO organises its database updates internally. But if it doesn't wrap its own DB updates in with Craft's own DB transactions during a Save I could see a possible scenario where Neo updates all of its models based on some edited Craft Entries, but then Craft rolls back its transaction (due to a deadlock in this case) ending up with the Neo blocks orphaned.

For me this looks like a possible explanation for what we are experiencing. Can you confirm, that this might be an issue?

Here is the deadlock:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2022-08-11 11:31:05 0x7f570e784700
*** (1) TRANSACTION:
TRANSACTION 22379582, ACTIVE 2 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 81 lock struct(s), heap size 8312, 333 row lock(s), undo log entries 227
MariaDB thread id XXX, OS thread handle XXX, query id XXX XXX XXX_staging Update
INSERT INTO `lenz_linkfield` (`elementId`, `fieldId`, `siteId`, `linkedId`, `linkedSiteId`, `linkedTitle`, `linkedUrl`, `payload`, `type`, `dateCreated`, `dateUpdated`, `uid`) VALUES (XXX, XXX, 6, XXX, NULL, 'let\'s test it', 'https://XXX', '{\"customText\":\"Let\'s test it!\"}', 'entry', '2022-08-11 11:31:05', '2022-08-11 11:31:05', 'XXX')
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8292 page no 381 n bits 992 index XXX of table `XXX_staging`.`lenz_linkfield` trx id 22379582 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

*** CONFLICTING WITH:
RECORD LOCKS space id 8292 page no 381 n bits 992 index XXX of table `XXX_staging`.`lenz_linkfield` trx id 22379582 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

RECORD LOCKS space id 8292 page no 381 n bits 992 index XXX of table `XXX_staging`.`lenz_linkfield` trx id 22379585 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;


*** (2) TRANSACTION:
TRANSACTION XXX, ACTIVE 2 sec updating or deleting
mysql tables in use 1, locked 1
LOCK WAIT 42 lock struct(s), heap size 3488, 34 row lock(s), undo log entries 32
MariaDB thread id XXX, OS thread handle XXX, query id XX XXX XXX_staging Updating
UPDATE `structureelements` SET `root`=1385814, `dateUpdated`='2022-08-11 11:31:04' WHERE `id`=1385814
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8344 page no 2238 n bits 152 index structureelements_root_idx of table `XXX_staging`.`structureelements` trx id XXX lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

*** CONFLICTING WITH:
RECORD LOCKS space id 8344 page no 2238 n bits 136 index structureelements_root_idx of table `XXX_staging`.`structureelements` trx id XXX lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex XXX; asc supremum;;

Record lock, heap no 44 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex XXX; asc   %I;;
 1: len 4; hex XXX; asc   %I;;

*** WE ROLL BACK TRANSACTION (0)

Any help would be hugely appreciated. Thank you for your time.

Steps to reproduce

It is hard to reproduce. We've had authors working in the backend for testing. After 4 hours of testing the authors provoked 4 deadlocks. But there was no indication why. There were entries with a lot of blocks, but there was also one entry affected with only one redactor and one link block.

  1. Work in the backend
  2. Have a lot of queries while working
  3. Experience Deadlock
  4. Whole content or partial content on page is wiped/scrambled

Other information

  • Neo version: 3.2.5
  • Craft version: 4.2.1.1
  • (if multi-site) What is the affected Neo field's propagation method:
  • Is eager-loading used?
@ttempleton
Copy link
Contributor

If a change to structureelements is involved, then that does sound like it could be happening when Neo builds (or deletes) a block structure, but that is using transactions (e.g. https://github.com/spicywebau/craft-neo/blob/3.3.0/src/services/Blocks.php#L264).

I'll have a look and see if there's anywhere we're missing the use of transactions - but does this still occur after you upgrade to Neo 3.3.0?

@maxstrebel
Copy link
Author

maxstrebel commented Aug 16, 2022

Thank you for the quick fix on 3.3.0 and thanks for looking into it. I will keep you posted.

Unfortunately I'm very sure that 3.3.0 and the "scrambling" problem are unrelated. Our blocks would also show up on parents they are not allowed to be inside. So I was hoping to find a reasonable solution – as I cannot reproduce it, but it occurs to our authors so often that we have to lock our backend to all 35 users as also to remove Typed Link Field, as it was involved in the lock.

@maxstrebel
Copy link
Author

For some context, the problem might also be, that there was no savepoint to rollback to? This is from the logs:

2022-08-11@13:31:05
php-staging
[+]
[warning] craft\db\mysql\Schema::rollBackSavepoint: Tried to roll back a savepoint, but it does not exist: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT LEVEL4 does not exist
2022-08-11@13:31:05
php-staging
[+]
[warning] craft\db\mysql\Schema::rollBackSavepoint: Tried to roll back a savepoint, but it does not exist: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT LEVEL3 does not exist
2022-08-11@13:31:05
php-staging
[+]
[warning] craft\db\mysql\Schema::rollBackSavepoint: Tried to roll back a savepoint, but it does not exist: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT LEVEL2 does not exist
2022-08-11@13:31:05
php-staging
[+]
[warning] craft\db\mysql\Schema::rollBackSavepoint: Tried to roll back a savepoint, but it does not exist: SQLSTATE[42000]: Syntax error or access violation: 1305 SAVEPOINT LEVEL1 does not exist
2022-08-11@13:31:05
php-staging
[+]
[error] yii\db\Exception: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction

@maxstrebel maxstrebel closed this as not planned Won't fix, can't repro, duplicate, stale Aug 17, 2022
@maxstrebel maxstrebel reopened this Aug 19, 2022
@maxstrebel
Copy link
Author

maxstrebel commented Aug 19, 2022

Hi Thomas, sorry I have to reopen this. We still experience a myriad of problems. I do not know if NEO is the alone cause of this. But as Neo behaves strangely (bocks deleting content, etc.) I place it here. At the moment nobody can work in the backend without breaking the content. I hope you can help us get to a stable state again:

Here's two deadlocks from this morning. We moved from a load balanced environment to a simple server. But we still get therese deadlocks.

2022-08-19 08:27:59 [web.ERROR] [yii\db\Exception] SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction The SQL being executed was: UPDATE `structureelements` SET `root`=1428914, `dateUpdated`=‘2022-08-19 08:27:58’ WHERE `id`=1428914

...


2022-08-19 08:32:46 [web.ERROR] [yii\db\Exception] SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction The SQL being executed was: UPDATE `structureelements` SET `root`=1429665, `dateUpdated`=‘2022-08-19 08:32:45’ WHERE `id`=1429665

On another note here's the additional problems we experience right now:

  • Validation problems when matrix fields in a Neo Child are involved. Fields will empty themselves, vanish in resave. Time wise it also matches the time the deadlock happened
  • Recently we had tons of [yii\web\HttpException:400] A provisional draft already exists for element/user 478255/113110 errors
  • Discard changes will cause a Javascript error and make the backend unusable

Currently we're in the process of trying to replicate it in a vanilla Craft install. Thats our current setup (we already removed Typed Link Field and LJ dynamic fields):

"born05/craft-assetusage": "3.0.0",
"clubstudioltd/craft-asset-rev": "7.0.0",
"craftcms/cms": "4.2.1.1",
"craftcms/element-api": "3.0.1.1",
"craftcms/feed-me": "5.0.4",
"craftcms/redactor": "3.0.2",
"mmikkel/retcon": "2.4.2",
"nystudio107/craft-retour": "4.0.2",
"nystudio107/craft-seomatic": "4.0.7",
"putyourlightson/craft-blitz": "4.2.1",
"spicyweb/craft-neo": "3.3.3",
"verbb/formie": "2.0.6",
"verbb/super-table": "3.0.0",
"vlucas/phpdotenv": "^3.4.0",
"wrav/oembed": "2.2.1"

@maxstrebel
Copy link
Author

maxstrebel commented Aug 19, 2022

I can provoke the "A provisional draft already exists for element/user 2206/1." error in a vanilla Craft install with only Neo installed. But I have no clear way to reproduce it. When I can, then with a Neo child, that contains a Matrix field with two fields.

image

It also seems that installing SEOmatic or Typed Link Field make it appear more frequent.

@ntmagda
Copy link

ntmagda commented Aug 19, 2022

Hi all,

I would like to add my observation into the context here:
What we experienced with the Neo blocks is a weird behaviour with saving entries and validation of required fields

What happens:

Content disappears on save when validation warning has been previously triggered.

How to reproduce:

  • Create a new entry
  • Save it
  • Add a required field within NEO block and do not fill it in
  • Add another field in NEO block and fill it in
  • Click save and wait for validation error.
  • Fill in the required field (the one that triggered validation error previously).
  • Save again and watch your previously added data to disappear

System specs:

I managed to reproduce it on the fresh craft install.

Craft Pro 4.2.1.1
Neo version: 3.3.3

To watch it in action:

Screen.Recording.2022-08-19.at.16.06.42.mov

@ttempleton
Copy link
Contributor

ttempleton commented Aug 22, 2022

@maxstrebel When the provisional draft error occurs, do you get more than one POST at the same time to e.g. index.php?p=admin/actions/elements/save-draft? Just wondering if there's anything happening (whether caused by Neo or not) that could cause a second attempted draft save to start before the first one completes.

@ntmagda I think that's unrelated to this issue, but I've added a fix for that (1c19aff) in Neo 3.3.4.

@maxstrebel
Copy link
Author

I guess this might have been several different issues at the end.

Thus I'll close this for now. And thanks for the scramble fix 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants