paritydb support for parachains db. #4838

cheme · 2022-02-02T17:53:42Z

This PR removes rocksdb usage for users that are using paritydb.
Note that users that are currently using paritydb would need to purgedb and resync.

A few things that could be added or changed:

Parity-db does not use the cache parameter. ae177bc could add a simple key-value cache; I am not sure it is needed, or even that it would really help performance.
The avstore meta and chain selection columns are fully switched to ordered; 8b66d0a could also move the content that requires ordering to its own collection (paritydb only) for better performance.
Using a specific trait for the database is not strictly mandatory; we can remove it and just use plain KeyValueDb, but the trait could be used for future changes (the KeyValueDb trait, for instance, lacks errors for iterators).
write_lock may not be necessary, but I do not know polkadot internals well enough to assume there are no concurrent writes (which would be strange).
Bridge code: I started updating it, but reverted the changes in ccc0c8f as the content is not really up to date; the recent code is probably in https://github.com/paritytech/parity-bridges-common
The directories look like this: chains/polkadot/paritydb/full for all substrate db files, chains/polkadot/paritydb/parachains for all polkadot db files.
So the purge-db command in substrate should be extended with a --parachains argument (see test).
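The directory layout described above can be sketched as a small path helper. This is only an illustration: the names `DbKind`, `paritydb_dir` and `purge_target` are hypothetical and are not taken from the PR, and only the paritydb layout quoted above is modelled.

```rust
use std::path::{Path, PathBuf};

/// Which of the two paritydb directories is meant (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DbKind {
    /// Substrate (main) database files.
    Full,
    /// Polkadot parachains database files.
    Parachains,
}

/// Build `<base>/chains/<chain_id>/paritydb/<full|parachains>`.
pub fn paritydb_dir(base: &Path, chain_id: &str, kind: DbKind) -> PathBuf {
    let leaf = match kind {
        DbKind::Full => "full",
        DbKind::Parachains => "parachains",
    };
    base.join("chains").join(chain_id).join("paritydb").join(leaf)
}

/// Directory a purge would target; `parachains == true` models `--parachains`.
pub fn purge_target(base: &Path, chain_id: &str, parachains: bool) -> PathBuf {
    let kind = if parachains { DbKind::Parachains } else { DbKind::Full };
    paritydb_dir(base, chain_id, kind)
}
```

With a base path of `/data` and chain id `polkadot`, `purge_target(.., false)` points at the `full` directory and `purge_target(.., true)` at the `parachains` directory.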

This reverts commit ae177bc.

This reverts commit 8b66d0a.

node/subsystem-util/src/database.rs

node/service/src/lib.rs

node/service/src/parachains_db/mod.rs

arkpar · 2022-02-03T16:21:27Z

The avstore meta and chain selection columns are fully switched to ordered; for parity db we could keep part of the content unordered. Commit 8b66d0a proposes adding two columns for paritydb to store only the content that requires ordering. This should be better for perf, but is only really needed if the content under AVAILABLE_PREFIX, CHUNK_PREFIX, META_PREFIX from avstore and BLOCK_ENTRY_PREFIX and LEAVES_KEY of chain selection is worth it. cc @ordian

In general it is better to separate different types of data into different columns and not invent key schemes. So +1 from me.
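As a rough illustration of the column split, the sketch below uses in-memory maps rather than parity-db itself. `SplitStore` and `iter_prefix` are hypothetical names, and the assumption is simply that only the ordered column needs to support prefix iteration.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical in-memory stand-in for a store split into two columns.
#[derive(Default)]
pub struct SplitStore {
    /// Point lookups only; no ordering required.
    pub unordered: HashMap<Vec<u8>, Vec<u8>>,
    /// Entries that must be iterated in key order.
    pub ordered: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl SplitStore {
    /// Return ordered-column entries whose key starts with `prefix`, in key order.
    pub fn iter_prefix(&self, prefix: &[u8]) -> Vec<(&Vec<u8>, &Vec<u8>)> {
        self.ordered
            // Start at the first key >= prefix ...
            .range(prefix.to_vec()..)
            // ... and stop as soon as a key no longer shares the prefix.
            .take_while(|(k, _)| k.starts_with(prefix))
            .collect()
    }
}
```

Data that is only ever read by exact key stays in the unordered map, so the ordering cost is paid only for the entries that are actually iterated.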

The directories look like this: chains/polkadot/paritydb/full for all substrate db files, chains/polkadot/paritydb/parachains for all polkadot db files.
Currently purge dir is not purging as necessary (see TODO in test); I am considering paritytech/substrate@master...cheme:purge-dir (purge everything under chains/polkadot/paritydb or chains/polkadot/db).

I think it's fine if the purge-db command deletes a specific database rather than everything. It already had a --light distinction until recently.

polkadot purge-db 
polkadot purge-db --parachains

rphmeier · 2022-02-18T17:53:01Z

Note that users that are currently using paritydb would need to purgedb and resync.

@cheme Can you add a warning or a migration or something? A note in a PR that nobody in the validator community reads is not enough to inform the community.

Many validators use paritydb already and will be caught by surprise if their node suddenly stops working. We can also provide a --use-old-parachains-db or something like that in the Polkadot CLI so the validators can sync a new node in the background.

cheme · 2022-02-18T18:09:45Z

Can you add a warning or a migration or something?

Writing a migration sounds like a good idea (at least from rocksdb to paritydb it is easy), will do now.

shawntabrizi · 2022-02-18T18:54:59Z

There are labels for this, like "migration needed", "breaks api", "breaks everything".

arkpar · 2022-02-18T19:29:41Z

I don't think migration from rocksdb to paritydb is needed at this point. The whole point of introducing --db=auto is so that users are able to use their existing database.

Note that users that are currently using paritydb would need to purgedb and resync.

@cheme Is that really the case? Changes in paritydb related to tree index support are backwards compatible with the existing database format (version 5). Are they not?

cheme · 2022-02-19T09:51:51Z

I don't think migration from rocksdb to paritydb is needed at this point. The whole point of introducing --db=auto is so that users are able to use their existing database.

Note that users that are currently using paritydb would need to purgedb and resync.

@cheme Is that really the case? Changes in paritydb related to tree index support are backwards compatible with the existing database format (version 5). Are they not?

Yes, the changes are backward compatible (untested, but they were written to be).

At first I implemented 'auto' to keep using the existing db, but I switched that off during review :(. I could revert (I remember testing that use case).
A migration looks like a nice-to-have to me, but yes, the main question is whether we should force users to switch.

arkpar · 2022-02-19T10:56:26Z

This is how "auto" is supposed to work at the moment:

For the main db: If a dir with paritydb exists - use it, otherwise create or open a rocksdb instance.
For the parachain db: If paritydb instance exists - use it, otherwise use the same format as main db (as determined on the previous step)
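The two rules above can be written down as a small decision function. This is a minimal sketch under stated assumptions, not the code from the PR: `Backend`, `resolve_main` and `resolve_parachains` are hypothetical names, and the "exists" checks are passed in as booleans rather than read from disk.

```rust
/// Database backend chosen for one of the two databases (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    RocksDb,
    ParityDb,
}

/// Main db under `auto`: use paritydb if its directory exists,
/// otherwise create or open rocksdb.
pub fn resolve_main(main_paritydb_exists: bool) -> Backend {
    if main_paritydb_exists {
        Backend::ParityDb
    } else {
        Backend::RocksDb
    }
}

/// Parachains db under `auto`: use paritydb if an instance exists,
/// otherwise follow whatever was chosen for the main db.
pub fn resolve_parachains(parachains_paritydb_exists: bool, main: Backend) -> Backend {
    if parachains_paritydb_exists {
        Backend::ParityDb
    } else {
        main
    }
}
```

Under these rules a node whose main db is paritydb ends up with a paritydb parachains db even if none existed before, while a node on rocksdb with no paritydb directories stays on rocksdb for both.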

ordian

A node with paritydb synced successfully on Versi and an issue has been created for further improvements mentioned in this PR. So LGTM.

As for the migration, IIUC validators running with --db=auto will continue to have rocksdb as a backend for the parachain db, unless they purge the db (but the ones with --db=paritydb-experimental will create one from scratch). Whether they are advised to use auto or purge the parachains db should be clarified in the release notes.

rphmeier · 2022-02-21T01:35:27Z

but the ones with --db=paritydb-experimental will create one from scratch

That is, that they'll instantiate a new DB but won't require resync?

A node with paritydb synced successfully on Versi

I suspect that we should see how consensus behaves when all nodes use paritydb (it would be great to have this as a ZombieNet test, can you file an issue for that?)

ordian · 2022-02-21T08:49:03Z

That is, that they'll instantiate a new DB but won't require resync?

Yes, AFAIK, purging the parachains DB doesn't require a resync (of the main DB). But another test won't hurt: https://github.com/paritytech/devops/issues/1423#issuecomment-1046612572.

(it would be great to have this as a ZombieNet test, can you file an issue for that?)

#4952

cheme · 2022-02-21T10:50:40Z

cea49b3 changed 'auto' so that it does not switch the db.
However, if https://github.com/paritytech/devops/issues/1423#issuecomment-1046612572 runs well, it would make more sense to have 'auto' switch from rocksdb to paritydb when the main db is paritydb.

ordian · 2022-02-22T11:04:26Z

Yes, AFAIK, purging the parachains DB doesn't require a resync (of the main DB). But another test won't hurt: https://github.com/paritytech/devops/issues/1423#issuecomment-1046612572.

This was successfully tested.

However, if https://github.com/paritytech/devops/issues/1423#issuecomment-1046612572 runs well, it would make more sense to have 'auto' switch from rocksdb to paritydb when the main db is paritydb.

Not sure what you mean? I think --db=paritydb-experimental should imply paritydb for both databases. Even if there's an existing rocksdb one (or two).

ordian · 2022-02-22T11:06:47Z

Many validators use paritydb already and will be caught by surprise if their node suddenly stops working.

Do we know how many? Because if they all start with an empty parachains db and they are more than 1/3, it might cause some problems with finality.
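The one-third concern can be made concrete with a tiny check. This is only an arithmetic sketch, assuming the usual Byzantine-fault-tolerance bound in which finality needs more than two thirds of validators behaving correctly; `exceeds_one_third` is a hypothetical helper, not part of polkadot.

```rust
/// True if `affected` validators are strictly more than one third of `total`.
/// Integer arithmetic avoids rounding; u64 avoids overflow on the multiply.
pub fn exceeds_one_third(affected: u32, total: u32) -> bool {
    total > 0 && (affected as u64) * 3 > total as u64
}
```

For example, with 300 validators, 100 affected nodes sit exactly at the bound, while 101 would exceed it.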

Code-wise, this PR is good to go IMHO unless we hear some objections today.

This reverts commit cea49b3.

cheme · 2022-02-22T11:20:17Z

Not sure what you mean? I think --db=paritydb-experimental should imply paritydb for both databases. Even if there's an existing rocksdb one (or two).

Yes, the change I made (the revert of cea49b3) goes back to using parity-db for parachains whenever it is used as the main database (even when using 'auto').

That was the conclusion of a conversation with @arkpar yesterday:

Basically, there is no sense in using two different databases if switching is smooth.
Also, if the switch is fine now, it is better to move as soon as possible (in case future changes make the switch harder later).

On a different topic, I implemented a first version of remove prefix for paritydb and the code is not as simple as I wanted, so I am no longer sure it is a good idea (paritytech/parity-db@master...cheme:remove_range).

ordian · 2022-02-23T10:54:41Z

Let's wait for the results of https://github.com/paritytech/devops/issues/1423#issuecomment-1047852355 and then merge it if successful.

arkpar · 2022-03-03T11:49:29Z

bot merge

cheme added 16 commits February 1, 2022 17:22

parity db subsystem without cache and no splitted column

0851b76

fmt

465489c

fix path (auto from parity-db fail)

4cea166

lru cache for db column with cache

ae177bc

Revert "lru cache for db column with cache"

64048f2

This reverts commit ae177bc.

Write_lock mutex

00773ab

theoric code for bridges

08c2029

revert changes

f90f75e

Revert bridge changes

ccc0c8f

fix spec_version

43cac88

update parity db

6904c7d

test purge-db

013f7ad

Use specific ordered collection with paritydb.

8b66d0a

Revert "Use specific ordered collection with paritydb."

b3b4514

This reverts commit 8b66d0a.

fix chain selection tests.

dd0dde5

remove patch

413c0df

cheme commented Feb 2, 2022

node/subsystem-util/src/database.rs

arkpar reviewed Feb 2, 2022

node/service/src/lib.rs (outdated)

fix auto.

86d2322

ordian reviewed Feb 3, 2022

node/service/src/lib.rs

bkchr reviewed Feb 3, 2022

node/service/src/parachains_db/mod.rs (outdated)

Remove useless exists directory method

65ed77d

purge chain without parity-db removal

223b285

cheme added the B1-releasenotes and C1-low (PR touches the given topic and has a low impact on builders) labels Feb 4, 2022

spellcheck

69fabed

cheme marked this pull request as ready for review February 4, 2022 13:14

cheme added the A0-please_review (Pull request needs code review) label Feb 4, 2022

arkpar approved these changes Feb 14, 2022

cheme added 3 commits February 18, 2022 11:03

fix assertion

c1f8f61

format

1c23ec7

update parity-db and fmt

4cb2ade

rphmeier added the A5-grumble label and removed the A0-please_review (Pull request needs code review) label Feb 18, 2022

ordian approved these changes Feb 19, 2022

ordian mentioned this pull request Feb 21, 2022

Zombienet: add a test with --db=paritydb-experimental #4952

Closed

Auto keep using rocksdb when it exists.

cea49b3

Revert "Auto keep using rocksdb when it exists."

ab44bb6

This reverts commit cea49b3.

ordian removed the A5-grumble label Feb 23, 2022

cheme added 3 commits March 3, 2022 10:05

Merge branch 'master' into no_kvdb4, update parity-db version.

dc47efe

Update kvdb version.

71489b3

Merge branch 'master' into no_kvdb4

02a56ec

paritytech-processbot bot merged commit 0879bdb into paritytech:master Mar 3, 2022

librelois mentioned this pull request Mar 21, 2022

Update substrate/polkadot/cumulus from v0.9.17 to v0.9.18 moonbeam-foundation/moonbeam#1357

Closed

Dengjianping mentioned this pull request Apr 19, 2022

Bump deps to v0.9.18 Manta-Network/Manta#481

Merged
