Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MRG: support check --upgrade to upgrade old versions of RocksDB/RevIndex #581

Merged
merged 16 commits into from
Jan 11, 2025

Conversation

ctb
Copy link
Collaborator

@ctb ctb commented Jan 9, 2025

This PR deals with the ramifications of a format change for RocksDB/RevIndex databases that probably occurred around v0.9.7, in which the 'storage' column was added to RevIndex databases. Prior to that version, RevIndex databases always stored sketches externally; see #390, #416, and sourmash-bio/sourmash#3250.

The addition of the 'storage' column breaks when trying to open old RocksDB/RevIndex databases with more recent versions of the plugin; the error message is: Invalid argument: Column family not found: storage.

This PR:

  • updates manysearch and fastmultigather so that a slightly better error message is reported;
  • upgrades index with an option --upgrade that will open when searching against a RocksDB, they will automatically upgrade the internal RocksDB format from older versions. This is as simple as opening them in read/write mode instead of read-only mode, which permits sourmash to upgrade the RevIndex;
  • adds tests for RocksDB/RevIndex databases created with v0.9.5 and v0.9.13 (with both internal and external storage) so that plugin tests will properly detect database format changes.

An alternative solution would be to always open the database in rw mode so that it just gets upgraded automatically, but it gives me the heeby jeebies to open a database in rw mode unnecessarily...

Fixes #580

TODO

  • create issue with error message as doc bait

@bluegenes
Copy link
Contributor

I like the idea of having check update them! Can we output a warning from standard check and then have a flag, --update to go ahead and update?

@ctb ctb changed the title WIP: open rocksdb in rw mode so that old databases can be upgraded MRG: support check --upgrade to upgrade old versions of RocksDB/RevIndex Jan 11, 2025
@ctb ctb changed the base branch from main to update_py_sourmash_v4_8_13 January 11, 2025 16:33
Base automatically changed from update_py_sourmash_v4_8_13 to main January 11, 2025 17:05
@ctb
Copy link
Collaborator Author

ctb commented Jan 11, 2025

Ready for review & merge!

@ctb
Copy link
Collaborator Author

ctb commented Jan 11, 2025

I like the idea of having check update them! Can we output a warning from standard check and then have a flag, --update to go ahead and update?

implemented!

Copy link
Contributor

@bluegenes bluegenes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm!

@ctb ctb merged commit a9a4934 into main Jan 11, 2025
3 checks passed
@ctb ctb deleted the fix_ro_rocksdb branch January 11, 2025 18:45
@ctb ctb mentioned this pull request Jan 15, 2025
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants