implement .insert on LCA_SqliteDatabase and/or support on-disk/lowmem lca index #1964

Open
ctb opened this issue Apr 19, 2022 · 0 comments

ctb commented Apr 19, 2022

In #1808, we are adding LCA_SqliteDatabase, which supports on-disk LCA databases.

The "standard" way of creating these is to use sourmash lca index ... -F sql, which has the disadvantage of first creating an LCA_Database in memory, and only then saving it to disk. This is unnecessarily memory intensive.

An alternative would be to directly support creation and update of an LCA database on disk in lca index, perhaps by implementing an insert method on LCA_SqliteDatabase. This would use much less memory.

Right now, you can build an on-disk LCA database with low memory by combining sourmash sig cat ... -o lca.sqldb (to create a SqliteIndex) with sourmash tax prepare -t <tax> -o lca.sqldb (to add a LineageDB_Sqlite table), but this skips all of the consistency checking that sourmash lca index does.
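
To make the insert-based idea concrete, here is a minimal sketch of what incremental, checked insertion into an on-disk SQLite file might look like. It uses only Python's standard sqlite3 module and is not sourmash code: the class name OnDiskLCASketch, the table layout, and the three checks (matching ksize, non-empty lineage, no duplicate identifier) are all assumptions made for illustration, and are not the actual schema of LCA_SqliteDatabase or the actual checks that sourmash lca index performs.

```python
import sqlite3


class OnDiskLCASketch:
    """Hypothetical stand-in for an on-disk LCA database (not sourmash's API)."""

    def __init__(self, path, ksize):
        self.conn = sqlite3.connect(path)
        self.ksize = ksize
        # Assumed schema: one row per signature, one row per (hash, signature).
        # Hashes are stored as text to sidestep SQLite's signed 64-bit
        # integer limit; this is a simplification for the sketch.
        self.conn.executescript(
            """
            CREATE TABLE IF NOT EXISTS signatures (
                ident TEXT PRIMARY KEY,
                lineage TEXT NOT NULL
            );
            CREATE TABLE IF NOT EXISTS hashes (
                hashval TEXT NOT NULL,
                ident TEXT NOT NULL REFERENCES signatures (ident)
            );
            CREATE INDEX IF NOT EXISTS hashes_by_val ON hashes (hashval);
            """
        )

    def insert(self, ident, ksize, hashvals, lineage):
        """Check one signature, then write it straight to disk."""
        # Illustrative consistency checks only.
        if ksize != self.ksize:
            raise ValueError(f"ksize {ksize} does not match database ksize {self.ksize}")
        if not lineage:
            raise ValueError(f"no lineage provided for {ident!r}")
        # One transaction per signature: a duplicate ident raises
        # sqlite3.IntegrityError and nothing from this call is kept.
        with self.conn:
            self.conn.execute(
                "INSERT INTO signatures (ident, lineage) VALUES (?, ?)",
                (ident, lineage),
            )
            self.conn.executemany(
                "INSERT INTO hashes (hashval, ident) VALUES (?, ?)",
                ((str(h), ident) for h in hashvals),
            )

    def idents_for_hash(self, hashval):
        """Return the identifiers of all signatures containing this hash."""
        rows = self.conn.execute(
            "SELECT ident FROM hashes WHERE hashval = ? ORDER BY ident",
            (str(hashval),),
        )
        return [row[0] for row in rows]
```

Because each signature is checked and committed on its own, only one signature's hashes need to be in memory at a time, which is the memory saving the proposal is after.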
