Update databases on taxadb-cache #123

Open
AnneSoBen opened this issue Sep 11, 2024 · 1 comment

@AnneSoBen

Hello,
First, thank you for developing taxadb, I just discovered and tested the package and it is very useful to me!
On the page Data Sources, you write: « The taxadb maintainers take semi-annual snapshots and distribute versioned releases of the underlying data ». I guess that you refer to the data published here: https://github.com/boettiger-lab/taxadb-cache/tree/master/data/2022.
These data seem to date back to 2022. Do you think it would be possible to update these files?
I want to match TAXREF names (TAXREF is a taxonomy maintained by the French National Natural History Museum) to the GBIF backbone taxonomy, but some TAXREF names have no match in the GBIF taxonomy when I use taxadb, although they are present in the latest GBIF backbone that I downloaded.
It would be awesome to update the taxonomies on taxadb-cache, but if you can't, I can use the scripts you provided in https://github.com/boettiger-lab/taxadb-cache/tree/master/R to preprocess the GBIF taxonomy myself (I managed to do it with the gbif.R script). However, I don't know how to tell taxadb where to read the database I created when calling taxadb::taxa_tbl(). Could you explain how to do it? I suppose it has something to do with the "backends" (and the environment variable TAXADB_HOME), but I am very new to these tools.

Thank you in advance for your answers!

@cboettig
Member

Thanks, yes, we're overdue for an update. Some of these are easier to do than others, as some providers keep changing their schema on us. Stand by, we'll try to get this done soon.

As a stopgap, I recommend using COL directly, https://www.catalogueoflife.org/data/download, or the current GBIF backbone, https://hosted-datasets.gbif.org/datasets/backbone/current/backbone.zip

In the next release I'm hoping to include a helper function to access these two directly, since they already come as reasonably accessible Darwin Core archives.
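
Until such a helper exists, the stopgap above can be approximated by looking names up directly in the taxon table of the unzipped backbone archive. The following is only a minimal sketch, not part of taxadb: it assumes the archive has been downloaded and unzipped by hand, that its taxon table is a tab-separated file (named `Taxon.tsv` here as an assumption), and that it has a `canonicalName` column. The function name `match_names` and its parameters are hypothetical. It is written in Python with only the standard library; the same exact-match logic carries over to R.

```python
import csv
from collections import defaultdict


def match_names(taxon_tsv_path, names, name_column="canonicalName"):
    """Look up names in a tab-separated Darwin Core taxon table.

    Returns a dict mapping every queried name to the list of rows whose
    `name_column` equals it exactly (an empty list means no match).
    The column name is an assumption about the archive layout.
    """
    wanted = {n.strip() for n in names}
    hits = defaultdict(list)
    # Stream the file row by row: the full backbone is too large to
    # load comfortably as a list of dicts.
    with open(taxon_tsv_path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE: treat quote characters in names as literal text.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            name = (row.get(name_column) or "").strip()
            if name in wanted:
                hits[name].append(row)
    return {name: hits.get(name, []) for name in wanted}
```

Called as `match_names("Taxon.tsv", taxref_names)`, where `taxref_names` is a hypothetical list of TAXREF names, this returns the matching rows per name, so names with an empty list are the ones absent from the current backbone. Matching is exact and case-sensitive; it does no fuzzy matching or synonym resolution.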
