Sharding? #199
Comments
Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
Finally, remember to use https://discuss.ipfs.io if you just need general support.
Specifically, the flatfs (flat-file backed datastore) provides a sharding option because some filesystems don't handle large directories very well. None of the other datastores provide such an option. Honestly, sharding just doesn't make sense in S3 and would massively complicate the query logic.
@Stebalien It would be great to know why you think so, as the js-ipfs plugin for S3 does support sharding, and in our case it has been useful for avoiding rate limiting from S3. We really don't see how this plugin can be used as it is, without sharding.
Also, @Stebalien, please see this discussion about why sharding is useful in S3 in general, and more specifically why it is useful for the IPFS S3 datastore: ipfs/js-datastore-s3#27
Interesting, I stand corrected. In JavaScript, this isn't actually a feature of the S3 datastore but of a "wrapper" datastore that transforms keys. That's probably the correct way to implement this, and that implementation would live in https://github.com/ipfs/go-datastore/. However, it's going to be non-trivial to correctly handle queries, offsets, etc. Basically, every query would need to iterate over all shards at the same time, interleaving the results. If you want to submit a datastore to do this, take a look at how queries are handled in https://github.com/ipfs/go-datastore/blob/ed11f242ef104130b10a1e86728ab3779cd23c64/mount/mount.go#L209.
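To make the two pieces described above concrete, here is a rough sketch in plain Go using only the standard library. It does not use the go-datastore API and is not how any existing datastore implements this. The names `shardKey`, `unshardKey`, and `mergeShards` are hypothetical, and the shard rule (the n characters before the last one, padded with `_`) is only modeled on the flatfs "next-to-last" idea. It assumes each shard returns its keys already sorted and already stripped of the shard prefix.

```go
package main

import (
	"container/heap"
	"strings"
)

// shardKey (hypothetical) rewrites "/NAME" to "/SHARD/NAME", where SHARD is
// the n characters before the last character of NAME, left-padded with "_".
func shardKey(key string, n int) string {
	name := strings.TrimPrefix(key, "/")
	padded := strings.Repeat("_", n+1) + name
	off := len(padded) - n - 1
	return "/" + padded[off:off+n] + "/" + name
}

// unshardKey (hypothetical) drops the leading shard component again.
func unshardKey(sharded string) string {
	rest := strings.TrimPrefix(sharded, "/")
	if i := strings.Index(rest, "/"); i >= 0 {
		return rest[i:]
	}
	return sharded
}

// cursor holds the remaining keys of one shard, sorted ascending.
type cursor struct{ keys []string }

// mergeHeap orders cursors by their next (smallest) key.
type mergeHeap []*cursor

func (h mergeHeap) Len() int           { return len(h) }
func (h mergeHeap) Less(i, j int) bool { return h[i].keys[0] < h[j].keys[0] }
func (h mergeHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *mergeHeap) Push(x any)        { *h = append(*h, x.(*cursor)) }
func (h *mergeHeap) Pop() any {
	old := *h
	c := old[len(old)-1]
	*h = old[:len(old)-1]
	return c
}

// mergeShards interleaves per-shard sorted key lists into one globally
// sorted list, then applies offset and limit to the merged stream.
// A limit <= 0 means "no limit".
func mergeShards(shards [][]string, offset, limit int) []string {
	h := &mergeHeap{}
	for _, s := range shards {
		if len(s) > 0 {
			*h = append(*h, &cursor{keys: s})
		}
	}
	heap.Init(h)

	var out []string
	for h.Len() > 0 && (limit <= 0 || len(out) < limit) {
		c := (*h)[0] // cursor with the smallest next key
		key := c.keys[0]
		c.keys = c.keys[1:]
		if len(c.keys) == 0 {
			heap.Pop(h) // shard exhausted
		} else {
			heap.Fix(h, 0) // restore heap order after advancing
		}
		if offset > 0 {
			offset-- // skip results until the offset is consumed
			continue
		}
		out = append(out, key)
	}
	return out
}
```

The point of the sketch is that offset and limit cannot be pushed down to individual shards: they only make sense once the shard results have been interleaved, which is why every query has to walk all shards at the same time.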
By default, go-ipfs provides a sharding option for the datastore. When using this plugin, the datastore is not sharded.
As described in previous issues, the serialization in the datastore_spec is not 1:1, because when I try to add shardFunc this results in an error.
Is there a way to achieve sharding for the data stored in S3?