Skip to content
This repository has been archived by the owner on Sep 30, 2023. It is now read-only.

Automatically pin entries #6

Closed
oed opened this issue Jul 21, 2019 · 9 comments
Closed

Automatically pin entries #6

oed opened this issue Jul 21, 2019 · 9 comments

Comments

@oed
Copy link

oed commented Jul 21, 2019

Currently entries are added with ipfs.dag.put. This doesn't automatically pin the entries that are added to ipfs. This could be problematic when the ipfs garbage collection gets activated. To solve this after each dag.put call ipfs.pin.add(<cid>) should also be called.

@oed
Copy link
Author

oed commented Aug 30, 2019

This is becoming more urgent as GC is being added to the next js-ipfs release: ipfs/js-ipfs#2396

@alanshaw
Copy link

Note: GC in the next release is not run automaticallly when repo is full or on a schedule. You have to explicitly call the command if you want it to run.

@mistakia
Copy link

mistakia commented Aug 30, 2019

I think I'm beginning to lean toward leaving pinning (and unpinning) to the application (or perhaps even in orbit-db-store as it would be a cleaner/simpler implementation).

(1) Not all use cases will want to pin by default, thus we will need to pass the choice to the user/app. (2) I believe we would have to make changes elsewhere to pin "replicated" entries as this would only address pinning locally created entries.

Being that this isn't all that urgent yet, I'll experiment with this and report back once my thoughts are better formulated.

@oed
Copy link
Author

oed commented Sep 2, 2019

Cool @mistakia! Makes sense that you might not always want to pin. In your mind is this only when the DB is opened temporarily? I'm trying to figure out in which cases this would be desired.
(2) I'm not sure I follow. I thought all entries will be passed to the io module?

@shamb0t
Copy link
Contributor

shamb0t commented Sep 2, 2019

Thanks for the heads up @oed @alanshaw!

@mistakia I agree that not all use-cases will want to pin by default and applications could certainly benefit from some flexibility. A simple approach would be to pin per store, so only a flag in the options during store creation is required and the code-changes limited to passing the flag from orbit-db-store to ipfs-log and orbit-db-io. To do this per entry would require additional changes to each store and is a little more complicated regarding replicated entries as you said. It's not obvious how to decide which replicated entries to pin since replication is not called by the user. In this case one could listen for the replicate event at the application layer and pin entries there.

How does adding a pin flag to the store options to handle automatic per store pinning sound? Per entry pinning would be left to the user to handle by listening to replicate and write events

@oed
Copy link
Author

oed commented Sep 2, 2019

I think adding a pin option per store would be great! I think it should be true by default.

@mistakia
Copy link

mistakia commented Sep 2, 2019

@shamb0t I think that would be a good approach as it would handle the most common scenario and allow for applications to handle the rest. I also agree with @oed that it should be true by default.

@oed - in my use case its desirable for a user to temporarily open a remote db to "browse".

I spent some time this weekend implementing pinning and gc and came across these thoughts:

(1) I found it much easier to implement outside of orbit-db land (lots of passing the flag around). With the exception of issues related to (4), it was quite easy to add pinning with orbit-db as-is.

(2) Pinning an entry will also recursively pin links by default (although it seems that it will ignore links to objects that it does not have?). In my use case, it is desirable to pin log entries and some of the links contained within but not all. A pin flag per store would work as I could just set the flag to false and handle selectively pinning the links within the application.

(3) In my use case, a db could be opened temporarily to "browse" but then the user may decide to keep it, at which point non-pinned entries that are already replicated will need to be pinned. I'm still not sure what approach would be best but I'm leaning toward this being handled at the application layer. I wonder if a simple method to change the pin flag could be useful for some.

(4) Obviously, the manifest will need to be pinned. And in the case of an IPFS access controller the access controller address (address getter not working, but exposed on the options object) and write array address need to be pinned (which is currently not exposed to the application layer).

Based on my experience, I think a simple, default on, pin flag per store is the way to go. Overall, contrary to my initial thoughts on this, I think it would be best for orbit-db to have a simple approach to pinning and allow for maximum flexibility as it is application specific and highly dependent on how data is structured within IPFS.

@shamb0t
Copy link
Contributor

shamb0t commented Dec 3, 2019

I've been this out with ipfs-log and am getting a big performance hit when enabling pinning:

$ node benchmarks/benchmark-append.js
Starting benchmark...
1239 queries per second, 1239 queries in 1 seconds (Entry count: 1239)
1894 queries per second, 3133 queries in 2 seconds (Entry count: 3133)
2316 queries per second, 5449 queries in 3 seconds (Entry count: 5449)
2377 queries per second, 7826 queries in 4 seconds (Entry count: 7826)
2442 queries per second, 10268 queries in 5 seconds (Entry count: 10268)
2430 queries per second, 12698 queries in 6 seconds (Entry count: 12698)
2280 queries per second, 14978 queries in 7 seconds (Entry count: 14978)
2389 queries per second, 17367 queries in 8 seconds (Entry count: 17367)
2456 queries per second, 19823 queries in 9 seconds (Entry count: 19823)
--> Average of 2213.6 q/s in the last 10 seconds

with pinning enabled:

$ node benchmarks/benchmark-append.js
Starting benchmark...
42 queries per second, 42 queries in 1 seconds (Entry count: 42)
42 queries per second, 84 queries in 2 seconds (Entry count: 84)
38 queries per second, 122 queries in 3 seconds (Entry count: 122)
33 queries per second, 155 queries in 4 seconds (Entry count: 155)
29 queries per second, 184 queries in 5 seconds (Entry count: 184)
28 queries per second, 212 queries in 6 seconds (Entry count: 212)
24 queries per second, 236 queries in 7 seconds (Entry count: 236)
24 queries per second, 260 queries in 8 seconds (Entry count: 260)
21 queries per second, 281 queries in 9 seconds (Entry count: 281)
--> Average of 30.2 q/s in the last 10 seconds

Seems pinning shouldn't be enabled by default and may be done separately in batches. What do you guys think?

You try it out the benchmarks by changing the boolean here https://github.com/orbitdb/ipfs-log/blob/upgrade/ipfs-0.40.0-rc.0/benchmarks/benchmark-append.js#L21 and running node benchmarks/benchmark-append.js

@aphelionz
Copy link
Contributor

Let's leave pinning disabled by default for now 👍

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants