-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Add an offline hint to contexts #4009
Comments
I like using contexts this way. 👍 here, though we will have to clearly define expectations; what exactly is meant by a 'network call' and how do we behave in contexts where everything is a networked call? the hard part here is bitswap, how would bitswap behave when given an 'offline' context? |
I assume it would refuse to work. For context, I want to be able to easily construct arbitrary offline variants of services but don't want to add more functions like |
Yeah, agreed. We just need to be careful and make sure we don't miss adding the offline checking logic to any service that needs it. |
I am not sure I like the idea of a "hint". For somethings (like maybe the GC) I greatly prefer a hard guarantee that the service will not make network calls because there is no code for it in the service used, rather than a hint that could be ignored. |
I do agree that GetOfflineLinkService is a hack, but I think there should be a more uniform way to get at a service that does not use the network rather than ask an existing service to avoid network requests. In addition to the my dislike of a "hint" that can be ignored, it will add complexity to existing services as they now will need a check for this "hint" and take appropriate action. |
Unfortunately, I think this would require a
I agree. Hints like this do make me feel a bit uneasy. Ideally libp2p interfaces would take a context so we could enforce this at the networking level (we could even use contexts to help with bitswap accounting) but the reader/writer interfaces don't accept contexts. Actually, on further inspection, this might run into problems where not all functions on networked interfaces accept contexts (e.g., We can actually enforce the "is-online" check using the type system but that would add a lot of complexity and is very ugly: import "context"
type OnlineContext interface {
context.Context
_OnlineContext()
}
type onlineContext struct {
context.Context
}
func (c onlineContext) _OnlineContext() {}
func Offline(ctx context.Context) context.Context {
return context.WithValue(ctx, "offline", true)
}
func WithOnline(ctx context.Context) *OnlineContext {
if value, ok := ctx.Value("offline").(bool); ok && value {
return nil
} else {
var c OnlineContext
c = onlineContext{ctx}
return &c
}
}
func needsContext(ctx OnlineContext) {}
func main() {
ctx := context.Background()
ctx = Offline(ctx)
onlineCtx := WithOnline(ctx)
if onlineCtx == nil {
panic("need online context")
}
needsContext(*onlineCtx) // If you forget the check, this panics.
} We can avoid the panic with a helper function: ifOnline(ctx, func(ctx OnlineContext) {
// do something online.
}) |
Perhaps we are taking the wrong approach here, although it would be useful it is not really necessary in order to eliminate the hackish The only reason GetOfflineLinkService exists is because it replaced this code in 721df36 (p.r. #3255):
that is to replace code that manually constructed an offline service. I added [Edits for clarity and to remove quote from @whyrusleeping since it may of been out of context or misinterpreted] |
@kevina's version has now been implemented. The next step is to decide if we do, in fact want the context hint. |
Actually my fix was not implemented. What was implemented was basically how things where before I created the LinkService. I did not create the LinkService as a means of catching or an optimization but rather as a means for being able to perform garbage collection when some of the leaf nodes are not available. More to the point, in the filestore data for leaves may become inaccessible at any time if the backing file has changed or is removed. If the node is pinned, then this could brake garbage collection. Fortunately, the filestore does not need to read the backing file to determine what the links are, hence the LinkService was created. This original filestore code at https://github.com/ipfs-filestore/go-ipfs was able to store both Protobuf and Raw leaves. For protobuf leaves it also stored the non-data part in the database so it could tell that it was a protobuf leaf and hence had no links without having to open the backing file. With Raw leaves you can tell from just the Cid that it is a leaf so this case does not need to be handled by the filestore. The basic filestore implementation that got merged can only handle raw leaves so our existing optimization, of not consulting the blockstore if the Cid indicated a raw leaf, is sufficient to prevent problems with the GC if the backing file is changed or removed. So for now the change in #4610 will not cause a problem, however in the future it could cause a problem, if we get a more functional filestore implementation that can also store protobuf leaves, or even if there are just cases when only the links are available and the nodes may not be. |
@kevina sorry, I misread your post above as the fix (well, a fix).
Got it. I was informed it was an for future optimizations.
There was already going to be a problem as I just pulled this code out of the current
Kind of off topic, but I don't really see why one might want to do this. Isn't the that purpose of the filestore to deduplicate between the repo and the user's filesystem? |
This is off topic so I don't want to spent too much time on it but consider that a leaf that is not raw contains more than just data from the backing file as it is protocol buffer. Or put in simplified terms (and not 100% accurate) non-raw leafs contain a header to indicate what they are then the data itself. The header is stored in the database as well as how to retrieve the data component from a file. When a request is made for the node the data is retrieved from the file and then recombined with header in the database to reconstruct the original protocol buffer node. With Raw nodes there is no header so all that needs to be done is retrieve the data from the file. |
I see. That's really cool! |
Thanks. And since the header is stored in the database it is possible for the filestore to know that it is a protobuf leaf node without having to touch the disk, but with the current abstraction before LinkService there was no way for it to communicate this to the GC. That is reason I created the LinkService. Getting back on topic, note that I am not against adding an offline hint to context, I just wish there was a better way, but so far I have not been able to think of one. |
With #5654 this is probably not going to be useful. Feel free to reopen if you disagree. |
I've still wanted it a few times internally but it's likely more trouble than it's worth. |
Instead of manually constructing services without online components, it would be useful to have a context hint that tells services to avoid network requests. That is:
One could then define offline service adapters as follows:
The text was updated successfully, but these errors were encountered: