Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework Lock API metadata and add missing argument to Unlock #162

Merged
merged 8 commits into from
Jan 26, 2022

Conversation

butonic
Copy link
Contributor

@butonic butonic commented Jan 21, 2022

@glpatcern @wkloucek Setting just an mtime does not tell the server when it can release the lock. I propose we use a Timestamp to store the expiration timestamp instead. This would also align better with the WebDAV Timeout request header for the LOCK method: http://www.webdav.org/specs/rfc2518.html#rfc.section.9.8

It also allows setting the 30min WOPI timeout, documented as MUST by https://docs.microsoft.com/en-us/microsoft-365/cloud-storage-partner-program/rest/concepts#lock and https://docs.microsoft.com/en-us/microsoft-365/cloud-storage-partner-program/rest/files/refreshlock

@butonic butonic requested a review from labkode as a code owner January 21, 2022 09:31
@glpatcern
Copy link
Member

After all, it's totally equivalent: either the expiration time is implemented on setting the value, or when checking it. The timeout itself (e.g. the 30 minutes for WOPI) is either way a parameter of the server configuration.

That said I'm happy with the change if you like it more...

@glpatcern
Copy link
Member

Now if we go ahead and change the API, let's also think about the signatures for the methods requiring to validate against a given appname: in my understanding we can't really get it out of the context, and I overlooked that, so I'd be in favor of adding Lock to UnlockRequest (it's already there for RefreshLockRequest).

@wkloucek
Copy link
Contributor

After all, it's totally equivalent: either the expiration time is implemented on setting the value, or when checking it. The timeout itself (e.g. the 30 minutes for WOPI) is either way a parameter of the server configuration.

That said I'm happy with the change if you like it more...

Having an expiration timestamp is easier in that case. The client (eg. WOPI server) can calculate that (having a config option for that).

When setting an lock (begin) timestamp, the storage driver needs to know how long a lock should last. This would then be a config option of the storage driver.

Therefore I would go for the expiration timestamp and dedicate this power to the clients instead to the storage.

Now if we go ahead and change the API, let's also think about the signatures for the methods requiring to validate against a given appname: in my understanding we can't really get it out of the context, and I overlooked that, so I'd be in favor of adding Lock to UnlockRequest (it's already there for RefreshLockRequest).

I'm fine with that, too.

@butonic
Copy link
Contributor Author

butonic commented Jan 21, 2022

One more thing. The docs explicitly name the JSON properties. While h for holder makes sense we cannot unmarshal an object like that: a plains string gives no clue wether the holder is a user or an app. @glpatcern did you have some prefix in mind for the value, eg. uid: and app:?
I'd prefer to use two different properties uid and app, maybe?

@butonic
Copy link
Contributor Author

butonic commented Jan 21, 2022

Oh ... another thing: in WebDAV we can use a Depth:infinity header when locking a collection. It indicates that all transitive children in a collection are covered by the lock. We could send that as Metadata, or as an optional Depth property? A ZeroDepth boolean flag should be good enough. Setting it to true would lock only a collection resource. For other resources types it would have no effect.

And should metadata be a string or an []byte?

@glpatcern
Copy link
Member

OK, agreed that mtime vs exp means the timeout config value has to be stored in the storage driver vs the app provider or whichever other consumer, so after all it's more convenient to use exp.

For the format of the holder: am I pretending too much from protobuf to make a distinction between string and User ? ;-) Jokes apart, the point here is that the holder is either an app (for WOPI) xor an user (for a check-out/check-in future feature), not both at the same time. If we put both can we still have some oneof semantic?

For the depth: I never thought about recursively locking a container with its children, i.e. I always implicitly assumed depth = 0. I'm not really sure we want to go as far as allowing a user to lock a folder and all its content - and implement the consequent logic at the level of ocdav for sync client and web UI. So I'd rather spell out in the docs that depth is assumed to be 0, unless you think it's a valid (future) use case?

@butonic
Copy link
Contributor Author

butonic commented Jan 21, 2022

on the grpc level the oneof semantic works ... it is just that the json representation for that does not work as documented. I will continue with my webdav lock implmentation to see where this is going and what the code looks like when trying to actually enforce the locks.

Regarding depth: WOPI only deals with file locks. But webdav also 'dabbles' with folder locks ;-) Iwould be happy to assume depth:0 for now ... unfortunately, webdav assumes depth:infinity: http://www.webdav.org/specs/rfc2518.html#rfc.section.8.10.4
We should however be able to leave out locks on collections in the first implementation: http://www.webdav.org/specs/rfc2518.html#n-required-support

If the server does support locking it may choose to support any combination of exclusive and shared locks for any access types.

So, I'll go forward, assuming depth:infinity for all requests, but only allow it for files. No need to send the depth in the lock for new. Can be added when implementing locks on collections.

@butonic
Copy link
Contributor Author

butonic commented Jan 21, 2022

Finally, Metadata should probably be an Opaque Map, otherwise users will have to decide on an encoding

@glpatcern
Copy link
Member

In fact we're in parallel working at the eos implementation and likely finding pitfalls/details to be sorted out. For the depth, I agree with you, let's leave it out for now even though WebDAV assumes infinity when not specified (oh well!).

Finally, for the metadata again not sure we need anything more complex than []byte (to take any string encoding out). WOPI locks are indeed []byte strings.

@glpatcern glpatcern changed the title change lock mtime to expiration Change lock mtime to expiration and add missing argument to Unlock Jan 24, 2022
@glpatcern glpatcern self-requested a review January 24, 2022 09:22
Copy link
Member

@glpatcern glpatcern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As just discussed at our virtual coffee break, we can go ahead as is for now and merge this PR.

Joern is going to look if a more generic metadata payload (map of opaque) can be useful for WebDAV or not (it's not the case for WOPI), and in case we can still change the API later on as we are fleshing out the code both on WebDAV and WOPI.

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

@glpatcern When listing / stating files we may need to return the Locks on a resource. For this we should add an array of Locks to the ResourceInfo. What do you think?

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

allow returning a list of Lock messages in ResourceInfo

@glpatcern
Copy link
Member

@glpatcern When listing / stating files we may need to return the Locks on a resource. For this we should add an array of Locks to the ResourceInfo. What do you think?

That would be welcome to avoid to blindly call GetLock to all files. We're getting really into the full implementation, that's good I think.

@glpatcern
Copy link
Member

@butonic but what is the meaning of having multiple locks for a given single resource? A priori SetLock fails if a pre-existing lock exists.

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

Webdav allows shared locks ....

@glpatcern
Copy link
Member

Webdav allows shared locks ....

As WOPI does, i.e. any user may alter a shared lock provided the write rights are there and the lock is LOCK_TYPE_SHARED, but what's the meaning of multiple shared locks for a single file?

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

Lat me just quote from http://www.webdav.org/specs/rfc2518.html#n-exclusive-vs.-shared-locks

The WebDAV extensions to HTTP do not need to provide all of the communications paths necessary for principals to coordinate their activities. When using shared locks, principals may use any out of band communication channel to coordinate their work (e.g., face-to-face interaction, written notes, post-it notes on the screen, telephone conversation, Email, etc.) The intent of a shared lock is to let collaborators know who else may be working on a resource.

Shared locks are included because experience from web distributed authoring systems has indicated that exclusive locks are often too rigid. An exclusive lock is used to enforce a particular editing process: take out an exclusive lock, read the resource, perform edits, write the resource, release the lock. This editing process has the problem that locks are not always properly released, for example when a program crashes, or when a lock owner leaves without unlocking a resource. While both timeouts and administrative action can be used to remove an offending lock, neither mechanism may be available when needed; the timeout may be long or the administrator may not be available.

We should probably learn from that ;-) Not for WOPI, but to let others know who is working on a file in the web ui. Maybe a WOPI app lock is best represented by one or multiple shared locks in WebDAV, which would allow listing who is collaborating?

@glpatcern
Copy link
Member

glpatcern commented Jan 24, 2022

We should probably learn from that ;-) Not for WOPI, but to let others know who is working on a file in the web ui. Maybe a WOPI app lock is best represented by one or multiple shared locks in WebDAV, which would allow listing who is collaborating?

I like the idea of exposing who is in a WOPI collab session, which so far has been completely within the apps, not externalized. But I'd rather change the lock structure such that the lock metadata includes a list of users and is still a single entity - makes it easier for WOPI apps to refresh lock / unlock. Yet, I have to double check that the WOPI apps do refresh a lock when a new user joins a collab session, not sure they are requested to (but we'd have other ways to intercept that - the /wopi/iop/open-in-app call is in our hands!).

The whole structure would be then something like this?

LockType type = 1;
repeated UserId users = 2;   // at least one
string app_name = 3;   // optional, defaults to ''
string metadata = 4;   // optional
Timestamp exp = 5;

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

I like that, let me update the pr ...

@glpatcern glpatcern changed the title Change lock mtime to expiration and add missing argument to Unlock Rework Lock API metadata and add missing argument to Unlock Jan 24, 2022
@glpatcern glpatcern self-requested a review January 24, 2022 12:09
@glpatcern
Copy link
Member

Looks all good now, @ishank011 can you give a look too? I'm fine for merging this and then we can adapt cs3org/reva#2444 with @gmgigi96 as well.

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

@glpatcern making these changes I still don't feel good about having only a single Lock property on the ResourceInfo. While it is true that you can only really have a single exclusive write lock on a file (which is a technical detail) the current data structures do not allow us to implement the WebDAV shared locks at all, because in WebDAV every lock has an opaquelocktoken. Technically, our decomposedfs internally only writes a single {nodeid}.lock file. It is not only used for locking, but also to store the metadata for all locks.

Why do we even have the lock type SHARED if we can only set one lock? That does not make sense.

When WOPI locks a file it will always lock it exclusively. That also works when a shared WebDAV lock exists. But in WebDAV land we have to be able to distinguish multiple locks, which is why every lock will get a dedicated opaquelocktoken.

When rereading https://docs.microsoft.com/en-us/microsoft-365/cloud-storage-partner-program/rest/concepts#lock

All WOPI operations that modify files, such as PutFile, will include a lock ID as a parameter in their request. Usually the expected lock ID will be passed in the X-WOPI-Lock request header (but not always; UnlockAndRelock is an exception which uses X-WOPI-OldLock instead). WOPI requires that hosts compare the lock ID passed in a WOPI request with the lock ID currently on a file and respond appropriately when the lock IDs do not match. In particular, WOPI clients expect that when a lock ID does not match the current lock, the host will send back the current lock ID in the X-WOPI-Lock response header. This behavior is critical, because WOPI clients will use the current lock ID in order to determine what further WOPI calls to make to the host.

So the X-WOPI-Lock is discoverable and we can expose it via WebDAV. The WOPI access_token obviously needs to be kept secret though.

Also https://docs.microsoft.com/en-us/microsoft-365/cloud-storage-partner-program/rest/files/lock#post-wopifilesfile_id

In the case where the file is locked by someone other than a WOPI client, hosts should still always include the current lock ID in the X-WOPI-Lock response header. However, if the current lock ID is not representable as a WOPI lock (for example, it is longer than the maximum lock length), the X-WOPI-Lock response header should be set to the empty string or omitted completely.

AFAICT WOPI and WebDAV Locks are implemented by dedicated services. The WOPI docs already mention external locks. In order to build these services on top of the CS3 API we should

  • support multiple locks when listing files
  • rename Lock.Metadata to Lock.Token as we only really transport the token here, both WOPI and WebDAV can use this
  • add an Opaque property to Lock as with the ResourceInfo in order to let implementations transport arbitrary data

The implementation for now only needs to deal with exclusive write locks. And that is the only thing I need to implement right now.

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

@glpatcern Ok, I defined a common Lock message that is used in the ResourceInfo twofold:

  • Lock: the exclusive lock ... as ther can only be one. Now by design. Used by WOPI and WebDAV exclusive / write locks
  • AdvisoryLocks: a list of advisory or shared locks that are purely indicative. Used by WebDAV shared locks ... when we want to implement them.

@butonic butonic requested a review from glpatcern January 24, 2022 14:58
@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

I hope this is good for @wkloucek and @ishank011 now as well. If so, merge pls ;-)

@glpatcern
Copy link
Member

Looks good, I see you came back to a single user for the exclusive lock as opposed to the list of users for a collab app. After all that's also fine - the lock is held by the app - but then why not going back to the oneof semantic?

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

I think a lock should be able to carry both: an app and the user. It is not an either or relationship. You could set advisory locks for different devices, eg to say that a user is viewing a file on his mobile or in the web ui. And for the write / exclusive lock it could be the user that list had edit permissions. IIRC @mmeeks mentioned that. I'm not 100% sure we need that but limiting it to oneof seems overly restrictive.

@glpatcern
Copy link
Member

The interesting thing is I also thought to have both user and app_name at some stage, then converged towards oneof aiming for simplicity following the discussion with @wkloucek.

Now I don't see a problem to keep both, and even enforcing user to be REQUIRED: the idea being that a WOPI-mediated operation is carried on behalf of one of the users, so we have enough context in WOPI to store the user that last modified an app lock. This way we could enable the web UI to show info such as File is being edited in Collabora, last editor was user123.

@butonic
Copy link
Contributor Author

butonic commented Jan 24, 2022

Hehe, so I am coming from a UML tool developer perspective here: whenever you want to model something you need to allow inconsistent state in order to truly model the full lifecycle of objects. In this case, while an app may hold a WOPI lock, all users may have left the session ... then the user should be empty. I don't know if it is even possible to get that information out of the application. I admit we may have a different approach of building apis: I am trying to build a generic API that can transport different intenst, whereas you seem to try to already enforce limitations.

But now I'm interested in why @wkloucek proposed the oneof?

@glpatcern
Copy link
Member

Ahah yes, seems like I'm trying to constrain what should not be done!

For that scenario of an empty user, it is actually impossible: when the last user quits a collaborative session, the app unlocks the file. And even if the app forgets to unlock, the lock metadata is still semantically correct: it contains the user that last performed an action through the app.

Copy link
Member

@glpatcern glpatcern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are largely cosmetic changes, but by now you know I prefer a spelled out and more constrained behavior ;-)

cs3/storage/provider/v1beta1/resources.proto Outdated Show resolved Hide resolved
cs3/storage/provider/v1beta1/resources.proto Outdated Show resolved Hide resolved
cs3/storage/provider/v1beta1/resources.proto Outdated Show resolved Hide resolved
cs3/storage/provider/v1beta1/resources.proto Outdated Show resolved Hide resolved
cs3/storage/provider/v1beta1/resources.proto Outdated Show resolved Hide resolved
butonic and others added 2 commits January 25, 2022 12:48
@butonic
Copy link
Contributor Author

butonic commented Jan 25, 2022

@labkode @ishank011 hit merge unless you have any objections, pls.

@butonic
Copy link
Contributor Author

butonic commented Jan 25, 2022

Regarding locks: how should a client make an upload, rename, delete, copy or move call when he locked a resource? AFAICT clients need to sent along the lock id in both WOPI (as X-WOPI-Lock header) and WebDAV (as If header). For GRPC we can either add a LockID parameter to the requests, or we send the headers as grpc metadata ... and put them into the context for storage drivers to check.

We don't need to add that to this PR. But it is something I need to consider for the implementation. As does every storage driver implementation that wants to support locking.

@glpatcern
Copy link
Member

For GRPC we can either add a LockID parameter to the requests, or we send the headers as grpc metadata ... and put them into the context for storage drivers to check.

This makes perfect sense but like you said we don't need to add this for now. In particular, enforcing the LockID check via GRPC should be a development coordinated with extending the WOPI server to always pass down the line its WOPI id for its CS3 requests to Reva, otherwise the WOPI server would cut itself out after locking a file!

So yes you should consider it for "your" implementation (as well as for our eos storage driver) but let's coordinate.

That said, if no further comments I'll merge by lunch break.

@glpatcern glpatcern merged commit 2fb79d4 into cs3org:main Jan 26, 2022
@butonic butonic deleted the change-lock-mtime-to-expiration branch January 26, 2022 13:02
@butonic
Copy link
Contributor Author

butonic commented Jan 26, 2022

@glpatcern true, I'll add an implementation for the ocdav service in my PR for WebDAV locks. Like the wopi server it is a client to the GRPC CS3 api and needs to send the lockid with modifying requests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants