Meta-Issue: Versioning of resources in CLAW #740
Comments
@bseeger good questions! some thoughts
I have so many questions really. We pushed the versioning/revision work far into the future, but I feel the reasons were mostly correct: at the time we felt Fedora 4's versioning system could change and was not completely consistent, because its tree snapshotting, constrained by LDP, differs from the per-resource versioning in Fedora 3 and in Drupal |
There will not be tree snapshots in Fedora anymore. Each resource will be versioned independently. But I have a more interesting question. If we have versions in Drupal and we could restore a Drupal version, save it and that would PUT it into Fedora overwriting the object. Do we need versioning turned on in Fedora? In Fedora we are thinking that a restore will probably be a manual process of the GET of a Memento and a PUT to the live resource anyways. |
@whikloj, cool, but how is LDP tree integrity maintained if Fedora only takes per-resource snapshots? OK, I guess that is out of scope here... but maybe not completely: by that reasoning we could do the same in Drupal and avoid maintaining linked data integrity, if Drupal allows it (Drupal actually treats this as linked entity integrity, at the DB level). An interesting question indeed |
So for the LDP tree we are going to remove referential integrity from Mementos, so if you:
That memento of /foo will still say The actual LDP tree has integrity, but the past versions are just pictures of a resource in time. It will be up to you to determine how much of the tree you want to preserve, by traversing it and taking snapshots if you so choose. |
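To make the "traverse it and take snapshots" idea concrete, here is a minimal sketch. It assumes a containment lookup and a version-creating call that are passed in as plain callables; both are hypothetical stand-ins, not Fedora or Chullo APIs, and the `/foo/...` child paths are made up for illustration.

```python
from typing import Callable, Iterable, List


def snapshot_tree(
    root: str,
    children_of: Callable[[str], Iterable[str]],
    create_version: Callable[[str], None],
) -> List[str]:
    """Walk the containment tree depth-first and version each resource once."""
    seen = set()
    order: List[str] = []
    stack = [root]
    while stack:
        uri = stack.pop()
        if uri in seen:
            continue
        seen.add(uri)
        create_version(uri)  # each resource is versioned independently
        order.append(uri)
        # Reverse so children are visited in the order they were listed.
        stack.extend(reversed(list(children_of(uri))))
    return order


# In-memory stand-in for the LDP containment triples (assumed data).
TREE = {"/foo": ["/foo/bar", "/foo/baz"], "/foo/bar": ["/foo/bar/qux"]}
versioned: List[str] = []
order = snapshot_tree("/foo", lambda uri: TREE.get(uri, []), versioned.append)
```

How deep to walk is the caller's choice: stopping `children_of` early preserves only part of the tree.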
@whikloj great explanation! That makes sense. May I ask you another question? Having snapshots as non-validated pictures of moments in time is very good. So what happens when you want to restore a certain point, and the integrity of the old version (it refers to resources, WebACLs or even namespaces that no longer exist, who knows?) conflicts with the current LDP state? Is that where a manual GET and PUT (probably with fixing and editing) is needed? In other words, automatic restoring is not possible, right? |
@DiegoPino right In the case of my example if you wanted to restore the Memento of
So you would need to edit the resource between GET and PUT. There is a larger question of whether Fedora should drop referential integrity altogether, but that is for a different time. |
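A rough sketch of that GET, edit, PUT restore, assuming the HTTP calls are injected as callables so it runs without a repository. The function names, the Memento URI shape and the `fix_up` hook are all hypothetical and only illustrate the workflow described above.

```python
from typing import Callable, Tuple

Response = Tuple[str, bytes]  # (content type, body)


def restore_memento(
    memento_uri: str,
    resource_uri: str,
    http_get: Callable[[str], Response],
    http_put: Callable[[str, bytes, str], None],
    fix_up: Callable[[bytes], bytes] = lambda body: body,
) -> bytes:
    """GET the Memento, let the caller repair it, then PUT it to the live resource."""
    content_type, body = http_get(memento_uri)
    repaired = fix_up(body)  # e.g. drop triples pointing at deleted resources
    http_put(resource_uri, repaired, content_type)
    return repaired


# Fake repository for demonstration only.
STORE = {
    "/foo/fcr:versions/20190101000000": (
        "text/turtle",
        b'<> <http://example.org/rel> </foo/bar> .\n'
        b'<> <http://purl.org/dc/terms/title> "Foo" .\n',
    )
}


def fake_get(uri: str) -> Response:
    return STORE[uri]


def fake_put(uri: str, body: bytes, content_type: str) -> None:
    STORE[uri] = (content_type, body)


def drop_dangling(body: bytes) -> bytes:
    # Assumes /foo/bar no longer exists, so the reference would be rejected.
    kept = [line for line in body.splitlines(keepends=True) if b"</foo/bar>" not in line]
    return b"".join(kept)


restored = restore_memento(
    "/foo/fcr:versions/20190101000000", "/foo", fake_get, fake_put, drop_dangling
)
```

The default `fix_up` is the identity, which corresponds to the case where the old version still agrees with the current tree.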
@whikloj nice. So that piece of the workflow is clear to me now. We need to compare it with Drupal's. I need to research that piece more. Thanks for walking me (us) through it |
@whikloj @bseeger @DiegoPino I'm inclined to leverage Drupal's core versioning features as much as possible. I'd prefer saving and restoring versions be done through Drupal's UI and database. That work is already done for us and we'd never have to talk to Fedora directly. If people want to push versions from Drupal to Fedora for posterity, we can certainly automate that for folks. That feels like a valid use case and in the spirit of digital preservation. But there are times where users wouldn't want or need it, so we'll need to make sure the feature can be turned off. In particular I'm thinking about instances with Fedoras that are configured to auto-version. |
+1. The less people have to be aware of Fedora the more CLAW shows its advantages. |
Dusting this off and turning it into a meta-issue for our roadmap. We will need to develop a gameplan to move forward on this one. If we stick to using Drupal's core (and possibly contributed?) functionality to manage versions, then we need to have Camel listeners responding to CRUD events on Drupal versions and create Mementos in Fedora accordingly. |
I am very interested in this issue and have taken the following approach:
I'm not sure if this is
Although it is convenient that drupal creates "revisions" of nodes, I'm not sure it's actually helpful for passing anything to Fedora. https://www.drupal.org/docs/8/modules/jsonapi/revisions |
@elizoller I think adding createVersion and getVersions to Chullo will be needed, so if you want to submit a PR to Chullo to add them, that would be great. I think you have too much of the Fedora logic up at the Milliner level, along with a couple of assumptions we should avoid. I would pass the resource's URI to chullo (ie. This way:
|
Sound advice @whikloj. I'll definitely second sticking to advertised headers in a HEAD request. Don't let the fact that you're making an extra request deter you @elizoller. It's a bullet proof way to deal with the issue. |
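One way to read "stick to advertised headers": rather than building the versions URL by string concatenation, look for the `rel="timemap"` link (the Memento relation) in the `Link` headers of a HEAD response. The sketch below only parses header values, so it runs offline. The example URL is made up, and the parser is deliberately simple, not a full RFC 8288 implementation.

```python
import re
from typing import Iterable, Optional

# Matches "<target>" followed by its ";param=value" pairs.
_LINK = re.compile(r"<([^>]*)>\s*((?:;\s*[^;,]+)*)")


def find_link(link_headers: Iterable[str], rel: str) -> Optional[str]:
    """Return the first link target whose rel list contains `rel`, else None."""
    for header in link_headers:
        for target, params in _LINK.findall(header):
            for param in params.split(";"):
                name, _, value = param.strip().partition("=")
                if name.lower() == "rel" and rel in value.strip('"').split():
                    return target
    return None


# Link header values as they might come back from a HEAD request (assumed).
headers = [
    '<http://localhost:8080/fcrepo/rest/foo/fcr:versions>; rel="timemap", '
    '<http://www.w3.org/ns/ldp#Resource>; rel="type"',
]
timemap = find_link(headers, "timemap")
```

If `find_link` returns `None`, the resource is not advertising a TimeMap and the caller can skip versioning instead of guessing a URL.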
@elizoller Also, might be retries if you're seeing six fails in a row? Check the karaf logs too and see what islandora-indexing-fcrepo is up to. |
Thanks for the advice @whikloj and @dannylamb |
So this could create a lot of versions in Fedora, and as Mementos are not deltas but immutable copies of the resource at the time of versioning, this could introduce some bloat. I'm also thinking specifically of binaries and how the flysystem connector deals with this? But we will need to work that out anyways as part of this meta-issue. Functionally I think you need to provide the desired BODY as part of your POST request if you provide a specific timestamp (here). We will probably need a way to enable/disable versioning so it can be turned off for those that don't want it yet. But otherwise I think you could open PRs (even draft PRs if you'd like) and we can work out the issues there. If you're comfortable with that. |
@elizoller If you want, I'd start with the Chullo PR and we'll go from there. There's a lot of angles that need to be explored w/r/t when versions are sliced, but the utility functions to do the actual slicing can come in now np. |
I understand the concern about bloat since Memento creates full copies. I am curious to see how Fedora 6 will implement OCFL because, from what I see, OCFL stores versions as deltas, not full copies, and so I wonder how that will change the way Fedora thinks about versioning. From what I have heard, Fedora 6 will have the option to be OCFL compliant but won't necessarily require it. This is only nodes, so it wouldn't affect binaries, and this is not Flysystem either, right? Just looking at the docs again for when a timestamp is provided (https://wiki.duraspace.org/display/FEDORA5x/RESTful+HTTP+API+-+Versioning#RESTfulHTTPAPI-Versioning-BluePOSTCreateanewversionedresource(anewLDPRm)) and I see what you're saying. Example one creates a version right now and requires no parameters. Example two creates a version at a specific time and supplies both the time and the BODY to do so. I will work to adjust the Chullo API to handle both of those examples more completely. I agree, especially due to the bloat mentioned above, that versioning should be a feature that can be enabled/disabled through Drupal. I am thinking that, to do so, it would need to be a separate "action" in Drupal from the index in Fedora action, so that it could be connected to a setting that controls whether the action is included in the context. Otherwise, MillinerServer doesn't have any clue about Drupal settings, right? @dannylamb I'll put in a PR to Chullo when I finish fleshing out the second example in the Fedora API docs. |
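The two cases from the Fedora docs could be sketched as below. This only builds a request description and sends nothing. The function name and the dict shape are assumptions for illustration, not the Chullo API. The timestamped case uses a `Memento-Datetime` header in RFC 1123 format.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Any, Dict, Optional


def build_version_request(
    timemap_uri: str,
    when: Optional[datetime] = None,
    body: Optional[bytes] = None,
    content_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Describe the POST that would create a Memento at `timemap_uri`."""
    request: Dict[str, Any] = {
        "method": "POST",
        "url": timemap_uri,
        "headers": {},
        "body": None,
    }
    if when is None:
        # Example one: version the resource as it is right now, no parameters.
        return request
    if body is None or content_type is None:
        raise ValueError("a timestamped version needs a body and a content type")
    # Example two: caller supplies the time and the representation to store.
    # Naive datetimes are interpreted as local time by astimezone().
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    request["headers"] = {"Memento-Datetime": stamp, "Content-Type": content_type}
    request["body"] = body
    return request


now_request = build_version_request("http://localhost:8080/fcrepo/rest/foo/fcr:versions")
dated_request = build_version_request(
    "http://localhost:8080/fcrepo/rest/foo/fcr:versions",
    when=datetime(2019, 1, 1, tzinfo=timezone.utc),
    body=b'<> <http://purl.org/dc/terms/title> "Foo" .',
    content_type="text/turtle",
)
```

Raising when a timestamp arrives without a body keeps the two cases from being mixed up silently.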
@elizoller You got it. If we push all this into context, it becomes infinitely configurable. Everything milliner needs to know should get pushed into the message that goes on to the queue and away it goes. Milliner will need a special route for versions (or respect certain headers maybe), but it shouldn't be too big of a deal. |
Upvote for Context here. That would allow very finegrained control over when a new version is created. |
@elizoller regarding Fedora 6 and OCFL... if your institution has any desires or requirements I'd recommend you (and maybe @tallgood) should make your voices heard. I am also 👍 for context and suggest that passing the |
@whikloj and/or @dannylamb I modified the islandora module like so: 8.x-1.x...asulibraries:versioning which basically adds an action for creating a version, and if the event["type"] is "Version" then it adds "createVersion" as true (otherwise sets it to false) on the $event["object"] which gets emitted. Not really sure if this works since I don't know how to develop in/debug Alpaca. Any advice on that? After https://github.com/Islandora-CLAW/Alpaca/blob/d2a9d71582a0b8745d7a02aae2658d1c5c60bdf0/islandora-indexing-fcrepo/src/main/java/ca/islandora/alpaca/indexing/fcrepo/FcrepoIndexer.java#L131 I would need to add a line like `.setProperty("createVersion").simple("${exchangeProperty.event.object.createVersion}")`. In Milliner, I could see this going two ways. One would be to modify the saveNode route and corresponding method to look at the createVersion parameter coming from Alpaca (assuming that just setting the property above will pass it through to Milliner). Then saveNode might look something like this:
otherwise, if you think it should be a separate route in milliner, something like let me know if i'm on the right track (or not) here. |
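For readers following along, the flag logic described above (set `createVersion` on the emitted object when the event type is "Version") can be sketched in a few lines. This is not the code from the branch or from Milliner. The function names and the `versioning_enabled` switch are hypothetical, and the switch only stands in for the enable/disable setting discussed earlier.

```python
from copy import deepcopy
from typing import Any, Dict


def annotate_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the event with object.createVersion set from its type."""
    annotated = deepcopy(event)
    annotated["object"]["createVersion"] = annotated.get("type") == "Version"
    return annotated


def wants_version(event: Dict[str, Any], versioning_enabled: bool = True) -> bool:
    """Consumer-side check: honour the flag only if versioning is switched on."""
    flag = event.get("object", {}).get("createVersion", False)
    return versioning_enabled and bool(flag)


version_event = annotate_event({"type": "Version", "object": {"id": "urn:uuid:1234"}})
update_event = annotate_event({"type": "Update", "object": {"id": "urn:uuid:1234"}})
```

Because the flag travels inside the message, the consumer needs no knowledge of Drupal settings beyond what arrives on the queue.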
@elizoller nice work 👏 . We do need to add some logging to Alpaca to help with people working in the stack. Two things I can think of are:
@dannylamb might have more thoughts on this. But at some point I would start opening PRs... once a change gets really big it can be daunting to test and get in. Don't feel it has to be complete, changes can come later. |
@elizoller Throw up what you've got into some PRs and we'll go from there. You're tackling an issue that cuts all the way through the stack (good on ya!) so it'll take some time to massage all the moving parts until they start working together. It's a bit of a trial by fire, but by the time you're done, you'll be able to do pretty much anything in Islandora 8. So throw up those PRs and we'll jump in to help you out. You're not doing this alone. And don't be too scared of Camel. If you can grok Drupal's migrate framework, you'll be just fine with Camel. They're actually very very similar in the end, it's just, y'know... Java. |
I checked this set of PRs again and I'm fairly certain that the call to update Fedora is fired off before the call to create the new version (see https://github.com/Islandora/Alpaca/pull/61/files#diff-a0a4e90c5cf5024700844b61bbe9e12eR139) |
is it possible this issue can be closed? |
In regards to versioning media objects, I am working on that (once I can build Alpaca again... :P ). I think files will probably have to have a discussion? |
See above PRs for versioning on media. They aren't perfect but I think it's a pretty good start. I was just messing around with versioning files and I ran into an interesting thing where Drupal says you can't overwrite an existing file. @seth-shaw-unlv I think you mentioned something about what are we going to do about files in a past tech call but I didn't see the issue until I actually went to try to implement it in the FedoraAdapter for Flysystem. And it actually worked 😱 (versions available from fedora fcr:versions endpoint) |
Contrib modules FTW. Keep the technical debt low! |
@DonRichards brought up file versioning yesterday in conversation so I hunted down this issue. There are some good PRs that have sat for a bit. We should test these and bring them in as soon as possible. |
Versioning files PR is up at Islandora/islandora#793 Also not sure why the Fedora tests are failing in https://travis-ci.org/github/Islandora/islandora/jobs/718940804 |
@elizoller 's Islandora/islandora#793 has been merged. Have we thought about a UI in Drupal for viewing/restoring/deleting file versions? Open a new issue? |
I vote new issue. This one is a bit long in the tooth, and it would help to focus discussion forward. |
How will versioning be done in CLAW? This issue is a reminder to consider how versioning will be integrated at the Drupal layer and how that will integrate with fedora's memento versioning model.
As discussed in a recent tech meeting, the idea might be to follow what Drupal does. By default it looks like Drupal does history based revisioning - meaning any change is revisioned. I found this by looking at: 'Structure' -> 'Content Types' -> pick a content type and 'edit' -> look at 'Published Options' for 'Create new revision'. This was checked by default in my system.
Some notes/considerations from the tech meetings: