This repository has been archived by the owner on Oct 11, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Split backups by (resource owner, service) at CLI layer #1609
Split backups by (resource owner, service) at CLI layer #1609
Changes from 2 commits
a86bc00
507cca9
8ad0fe7
5944a0c
0906a6c
dc80934
ae7b59d
bbb1fb0
6b30e52
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the plan for iterating over the resource owner here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this was mostly me attempting to get something working with what we already have. The issue I ran into with this patch is that there's no way to extract the other fields from the scope to make a selector. We avoid this problem in GC because we just iterate on the users and pass the full scope around. Here though we need to extract the user and then make a new selector that contains all the information in the current scope except with the users slice replaced by the user we have in the current iteration.
I think long-term it would be better to have some selector function that would return either scopes (to be given to
Include
to make a newSelector
) orSelector
s that have the cartesian product of(user, other options)
for each scope. This could then replace both the call toDiscreteScopes
and iterating over users in a scope.I don't feel I know enough about selectors to know how to implement something that would return the cartesian product in a clean manner though so if you could help out there I'd really appreciate it. Also, while I've said "users" here, it's really going to be resource owners because this same pattern will also be used in other services
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. This complication is where I got the idea to slice the resourceOwners out of the scopes and elevate them to the selector itself. If we imagine a selector structure like:
then the cartesian product can be handled as, "for each resource owner (the backup boundary), this selection of scopes (the data boundary within the backup)". Which is a statement that more closely matches the design intent of corso usage than our current system.
To get the product, splitting selectors could be handled as
sel.SplitByOwners(users) => []selectors
(one selector for each user). Or, maybe even more simply,sel.DiscreteOwners(users) => []string
(returns users if *, or the selected set of users) where the resource owner list/individual id gets passed to callers alongside the selector. Kinda like how GC does it down the line with scopes.If we don't want to pull the above lift, I think the next best option is to add a func similar to discrete scopes that returns a slice of scopes (one per resource owner), rather than a single scope. Ex:
SplitDiscreteScopes(user) => []scope
. It keeps the world messier when we think about what it means for a selector to contain multiple overlapping/conflicting scopes relative to their generated backups (eg: we'd split operations along scopes, which is more like a per-category tuple, and a single selector can list multiple category scopes with the same user meaning multiple backups for that user in a single selection). But it'll get us from A to B in a short time.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit more implementation specific, but in the interest of removing footguns and sticking to the new assumption that each backup will represent only a single (resource owner, service) pair I would highly recommend making whatever houses this information a different type than a
selector.Selector
so that Backup cannot be called on something that has multiple resource ownersI'm not really a fan of separating out this information. GC may split it out, but I think it would require a fair amount of busy work to pull that information out through the whole stack and I don't really think it would get us anything extra. I'd prefer to stick to some thing like
[]selector.Selector
Selector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
=>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks reasonable as long as
sel
in the example you gave is of typeBoundary
.Is this something you can implement? I'm still not confident I can navigate the selectors package to add behavior easily π
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
sel
in example containsthe set of resource owners
. Theboundary
exhibits a single resource owner. Names can be swapped as wanted, important part is that arepo.NewBackup
would require the struct that is specific to an owner.Sure can. I'll make a ticket for it. However, I don't want to block you on this work. What will be helpful in the meantime is for you to continue with the tools at hand (basically, keep on with this PRs design, discreteScopes and all). It'll be ugly for a minute, and then we'll tidy it up again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm kind of stuck at the moment with the current approach. I don't know of a way to easily convert a scope with many resource owners to a selector for a single resource owner. I don't know how to get the other properties in the scope (excluding resource owners). If I could do that then I could do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ticket in question: #1617
I'd suggest the following:
scope.Clone()
method. Basic "for k, v in map1; map2[k] = v".set(cat, val, ops)
scope method.For the time being, that'll cause a backup like
--user u1 --data email,contacts
to produce two backups, even though it could have only done one. Unhappiness like that aside, would that unblock you?