Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce caveat support in CockroachDB #921

Merged
merged 1 commit into from
Oct 25, 2022
Merged

introduce caveat support in CockroachDB #921

merged 1 commit into from
Oct 25, 2022

Conversation

vroldanbet
Copy link
Contributor

@vroldanbet vroldanbet commented Oct 20, 2022

Closes #916

What

introduces support for caveats in CockroachDB datastore

How

  • following a similar design to namespaces
  • introduce corresponding migration (add new table, add new columns to relation_tuple
  • Watch API implementation was adjusted to read changes on caveats from the After payload which we weren't using beore because everything came via the Primary Key
  • adjusted tests comparing revisions from reads and writes, since that assumption does not hold with CRDB (they are monotonically increasing tho)
  • added new assertion in test suite to make sure TOUCH operations that only change caveat name of a relationship also works as intended
  • removed flag from NewPGXExecutor given now PSQL and CRDB are aligned
  • using caveat names as overlapping key

@github-actions github-actions bot added the area/datastore Affects the storage system label Oct 20, 2022
@vroldanbet vroldanbet self-assigned this Oct 20, 2022
@vroldanbet vroldanbet force-pushed the crdb-caveats branch 5 times, most recently from ee2b1a0 to 110e411 Compare October 21, 2022 12:44
@vroldanbet vroldanbet added priority/0 urgent This must be done ASAP area/caveats Affects caveated relationships labels Oct 21, 2022
@vroldanbet vroldanbet marked this pull request as ready for review October 21, 2022 12:49
@vroldanbet vroldanbet requested a review from a team October 21, 2022 12:49
})
}

func (cr *crdbReader) executeWithOverlappingKeys(ctx context.Context, f func(ctx context.Context, tx pgx.Tx) ([]string, error)) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is allocating an extra slice to track overlap keys, why not just loop over the names similar to how we do for namespaces?

there will almost always be only 1 key (and the "keyset" only stores it once), so it's not much help to store it N times. The default is even simpler - we discard the keys entirely (just write to a single static key instead).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct, it's a tradeoff I deliberately took, although I'm not 100% sold given the overhead. The goal was to spare us a bunch of repeated lines of code. This the executeWithOverlappingKeys could be used in other methods of the readerwriter implementation.

I'm not sure I follow what you mean the keyset stores it just once. Isn't each caveat name added to the keyset to compute the overlapping?

  • WriteCaveats accepts a slice of caveats. Each name is an overlapping key.
  • DeleteCaveats also accepts a slice of caveats, same here

Copy link
Contributor

@ecordell ecordell Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow what you mean the keyset stores it just once. Isn't each caveat name added to the keyset to compute the overlapping?

There's two types of keyset: static and prefix.

If it's a static keyset, then addOverlapKey is a no-op, we always touch the same key.

If it's a prefix keyset, then addOverlapKey extracts the prefix (i.e. mynamespace/<whatever>) and only stores the prefix.

In theory you can make requests with multiple prefixes, in practice it's always just 1 prefix.

By collecting up the keys like this, you'll build a big list in memory only to throw it away (because we actually only need the on single prefix from the request, not an array of all of the caveat names)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ecordell I've updated the code to handle overlapping key one level above and avoid the extra allocation

In theory you can make requests with multiple prefixes, in practice it's always just 1 prefix.

I'd like to understand this better. Given WriteCaveats will be handled by API WriteSchema, wouldn't it be possible to send a schema write request with caveats / namespaces with different prefixes? If so, doesn't that require us to loop over all namespaces and caveats to make sure we don't miss any overlapping key?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, doesn't that require us to loop over all namespaces and caveats to make sure we don't miss any overlapping key?

Yes, and the "keyer" is for tracking all prefixes from the request. But most deployments will either not use prefixes at all (authzed dedicated, spicedb users) or will have a single prefix enforced a layer up (authzed serverless). And in SpiceDB deployments that do use prefixes, the default "overlap strategy" is static (meaning all namespaces write to a single static key).

If you use the "prefix" overlap strategy, then prefixes in SpiceDB w/ CRDB are really "new enemy protection domains" rather than just for organization. I don't think we have any telemetry on it, but I doubt there are many (if any) SpiceDB users who have bothered - it's only really useful if you are trying to enforce multitenancy on top of SpiceDB and/or you're hitting QPS limits with static (and can untangle the namespaces in some reasonable way / don't care about newenemy problem)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context, helps a lot! Intentionally using prefix separation to reduce the bottleneck caused by serialization errors could indeed be an option for certain SpiceDB deployments, but still to be seen in real-life (beyond Authzed Serverless 😄 ). In any case, my take away is that, while it is relatively rare, we still need to make sure we account for all potential prefixes in an incoming request.

internal/datastore/crdb/watch.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ecordell ecordell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vroldanbet vroldanbet merged commit bf05adf into main Oct 25, 2022
@vroldanbet vroldanbet deleted the crdb-caveats branch October 25, 2022 13:22
@github-actions github-actions bot locked and limited conversation to collaborators Oct 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/caveats Affects caveated relationships area/datastore Affects the storage system priority/0 urgent This must be done ASAP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support caveats in CockroachDB datastore
2 participants