Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[nexus] webhooks #7277

Draft
wants to merge 45 commits into
base: main
Choose a base branch
from
Draft

[nexus] webhooks #7277

wants to merge 45 commits into from

Conversation

hawkw
Copy link
Member

@hawkw hawkw commented Dec 18, 2024

No description provided.

hawkw and others added 2 commits December 18, 2024 12:00
This commit adds (unimplemented) public API endpoints for managing Nexus
webhooks, as described in [RFD 364][1].

[1]: https://rfd.shared.oxide.computer/rfd/364#_external_api
@hawkw hawkw force-pushed the eliza/webhook-models branch from 51f7f8e to 139cfe6 Compare December 18, 2024 21:10
@hawkw hawkw changed the base branch from eliza/webhook-api to main December 18, 2024 21:11
@hawkw hawkw requested a review from augustuswm December 18, 2024 21:11
@hawkw hawkw force-pushed the eliza/webhook-models branch from df87c31 to 326b3f7 Compare January 4, 2025 00:38
@hawkw hawkw force-pushed the eliza/webhook-models branch from 140aea4 to 0b80c8f Compare January 8, 2025 17:28
hawkw added 5 commits January 9, 2025 12:19
this way, rather than transforming the globs into patterns that we can
match in the database every time an event is dispatched, we instead
use globs to generate "exact" subscriptions to specific named event
classes. this avoids doing a full scan over the subscription table every
time an event is dispatched, since we can look up subscribed receivers
by event class name instead.

we'll have to do this processing on receiver creation or when a
subscription is added, and when new event classes are added (e.g. on
software updates).

thanks @andrewjstone for pointing me in this direction!
@hawkw hawkw changed the title [nexus] Webhook DB models [nexus] webhooks Jan 11, 2025
@hawkw hawkw force-pushed the eliza/webhook-models branch from 41cf0b0 to 2bc5925 Compare January 17, 2025 19:20
@hawkw
Copy link
Member Author

hawkw commented Jan 24, 2025

I think I've come around a bit to @andrewjstone's proposal that the event classes be a DB enum, so I'm planning to change that. I'd like to have a way to include a couple "test" variants in there that aren't exposed in the public API, so I'll be giving some thought to how to deal with that.

@hawkw
Copy link
Member Author

hawkw commented Jan 24, 2025

I think I've come around a bit to @andrewjstone's proposal that the event classes be a DB enum, so I'm planning to change that.

Glob subscription entries in webhook_rx_event_glob should capture the schema version when they're created, so that we can trigger reprocessing (generating the exact event class subscriptions for those globs) if the schema has changed. It's probably fine for nexus to do glob reprocessing on startup rather than in a bg task, although online update might invalidate that assumption.

@hawkw
Copy link
Member Author

hawkw commented Jan 24, 2025

As far as GCing old events from the event table, dispatching an event should probably add a count of the number of receivers it was dispatched to, and then when we successfully deliver the event, we increment a count of successes. That way, we would not consider an event entry eligible to be deleted unless the two counts are equal; we want to hang onto events that weren't successfully delivered so any failed deliveries can be re-triggered.

GCing an event would also clean up any child delivery attempt records.

Comment on lines +61 to +69
// This performs a `SELECT ... FOR UPDATE SKIP LOCKED` on the
// `webhook_event` table, returning the oldest webhook event which has not
// yet been dispatched to receivers and which is not actively being
// dispatched in another transaction.
// NOTE: it would be kinda nice if this query could also select the
// webhook receivers subscribed to this event, but this requires
// a `FOR UPDATE OF webhook_event` clause to indicate that we only wish
// to lock the `webhook_event` row and not the receiver.
// Unfortunately, I don't believe Diesel supports this at present.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this isn't going to work with CRDB, as it doesn't actually implement SKIP LOCKED. We're going to have to redesign the mechanism that prevents duplicate delivery creation. Another option could be to change the webhook_delivery table to have a UNIQUE constraint on (rx_id, event_id), but this doesn't work with the way I wanted to implement the resend event API (creating a new entry in webhook_delivery.

I think the correct thing to do here is something like:

INSERT INTO omicron.public.webhook_delivery (
    {rx_id},
    {event_id},
    /* ... */
)
WHERE NOT EXISTS (
     SELECT 1 FROM omicron.public.webhook_delivery
     WHERE rx_id == {rx_id} AND event_id == {event_id}
)

so that we can prevent creating duplicate deliveries without a lock. But, I can't seem to find a way to do INSERT ... WHERE NOT EXISTS in Diesel, so this probably needs to be a CTE...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant