Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

agent: use automations for all the things #1910

Merged
merged 7 commits into from
Feb 3, 2025
Merged

Conversation

psFried
Copy link
Member

@psFried psFried commented Jan 30, 2025

Description:

Resolves #1902

Reworks how agent processes async jobs by replacing Handlers with automations::Executors. The old handlers module is completely removed. The main reason for this is that automations have more appropriate error handling, which prevents failures in one task from impacting the execution of other tasks.

There's a commit per type here, with a bit more detail.

Deploying this will require a brief downtime in order to avoid having multiple agents try to process the same jobs. Deploying with zero downtime would be possible if we first update the existing agents to skip processing any jobs for which an internal.tasks row exists. This didn't seem worth the effort to me, since we should be able to limit the downtime to just a few minutes. So the plan is:

  • Deploy agents
  • Wait until all old agents have exited
  • Run the migration

This change is Reviewable

@psFried psFried force-pushed the phil/alltomations branch 2 times, most recently from 9f35b7c to 406beb2 Compare January 31, 2025 20:28
@psFried psFried changed the title agent: use automations for publications agent: use automations for all the things Jan 31, 2025
@psFried psFried marked this pull request as ready for review February 3, 2025 12:12
@psFried psFried requested review from jgraettinger and jshearer and removed request for jgraettinger February 3, 2025 12:13
Copy link
Contributor

@jshearer jshearer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a couple quick questions but LGTM -- I love the way all of these tasks can be reframed as automations

crates/agent/src/publications/commit.rs Show resolved Hide resolved
crates/agent/src/discovers/executor.rs Outdated Show resolved Hide resolved
Removes the publications handler and replaces it with an `automations::Executor` impl.

This required some changes to how we update the `publications` table row in
order to make the automation task idempotent.  It's possible for the
publication to commit and for the agent to fail after that point, but before
it's able to update the `publications` row to let the user know that it was
successful. it's able to update the `publications` row to let the user know
that it was successful. In that case, we could end up attempting the
publication a second time. So this updates our publisher to allow for updating
the `publications` row in the same transaction that updates the live specs, so
that we can tell whether a given publication has been completed before
attempting a (potentially second) run.

Also removes the legacy `background` and `auto_evolve` fields, which were no
longer used.

The approach for integration tests is to have a separate `automations::Server`
for each type of task, so that we can run them independently of one another.
Swaps discovers from the old `Handler` to the new `automations::Executor`.
For this it seemed pretty straight forward to commit the results of the discover
as part of the `automations::Outcome::apply` function. The integration test
harness was updated to create single-type `automations::Server` instances
as needed, since it seemed silly to keep both the regular and type-erased
versions of each executor as properties of the harness.
Swaps out evolutions from handlers to automations. There's no existing
integration test coverage for interactive evolutions, so no impact there.
Basically the same treatment as the others.

I found it easier to switch the tests over to using the `TestHarness`, so we
can re-use it's existing functionality for running automations tasks and
clearing out the database before each test run.
Switches the `connector_tags` job from a `Handler` to an
`automations::Executor`, similar to the others.
Replaces the final `Handler` with an `automations::Executor`, and fully removes
the `handlers` module.
Updates the handler-automations migration to create tasks for any existing jobs
that are still queued.  This ensures we don't miss any during the cut over.

Also tweaks the trigger to set the `wake_at` and `inbox` as part of the insert,
just as a minor simplification. They were only separate calls because the
initial version of this used an `on conflict do nothing` to allow for a task to
already exist. But now we require that the task does _not_ exist (and prevent
updates to already-queued tasks) and so there's no reason to do this in two
separate steps.
@psFried psFried merged commit 7d6f557 into master Feb 3, 2025
3 checks passed
@psFried psFried deleted the phil/alltomations branch February 3, 2025 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

control-plane: move legacy handlers into automations
2 participants