Sending event id to Sidekiq instead of serialized event #755

matiasgarcia · 2020-08-21T13:00:59Z

In the docs, there is some suggested code to manage async handlers on Sidekiq.

The Scheduler is as follows:

module RailsEventStore
  class SidekiqScheduler
    def call(klass, serialized_event)
      klass.perform_async(serialized_event.to_h)
    end

    def verify(subscriber)
      subscriber.is_a?(Class) && subscriber < Sidekiq::Worker
    end
  end
end

And the AsyncHandler from the library is:

module RailsEventStore
  module AsyncHandler
    def perform(payload)
      super(Rails.configuration.event_store.deserialize(payload.symbolize_keys))
    end
  end
end

This works perfectly fine, but we found ourselves hitting max memory errors on Redis under heavy load. That is when I realized that the Scheduler serializes the whole event.

I was wondering if it's possible to provide just the event_id and write a custom AsyncHandler to retrieve it.

Like this:

module RailsEventStore
  class SidekiqScheduler
    def call(klass, serialized_event)
      klass.perform_async(serialized_event.event_id)
    end

    def verify(subscriber)
      subscriber.is_a?(Class) && subscriber < Sidekiq::Worker
    end
  end
end

module RailsEventStore
  module AsyncHandler
    def perform(payload)
      event = RailsEventStoreActiveRecord::Event.find(payload)
      super(event)
    end
  end
end

Or is there any problem with doing so?


mostlyobvious · 2020-08-21T13:33:39Z

I was wondering if it's possible to provide just the event_id and write a custom AsyncHandler to retrieve it.

That sounds like a reasonable and totally valid approach, as long as the job is scheduled "after commit" (so the event is already committed and visible outside any transaction that may wrap the publish process).

Nitpicks:

- RailsEventStoreActiveRecord::Event is a private detail of an adapter, so you shouldn't use it directly. There's an API on Client for retrieving an event by its id: https://railseventstore.org/docs/read/#reading-specific-events
- I'd prefer using your app-specific namespace over redefining modules and classes in the RailsEventStore namespace when the scheduler and handler behave differently from the upstream ones. You can swap these components in configuration:

event_store = RailsEventStore::Client.new(
  dispatcher: RubyEventStore::ComposedDispatcher.new(
    RubyEventStore::ImmediateAsyncDispatcher.new(scheduler: MyScheduler.new),
    RubyEventStore::Dispatcher.new
  )
)

matiasgarcia · 2020-08-21T13:54:05Z

Thanks for the guidance @pawelpacana!

Our rails event store is configured as follows:

  event_store = RailsEventStore::Client.new(
    dispatcher: RubyEventStore::ComposedDispatcher.new(
      RailsEventStore::AfterCommitAsyncDispatcher.new(scheduler: RailsEventStore::SidekiqScheduler.new),
      RubyEventStore::Dispatcher.new
    )
  )

Could it be that with RailsEventStore::AfterCommitAsyncDispatcher I don't really have to publish events in an after_commit hook but instead after_save/update/destroy should be enough?

mpraglowski · 2020-08-21T15:36:41Z

There are two additional callbacks that are triggered by the completion of a database transaction: after_commit and after_rollback. These callbacks are very similar to the after_save callback except that they don't execute until after database changes have either been committed or rolled back. They are most useful when your active record models need to interact with external systems which are not part of the database transaction.

from: https://guides.rubyonrails.org/active_record_callbacks.html#transaction-callbacks

The "external systems which are not part of the database transaction" the documentation mentions here is the Redis used by Sidekiq to store scheduled jobs.

Could it be that with RailsEventStore::AfterCommitAsyncDispatcher I don't really have to publish events in an after_commit hook but instead after_save/update/destroy should be enough?

Then you are at risk of running handlers (Sidekiq jobs) for events that have not been stored because the transaction has been rolled back.

mostlyobvious · 2020-08-24T12:18:10Z

@matiasgarcia with RailsEventStore::AfterCommitAsyncDispatcher you're fine: the dispatcher does not schedule anything before commit or after rollback.

ApplicationRecord.transaction do
  # do something
  # Not scheduling yet: others might not see uncommitted changes from this
  # transaction, or it might be rolled back.
  event_store.publish(...)
end
# scheduling now
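
To make the deferral concrete, here is a minimal toy model of the idea in plain Ruby. It is only a sketch: ToyTransaction and ToyAfterCommitDispatcher are hypothetical names invented for illustration and are not how RailsEventStore implements AfterCommitAsyncDispatcher. The point is just that scheduling is buffered while a transaction is open, flushed on commit, and dropped on rollback.

```ruby
# Toy model only: none of these classes come from RailsEventStore.
class ToyTransaction
  def initialize
    @after_commit = []
  end

  # Register work to run once the transaction commits.
  def after_commit(&block)
    @after_commit << block
  end

  # Committing runs the buffered callbacks exactly once.
  def commit
    @after_commit.each(&:call)
    @after_commit.clear
  end

  # Rolling back discards the buffered callbacks without running them.
  def rollback
    @after_commit.clear
  end
end

class ToyAfterCommitDispatcher
  def initialize(scheduler)
    @scheduler = scheduler
  end

  # Inside a transaction the scheduling is deferred; otherwise it runs at once.
  def call(subscriber, event_id, transaction: nil)
    job = -> { @scheduler.call(subscriber, event_id) }
    transaction ? transaction.after_commit(&job) : job.call
  end
end

scheduled = []
scheduler = ->(subscriber, event_id) { scheduled << [subscriber, event_id] }
dispatcher = ToyAfterCommitDispatcher.new(scheduler)

committed = ToyTransaction.new
dispatcher.call(:send_email, "event-1", transaction: committed)
committed.commit # the job for event-1 is scheduled here

rolled_back = ToyTransaction.new
dispatcher.call(:send_email, "event-2", transaction: rolled_back)
rolled_back.rollback # the job for event-2 is never scheduled
```

In this model a handler can safely look the event up by id, because the job only exists once the event row is committed.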

matiasgarcia · 2020-08-28T16:04:16Z

Thanks @pawelpacana.

I was wondering if maybe a maintained version of this AsyncHandler should be in the gem? I am saving 92% of my Redis storage with this change.

mostlyobvious · 2020-09-01T09:34:22Z

It's an interesting tradeoff to save on Redis memory in exchange for an additional query on a database. I'm not sure yet what is better as a default choice in AsyncHandler.

I'd accept a PR with an alternative or configurable strategy for AsyncHandler for now 👍

paneq · 2020-09-05T09:17:43Z

@pawelpacana I like the idea!

I think we could change the implementation but keep the dispatcher/scheduler contract of (event, serialized_event) the same so that if someone wants to use the serialized version they can still do it. It's beneficial in case the consumer does not have access to RES storage.

matiasgarcia · 2020-11-02T14:57:31Z

It's an interesting tradeoff to save on Redis memory in exchange for an additional query on a database. I'm not sure yet what is better as a default choice in AsyncHandler.

I'd accept a PR with an alternative or configurable strategy for AsyncHandler for now

In our case, we were firing around 200k events in a short time span (10 minutes) with a Redis DB of 100MB storage. It quickly reached the maximum threshold, because our events tracked changes to AR models, which made the payloads quite big. So in this specific use case, it paid off to send just the event id. The additional query to fetch the event should be fairly quick since it is a lookup by id, although I don't have a benchmark to guarantee this :)
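
A rough way to see why the job size stops depending on the event is to compare the JSON-encoded arguments for both shapes. This is only a sketch: the event below is made up, the field names merely resemble a serialized record, and it measures the arguments alone, not everything Sidekiq stores per job. job_args_bytesize is a hypothetical helper.

```ruby
require "json"

# Hypothetical event whose data carries a large model change set.
full_payload = {
  "event_id" => "b2d506fd-409d-4ec7-b02f-c6d2295c7edd",
  "event_type" => "OrderUpdated",
  "data" => JSON.generate({ "changes" => { "notes" => ["", "x" * 2_000] } }),
  "metadata" => JSON.generate({ "request_id" => "abc123" })
}
id_only_payload = { "event_id" => full_payload["event_id"] }

# Size of the arguments once encoded as JSON, which is how they are queued.
def job_args_bytesize(payload)
  JSON.generate(payload).bytesize
end

puts "full payload: #{job_args_bytesize(full_payload)} bytes"
puts "id only:      #{job_args_bytesize(id_only_payload)} bytes"
```

The id-only size stays constant no matter how large data and metadata grow, while the full payload grows with them.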

mostlyobvious · 2020-11-03T11:26:21Z

For the record, and for anyone looking for a clear transition path:

The following handler mixin works well both with the full payload and with just an event_id:

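module EventFromEventId
  def perform(payload)
    # Works for both payload shapes, since each includes :event_id.
    event_id = payload.symbolize_keys.fetch(:event_id)
    # Assumes the handler exposes `event_store` (e.g. Rails.configuration.event_store).
    super(event_store.read.event(event_id))
  end
end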

The scheduler passes a slice of the event payload containing only the event_id, for one-way compatibility:

class Scheduler
  def call(klass, serialized_event)
    klass.perform_later({ event_id: serialized_event.event_id })
  end

  def verify(subscriber)
    Class === subscriber && !!(subscriber < ActiveJob::Base)
  end
end

Pro:
- refactoring-friendly (e.g. when migrating event_type), because event data is not stored temporarily in Redis; one less moving part to care about
- constant size in Redis, as opposed to a size dependent on data and metadata
- simpler configuration (no need for matching scheduler and async handler serializers)

Con:
- handlers require database access (they usually do have it)
- additional SQL query to load an event

Serialization cost remains unchanged. Whether passing an id or a fully serialized payload, the data and metadata need to be deserialized. There is also no additional serialization step for the scheduler; it reuses the serialization for the repository if that is needed.
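
The one-way compatibility of the mixin can be tried out without Rails or a database. The following is a sketch under stated assumptions: FakeEventStore and HypotheticalHandler are made-up stand-ins, the fake store only mimics the `read.event(id)` call chain, and `transform_keys(&:to_sym)` replaces ActiveSupport's `symbolize_keys` so the example runs on plain Ruby.

```ruby
# In-memory stand-in for the event store; it only mimics `read.event(id)`.
class FakeEventStore
  def initialize(events)
    @events = events
  end

  def read
    self
  end

  def event(event_id)
    @events.fetch(event_id)
  end
end

module EventFromEventId
  def perform(payload)
    # Job arguments come back from the queue with string keys.
    event_id = payload.transform_keys(&:to_sym).fetch(:event_id)
    super(event_store.read.event(event_id))
  end
end

# Hypothetical handler; a real one would be a Sidekiq worker or ActiveJob.
class HypotheticalHandler
  prepend EventFromEventId

  attr_reader :handled

  def initialize(event_store)
    @event_store = event_store
    @handled = []
  end

  def perform(event)
    @handled << event
  end

  private

  attr_reader :event_store
end

store = FakeEventStore.new({ "e-1" => :order_placed })
handler = HypotheticalHandler.new(store)

# New-style job: only the id was enqueued.
handler.perform({ "event_id" => "e-1" })
# Old-style job still in the queue: full payload, of which only the id is used.
handler.perform({ "event_id" => "e-1", "data" => "...", "metadata" => "..." })
```

Because both payload shapes carry event_id, jobs enqueued before the scheduler change can still be processed after the handler is switched, which is what makes the transition one-way compatible.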

mostlyobvious closed this as completed Aug 24, 2020

mostlyobvious mentioned this issue Jun 21, 2022: Decouple serialisation-related logic from AsyncHandler #1334 (Open)
