-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sending event id to Sidekiq instead of serialized event #755
Comments
That sounds reasonable and totally valid approach as long as it is scheduled "after commit" (so the event is already committed and visible outside any possible transaction wrapping publish process). Nitpicks:
event_store = RailsEventStore::Client.new(
dispatcher: RubyEventStore::ComposedDispatcher.new(
RubyEventStore::ImmediateAsyncDispatcher.new(scheduler: MyScheduler.new),
RubyEventStore::Dispatcher.new)
) |
Thanks for the guidance @pawelpacana! Our rails event store is configured as follows:
Could it be that with |
from: https://guides.rubyonrails.org/active_record_callbacks.html#transaction-callbacks The "external systems which are not part of the database transaction" the documentation mentions here is the Redis used by Sidekiq to store scheduled jobs.
Then you are at risk of running handlers (Sidekiq jobs) for events that have not been stored because transaction has been rolled back. |
@matiasgarcia with RailsEventStore::AfterCommitAsyncDispatcher you're fine — the dispatcher would not schedule anything before commit or after rollback ApplicationRecord.transaction do
# do something
event_store.publish(...) # not scheduling yet, as others might not see uncommitted changes from this transaction or it might be rolledback
end
# scheduling now |
Thanks @pawelpacana. I was wondering if maybe a maintained version of this AsyncHandler should be in the gem? I am saving 92% of my Redis storage with this change. |
It's an interesting tradeoff to save on redis memory in exchange for additional query on a database. I'm not sure yet what is better as a default choice in AsyncHandler. I'd accept PR with and alternative or configurable strategy for AsyncHandler for now 👍 |
@pawelpacana I like the idea! I think we could change the implementation but keep the dispatcher/scheduler contract of |
In our case, we were firing around 200k events in a short time span (10 minutes) with a Redis DB of 100MB storage. This made it quickly reach the maximum threshold, because in our events we were tracking changes to AR models, making the payload quite big. So in this specific use case, it paid off to just send the event id. The additional query to fetch the event should be fairly quick since they are ids, although I don't have a benchmark to guarantee this :) |
For the record and anyone looking for clear transition path:
module EventFromEventId
def perform(payload)
event_id = payload.symbolize_keys.fetch(:event_id)
super(event_store.read.event(event_id))
end
end
class Scheduler
def call(klass, serialized_event)
klass.perform_later({ event_id: serialized_event.event_id })
end
def verify(subscriber)
Class === subscriber && !!(subscriber < ActiveJob::Base)
end
end |
Pro: - refactoring-friendly (i.e. migrating event_type) because event data is not stored temporarily in redis, one moving part to care about less - constant size in redis as opposed to size dependent on data and metadata - simpler configuration (no need for marching scheduler and async handler serializers) Con: - handlers require database access (they usually do have it) - additional SQL query to load an event Serialization cost remains unchanged. Whether passing an id or full-serialized payload, the data and metadata need to be deserialized. Also there's no additional serialization step for scheduler, it reuses serialization for repository if that is needed. #755
Pro: - refactoring-friendly (i.e. migrating event_type) because event data is not stored temporarily in redis, one moving part to care about less - constant size in redis as opposed to size dependent on data and metadata - simpler configuration (no need for marching scheduler and async handler serializers) Con: - handlers require database access (they usually do have it) - additional SQL query to load an event Serialization cost remains unchanged. Whether passing an id or full-serialized payload, the data and metadata need to be deserialized. Also there's no additional serialization step for scheduler, it reuses serialization for repository if that is needed. #755
In the docs, there is some suggested code to manage async handlers on Sidekiq.
The Scheduler is as follows:
And the AsyncHandler from the library is:
This works perfectly fine but we found ourselves hitting max memory error on Redis under heavy load. This is when I realized that the Scheduler serializes the whole event.
I was wondering if it's possible to provide just the
event_id
and write a custom AsyncHandler to retrieve it.Like this:
Or is there any problem on doing so?
The text was updated successfully, but these errors were encountered: