Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InMemoryAdapter runs easily into race conditions #188

Closed
mentero opened this issue Feb 16, 2018 · 15 comments
Closed

InMemoryAdapter runs easily into race conditions #188

mentero opened this issue Feb 16, 2018 · 15 comments
Assignees

Comments

@mentero
Copy link

mentero commented Feb 16, 2018

Hi, we ran into a problem with using InMemoryAdapter today. I prepared an example repository that reproduces the problem.

https://github.com/mentero/ruby_event_store_test

Long story short. We have daily workers running at night and we wanted to publish some events from them. We have similar setup to this one. The problem is there are many sidekiq workers running at the same time and we got flooded by RubyEventStore::WrongExpectedEventVersion quickly.

We have investigated the problem and it seems like a classic race condition caused by expected_version being calculated outside of the mutex. It is very easy to run into a scenario where two workers calculate the same version number before entering synchronized block.

I don't know why InMemoryAdapter was chosen over ActiveRecord one, but nevertheless this behaviour should probably be considered a bug as it makes InMemoryAdapter very brittle.

@paneq
Copy link
Member

paneq commented Feb 16, 2018

@mentero Thanks for your ticket!

a) InMemoryAdapter is not documented right now
b) We've never expected it to be used in production, I believe.
c) what RES version are we talking about?

We will definitely need to think about, but I cannot promise solving this soon.

@mentero
Copy link
Author

mentero commented Feb 16, 2018

@paneq

We have 0.20 in our app but I reproduced that on the 0.25 in the repository linked above

@paneq
Copy link
Member

paneq commented Feb 16, 2018

@paneq
Copy link
Member

paneq commented Feb 16, 2018

Related:

module RubyEventStore
  RSpec.describe InMemoryRepository do
    # There is no way to use in-memory adapter in a
    # lock-free, unlimited concurrency way
    let(:test_race_conditions_any)   { false }
    let(:test_race_conditions_auto)  { true }
    let(:test_expected_version_auto) { true }
    let(:test_link_events_to_stream) { true }

https://github.com/RailsEventStore/rails_event_store/blob/master/ruby_event_store/spec/in_memory_repository_spec.rb

@paneq
Copy link
Member

paneq commented Feb 16, 2018

@mentero Also, we would be happy if you fill out our poll about RES usage: https://goo.gl/forms/ecNpJsL5qM2M8Ls93 if you haven't yet :) Thank you and have a great day!

@paneq
Copy link
Member

paneq commented Feb 16, 2018

but I reproduced that on the 0.25 in the repository linked above

:(

@mentero
Copy link
Author

mentero commented Feb 16, 2018

The problem is

happening before this

which causes this to fail

raise WrongExpectedEventVersion unless (stream.size - 1).equal?(expected_version)

@paneq
Copy link
Member

paneq commented Feb 16, 2018

@mentero indeed.

@mentero
Copy link
Author

mentero commented Feb 16, 2018

I have tried a naive fix and moved calculating expected_version inside the mutex but it broke tests that expect some failures when :auto is used outside of a lock

@paneq
Copy link
Member

paneq commented Feb 16, 2018

@mentero I just checked and right now There is no way to use in-memory adapter in a lock-free, unlimited concurrency way. It was not designed like that. Right now :any version behaves as :auto ( http://railseventstore.org/docs/expected_version/ ) in InMemoryRepository.

We could try to fix it so the InMemoryRepository would work according to the documentation :)

I think it would require calling stream.size - 1 inside the lock for :any and outside for :auto.

@paneq
Copy link
Member

paneq commented Feb 16, 2018

@mentero Is that application ever reading those events? Is it ever clearing the always growing in memory list of events?

If it is never reading, then you it means you are using RES just as pub/sub without intended persistence. In that case you could substitute a Repository which does nothing and it would work better for the usecase :) I think it might be even beneficial to have such NullRepository which does not save as part of RES.

@siemakuba
Copy link

@paneq this is some weird mind-reading happening around here...
I've just pushed a Pull Request with exactly this kind of changes - an object utilizing the NullObject pattern that is passed as a repository to the RES Client.

It's dead simple, but first tests show that It'd work just fine.
I'd be happy to prepare a PR for RES introducing this solution.

@siemakuba
Copy link

And yes, your're absolutely right with:

If it is never reading, then it means you are using RES just as pub/sub without intended persistence.

;)

@paneq
Copy link
Member

paneq commented Feb 16, 2018

Siema Kuba :)

I'd be happy to prepare a PR for RES introducing this solution.

yes, please :)

@paneq paneq self-assigned this Feb 23, 2018
paneq added a commit that referenced this issue Feb 23, 2018
Guarantees
When there are many threads (or processes) writing concurrently (at the same time) to a stream with expected_version set to :any, all of those writes should succeed.
When you later try to read those events you from a stream, you don't have a guarantee about particular order you are going to get them in.
Example: If 3 processes A,B,C publish events to stream foo-1 at the same time, all of them should succeed. When you read events from stream foo-1 you will get events A,B,C or A,C,B or B,A,C or B,C,A or C,A,B or C,B,A.
Should never fail
When a single thread (or process) writes events A, B to a stream, it is guaranteed you will retrieve events A,B in that exact order when reading from a stream.

as in http://railseventstore.org/docs/expected_version/

We use lock (mutex) under the hood but that's not an issue. SQL DBs also
somewhere deep down might use locks to when updating (uniq) indexes
(which ActiveRecordRepository used).

So this more closely resembles production behavior of :any

Refs: #188
@paneq paneq closed this as completed in 06d10af Feb 27, 2018
@paneq
Copy link
Member

paneq commented Feb 27, 2018

@mentero @siemakuba fixed on master, to be included in next release. Thank you for your contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants