Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support EventStore Projection #8

Open
Xatter opened this issue Feb 13, 2019 · 11 comments
Open

Support EventStore Projection #8

Xatter opened this issue Feb 13, 2019 · 11 comments

Comments

@Xatter
Copy link

Xatter commented Feb 13, 2019

For EventStore storage specifically, it looks like this library is designed so that EventStore is only used for storage of events and that subsequent communicating that events occured is offloaded to Kafka. The EventStore product does support messaging via PersistentSubscription and it would be nice for smaller projects to be able to use just EventStore and not have to spin up a Kafka cluster if that kind of scale is not needed.

Specifically I'd love an API where you can subscribe either to a specific aggregate user-1234 or a projection user-* and have the library handle things like checkpointing (storing the last read offset for a client in EventStore itself) similar to how PersistentSubscriptions work in the EventStore API today.

@bartelink
Copy link
Collaborator

Yes - using Kafka is definitely not a panacea and having consistent programming models between Equinox.Cosmos and Equinox.EventStore ia part of the idea; such stuff logically fits in an Equinox.EventStore.Projection library, which would match such facilities as presented in Equinox.Cosmos.Projection (which used the MS ChangeFeedProcessor library and stores state in an auxilliary bookmark collection).

As evidenced by this discussion

a) I personally am no expert in such ES facilities
b) there are internal users within Jet that are interested in such a facility at a high level
c) we're looking at building out just that

This would manifest as:
a) dotnet new eqxprojector -e template which would produce a Projector project that tails an ES instance
b) dotnet new eqxprojector -e -k template which would do the same as dotnet new eqxprojector -k (feeding to Kafka in the projector with one adding appropriate handling/filtering as desired)
c) dotnet new eqxprojector -e -c template mode which would feed from ES to an Equinox.Cosmos store (thats what the thread is about and what I have shortlisted to build)

Now, to your specific use case:-
a) but in general the idea would be to do a high level subscription tailing $all and the batteries included impl would be feeding to Kafka using the same components as used for dotnet new eqxprojector as it stands
b) supporting a more constrained subscription would be an extra feature on top of a

So, in short, if you wanted to do a PR to add such a thing
a) it could live in Equinox.EventStore.Projection with a sample adjacent to the Cosmos one in https://github.com/jet/dotnet-templates/tree/eqx-prj/equinox-projector-cosmos
b) it might move about a little and be merged with the above ES -> Cosmos sync facility sitting alongside

In terms of this organically happening without input from your side, I'd venture:
a) in general folks here tend to use Kafka as the consumer abstraction as that's all rigged and well maintained (i.e. nobody has had a similar ask internally atm)
b) whether it makes sense to use a wrapper out of a library if you have ES in place and expertise around working with that API might be debatable

@bartelink
Copy link
Collaborator

There's super-early WIP wrt this in jet/dotnet-templates#11

@bartelink
Copy link
Collaborator

Quick update: while there is work in train to provide wiring for projecting from EventStore, I should point out (esp upon re-reading the OP) that there are no current plans regarding storing consumer positions within EventStore itself coming from usage inside Jet.

I'll leave this open as I don't feel the need to rule it out, but in terms of adding logic into EventStore.fs to support it, my default stance would be to wait until the projecting to Kafka and/or bulk export side reaches some level of feature completeness.

Having said that... If you had a spike or more detail, I'd definitely be interested to hear more...

@bartelink
Copy link
Collaborator

have sliced and diced some PRs - jet/dotnet-templates#16 shows the current WIP I alluded to above

@bartelink
Copy link
Collaborator

jet/dotnet-templates#16 has had a lot of progress - it's about to sprout a way to store progress / offsets in Cosmos.
Some of the progress writing will likely then be backported into equinox-projector in non-kafka mode.
Between those two templates, you can treat Equinox.Cosmos + the CosmosDb ChangeFeed as a poor/rich man's Kafka (but consistent) projection system
Right now it's looking like no part of the solution involves tracking offsets in EventStore - it can potentially be made pluggable though (which would e.g. facilitate storing state in Equinox.SqlStreamStore and/or mixing and matching in other ways)

@bartelink
Copy link
Collaborator

bartelink commented Sep 11, 2019

Looking like we'll build this - idea is to provide a dotnet-templates template which consumes from $all, and maintains offset in an Equinox.Cosmos instance using the Checkpoint aggregate in Propulsion.Cosmos, which is used for Sync from ES->Cosmos. Clearly this makes no sense for apps that don't already have a dependency on such a store.

With a bit of hacking, it should be possible to hook this to store it elsewhere. Interested to hear what sort of backing stores people have in mind (realistically, I personally won't be committing to actually doing the implementation)

@CumpsD
Copy link
Contributor

CumpsD commented Jan 29, 2020

This is an interesting issue. I'm using ES and want to project to SQL Server/Elasticsearch/.... by trailing $all.

In the past I've had this kind of functionality with SQL Stream Store and Projac to project events into SQL Server tables, but I didn't find anything similar for ES. Initially I started writing it myself, but if propulsion would support this, that would be awesome.

@bartelink
Copy link
Collaborator

It's looking like my work needs mean I'll be writing up a proper end to end set of docs for Propulsion that explain how its pipeline fits together soon (I edited and deleted forward looking statements here).

There are similar systems out there which have similar architectures for other platforms (the name escapes me) - it should be mentioned that the split of the concepts in Propulsion derives from scaling and perf tuning of large sync and projection pipelines with real data and needs (mainly mixing and matching ES and Cosmos) - i.e. this is not me trying to build some abstract projection framework.

Sources+checkpoints

the summaryProjector template has wiring to read:
a) ES $all - with checkpoints saved in Equinox.Cosmos (writes an event every hour, updates a snapshot every 5s) Why? because in apps I'm working on, they always have an Equinox.Cosmos store to hand anyway
b) CosmosDB CFP - checkpoints managed by CFP (updates the checkpoint documents every 5s but there is no record)

An easy enough to achieve extension would be to make an adjusted version of (a) that writes to an EventStore stream every 5s that has been configured to only maintain the most recent event (it's pretty pluggable, if you e.g. had something like Consul, you could write to that too)

this is the main open question that is critical to resolving this Issue as I see it

Schedulers / Handlers

  • dotnet new summaryProjector shows how to
    • accumulate events, maintaining order within each stream
    • handle the buffered spans per stream (up to a specified max degree of parallelism)
  • dotnet new proProjector shows how to do various more advanced streamwise projections (only from CFP, but it works the same way)

I need to write docs on this

Propulsion Sinks

Propulsion has Sinks for ES and Cosmos. Sinks take event streams and replicate them - i.e. if you are migrating from ES->ES, ES->Cosmos, Cosmos->ES or Cosmos->Cosmos - the dotnet new proSync template demonstrates this. One could make a Propulsion.SqlStreamStore that would have the same facilities (syncing from/to ES/Cosmos/SqlStreamStore into or out of SqlStreamsStore)

this is an aside to your question, and the OP

ElasticSearch / SQL

TL;DR Its conceivable that one could Projac-ish helpers in a Propulsion.Sql module and a similar Propulsion.ElasticSearch helper - but that would not be a "sink" as such - from Propulsion's point of view, its really all just schedulers running handlers, and there's no real reason for general helpers that are not reading/writing/forwarding events to live in Propulsion.

I'll expand a bit in case that helps with conveying the thinking a bit:

Internally, we don't have any work in the near term writing to Elastic Search (don't get me wrong, we use it, just not that much in projections on the team I work on atm).

On Azure, for cases where you're targeting SQL but are not doing lots of indexing and/or searching/random queries, I'd actually give sinking to Equinox.Cosmos a try - dotnet new proTrackingConsumer and dotnet new proSummaryConsumer show the way - if you're projecting to whats effectively a document, that's actually very efficient from a cached reading perspective.

For Sql, we do (we have internal closed source helpers which are pretty generic - i.e. they might cover Projac-like usage, but are not as purposefully designed). I dont have plans/anticipate it crossing my radar to extract/polish such a facility, but I do know that lots of folks would use it. If you had a free standing set of helpers, a Propulsion.Sql module might not be crazy though - the first thing to make it do would be to port dotnet new proSummaryConsumer to it. Not sure if it would make that much sense to actually tie its releases and/or development to this repo though - ultimaltely Propulsion majors on:

  • sources from Kafka/Cosmos CFP/SqlStreamStore/EventStore
  • with parallel/streamwise/multi-stream scheduling strategies
  • invoking handlers that do things
    • special case: if that thing is "syncing events", then the logic to do that can sometimes make sense to host in Propulsion, e.g. Propulsion.EventStore has logic for syncing events to ES, Propulsion.Cosmos has some helpers for syncing events, but in general delegates to Equinox.Cosmos for the actual writing)

@bartelink
Copy link
Collaborator

related: this (incomplete) PR provides for projecting from ESDB $all, storing checkpoints in ESDB

@bartelink
Copy link
Collaborator

@bartelink
Copy link
Collaborator

bartelink commented May 12, 2022

related: Propulsion.EventStoreDb in V4 implements the reader using gRPC. Leaving this issue open as there's no in-the box checkpointing that stores the checkpoints in ESDB (supported checkpoint stores are DynamoDB, CosmosDB, Postgres and SQL Server)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants