Kafka Streams support? #181

Open
sneko opened this issue May 8, 2018 · 11 comments

Comments

@sneko

sneko commented May 8, 2018

Hi!

I would just like to know whether you plan to support Kafka Streams in this Go library.

I already sent you an email but never got an answer 😢

Thanks 😄

@edenhill
Contributor

edenhill commented May 9, 2018

There is no effort or plan at this time. Users are advised to use KSQL (which can be used from any language) to reap the benefits of Kafka Streams.

@sneko
Author

sneko commented Aug 15, 2018

@edenhill, just to follow up: unfortunately while using KSQL we are not able to access its data in KTable and Global KTable at a specific time.

I mean, I can't do a "SELECT * FROM users" from the KSQL API even if my KSQL instances have the user table as a state in their datastore.

It's truly sad, since a lot of Confluent blog articles promote having the datastore directly inside the microservice.

So I thought about 2 solutions:

  • either use Kafka Streams, which implements KTable... but that's only in Java and I'm using Golang
  • or put a KSQL instance beside my microservice (like a sidecar container) that would process the streams for me, so that I could query the KSQL API on my Global KTables; unfortunately that's not possible at this time

Do you see any better solution? Or do you have Golang on the roadmap?

Currently I'm putting a basic database (like MongoDB) beside each microservice, and when a microservice starts it creates a connector through Kafka Connect to populate its local database. The main concern with this solution is that I don't have an optimized view of my data, just several tables corresponding to several topics that I need to query with "JOIN" operations.

I didn't find a better solution :(

Thank you

@miguno

miguno commented Aug 16, 2018

Hi @sneko!

You are right that this is currently possible with Kafka Streams via its Interactive Queries feature, but not yet with KSQL (which is built on top of Kafka Streams).

> I mean, I can't do a "SELECT * FROM users" from the KSQL API even if my KSQL instances have the user table as a state in their datastore.

For the record, for other readers: you can do a SELECT * FROM users query in KSQL. But this is, at the moment, a streaming SELECT query that keeps running forever. And what @sneko is interested in is:

> unfortunately while using KSQL we are not able to access its data in KTable and Global KTable at a specific time.

That is, if I understand correctly, the need to do a "traditional" SELECT query (like in MySQL) that returns the appropriate output records and then terminates. Is that correct?

If yes, then we are tracking this feature request at KSQL GH-530: Support point-in-time queries for Tables. Please upvote the feature by giving your 👍 in the first post. :-) We are definitely interested in implementing this feature, but other tasks have gotten higher priority at the moment.

@sneko
Author

sneko commented Aug 16, 2018

Hi @miguno !

Yeah, I was talking about a traditional SELECT. Let me explain why:

After reading a lot about Kafka patterns in microservices systems, it seems to me that the best usage is to have the view (a merge of different topics) stored in a datastore that is local to the microservice.

In my previous message I was talking about having a KSQL instance beside my microservice and being able to do a traditional SELECT against its datastore, because then I could make requests like "getUser($id)", "getAllUsers()"...

Since this feature is on your roadmap but not at the highest priority, I need a workaround to use the desired pattern.

So I will explain the workaround I imagined. It would be great if you could give me your thoughts about it.

Suppose we have two entity types, User and Order. I would like a view stored next to my OrderService from which I could fetch my orders together with the firstname and lastname of the user. The idea is to build a view joining orders and users on the user.id that is stored in both topics "topic-users" and "topic-orders".
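
To make the shape of that view concrete, here is a minimal Go sketch of the join itself, done in memory. The `User`, `Order` and `OrderView` types, their fields and the `JoinOrder` helper are all hypothetical names chosen for illustration; they are not part of KSQL or of this library, and the real join would run wherever the view is built.

```go
package main

import "fmt"

// User and Order mirror the two topics described above.
// All type and field names here are assumptions made for this sketch.
type User struct {
	ID        string
	FirstName string
	LastName  string
}

type Order struct {
	ID     string
	UserID string
	Total  float64
}

// OrderView is the denormalized record the OrderService would serve.
type OrderView struct {
	OrderID   string
	UserID    string
	FirstName string
	LastName  string
	Total     float64
}

// JoinOrder enriches one order with the latest known user record.
// The boolean is false when no user with that ID has been seen yet.
func JoinOrder(users map[string]User, o Order) (OrderView, bool) {
	u, ok := users[o.UserID]
	if !ok {
		return OrderView{}, false
	}
	return OrderView{
		OrderID:   o.ID,
		UserID:    u.ID,
		FirstName: u.FirstName,
		LastName:  u.LastName,
		Total:     o.Total,
	}, true
}

func main() {
	users := map[string]User{
		"u1": {ID: "u1", FirstName: "Ada", LastName: "Lovelace"},
	}
	view, ok := JoinOrder(users, Order{ID: "o1", UserID: "u1", Total: 42})
	fmt.Println(view, ok)
}
```

Returning a boolean instead of an empty view leaves it to the caller to decide what to do with an order whose user has not arrived yet (drop it, buffer it, or emit it without names).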

I could ask a KSQL cluster (not 1 node per microservice, but a distinct KSQL cluster) to create a STREAM (backed by a new topic named "topic-view-by-orders") into which it pushes the merge (= view) of new events and users.

Then, at each start of my OrderService, I would connect to the Kafka "topic-view-by-orders" topic and consume all the events from the earliest one. Each time I receive an event, I can write it to RocksDB or whatever key/value storage I have.
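
A minimal sketch of that replay step in Go, assuming each event carries a key and a value and that an empty (nil) value means "delete", as is conventional for compacted topics. The `Event` and `Store` types are hypothetical stand-ins: the map plays the role of RocksDB, and the actual Kafka consumer loop is left out.

```go
package main

import "fmt"

// Event is a hypothetical key/value record read from the view topic.
// A nil Value is treated as a tombstone (delete).
type Event struct {
	Key   string
	Value []byte
}

// Store is an in-memory stand-in for RocksDB or any other key/value store.
type Store struct {
	data map[string][]byte
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{data: make(map[string][]byte)}
}

// Apply upserts the latest value for a key, or deletes the key on a tombstone.
func (s *Store) Apply(e Event) {
	if e.Value == nil {
		delete(s.data, e.Key)
		return
	}
	s.data[e.Key] = e.Value
}

// Get returns the current value for a key, if any.
func (s *Store) Get(key string) ([]byte, bool) {
	v, ok := s.data[key]
	return v, ok
}

func main() {
	s := NewStore()
	// Replaying from the earliest offset: later events overwrite earlier ones.
	replay := []Event{
		{Key: "o1", Value: []byte(`{"total":10}`)},
		{Key: "o1", Value: []byte(`{"total":12}`)},
		{Key: "o2", Value: []byte(`{"total":5}`)},
		{Key: "o2", Value: nil},
	}
	for _, e := range replay {
		s.Apply(e)
	}
	v, ok := s.Get("o1")
	fmt.Println(string(v), ok)
}
```

Because every event is applied as an upsert keyed by the record key, replaying the topic from the start after a restart rebuilds the same state, which is what makes a disposable local store workable.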

Note: I could also instantiate a new connector in Kafka Connect to populate my local datastore (Redis or whatever, except RocksDB since it's a library).

When my OrderService consumer has almost reached the latest offset of the topic, I can consider it ready to receive HTTP requests from my own API gateway.

Note: I'm still looking for a way to be sure that the KSQL cluster is far enough along in building the "topic-view-by-orders" topic. I mean, to say that my OrderService is ready, I need to be sure that "topic-view-by-orders" contains merged events corresponding to almost the latest events in "topic-users" and "topic-orders".

If my workaround seems bad, what would you advise for keeping a view stored locally with my microservice? I'm not an expert in the Kafka ecosystem, but since I read everywhere about a local datastore holding a minimal view of the required data, I don't understand why there are not that many examples of it 😢 . Maybe I missed something 😀 ?

Thanks,

PS: I had already upvoted the issue ;)

@miguno

miguno commented Aug 20, 2018

Thanks for sharing all this context information, @sneko -- this is very helpful! 👍

@miguno

miguno commented Aug 20, 2018

And what you outlined for your DIY approach ("workaround") seems good to me.

@maeglindeveloper

maeglindeveloper commented Aug 21, 2018

Hi everyone ! :)

It seems that this "issue" is also related to the following one, right?

confluentinc/ksql#1751

Like @sneko, I'm also wondering how we can say that our microservice consuming Kafka topics can be set to "Ready", given that it needs to consume a lot of events from our topic and build its local store (RocksDB or whatever).

:)

@miguno

miguno commented Aug 21, 2018

One way to determine Readiness is to check/monitor the offsets of the consumed topics ("has my microservice read all / almost all the available offsets = messages yet?").
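
As a rough illustration of that offset check, here is a small Go sketch that computes the remaining lag from per-partition offsets and compares it to a threshold. The `PartitionOffsets` type, `TotalLag`, `IsReady` and the `maxLag` parameter are hypothetical names for this sketch. How the two offsets are obtained is deliberately left out; with this Go client they could plausibly come from the consumer's watermark query (`QueryWatermarkOffsets`) and the offset of the last message processed, but that wiring is an assumption, not something shown here.

```go
package main

import "fmt"

// PartitionOffsets holds, for one partition, the next offset the consumer
// will read (Position) and the end offset of the log (HighWatermark).
// Both values are assumed to be fetched elsewhere.
type PartitionOffsets struct {
	Position      int64
	HighWatermark int64
}

// TotalLag sums the number of unread messages across all partitions.
// Partitions where the position is already at or past the end add nothing.
func TotalLag(parts map[int32]PartitionOffsets) int64 {
	var lag int64
	for _, p := range parts {
		if d := p.HighWatermark - p.Position; d > 0 {
			lag += d
		}
	}
	return lag
}

// IsReady reports whether the consumer is within maxLag messages of the end.
func IsReady(parts map[int32]PartitionOffsets, maxLag int64) bool {
	return TotalLag(parts) <= maxLag
}

func main() {
	parts := map[int32]PartitionOffsets{
		0: {Position: 980, HighWatermark: 1000},
		1: {Position: 500, HighWatermark: 500},
	}
	fmt.Println(TotalLag(parts), IsReady(parts, 50))
}
```

Since the end offset keeps moving while producers write, one option is to snapshot the end offsets once at startup and declare readiness when the consumer has passed that snapshot, rather than chasing a moving target.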

@miguno

miguno commented Aug 21, 2018

For the metrics that KSQL exposes please take a look at:

@maeglindeveloper

@miguno, thanks for your answers, that's really helpful!

But what about a topic that has been created by KSQL by joining two topics (users and orders) that hold a lot of data (as mentioned in the following issue: confluentinc/ksql#1751)?

When creating our stream using KSQL CREATE STREAM..., we can then ask for the status of the STREAM. But maybe it is still processing the data :s and the topic that mirrors that stream is not fully up to date?

Like @sneko, I'm still confused about that :s.

@sneko
Author

sneko commented Aug 22, 2018

Hi @miguno, when you say that this approach seems good to you, what does it imply?

I mean, is there an official path recommended by the Confluent team for having a local view in each microservice?

Even if you “validate” the approach I described, I still feel uncertain about it... It would be great to know which data flow the Confluent team has in mind when writing articles about local optimized views in microservices.

Thank you 🙏
