Kafka Streams support? #181

sneko · 2018-05-08T20:13:06Z

Hi!

I would just like to know whether you plan to support Kafka Streams in this Go library.

I already sent you an email but never got an answer 😢

Thanks 😄

edenhill · 2018-05-09T08:58:26Z

There is no effort or plan at this time. Users are advised to use KSQL (which can be used from any language) to reap the benefits of Kafka Streams.

sneko · 2018-08-15T21:43:43Z

@edenhill, just to follow up: unfortunately while using KSQL we are not able to access its data in KTable and Global KTable at a specific time.

I mean, I can't do a "SELECT * FROM users" from the KSQL API even if my KSQL instances have the user table as a state in their datastore.

It's truly sad, since a lot of Confluent blog articles promote having a datastore directly inside the microservice.

So I thought about 2 solutions:

either use Kafka Streams, which implements KTable... but it is only available in Java and I'm using Go
or put a KSQL instance alongside my microservice (like a sidecar container) that would process streams for me, so that I could query the KSQL API on my Global KTables. Unfortunately that's not possible at this time

Do you see any better solution? Or do you have Golang on the roadmap?

Currently I'm putting a basic database (like MongoDB) alongside each microservice, and when a service starts it creates a connector through Kafka Connect to populate its local database. The main concern with this solution is that I don't have an optimized view of my data, just several tables corresponding to several topics that I need to query with "JOIN" operations.

I didn't find a better solution :(

Thank you

miguno · 2018-08-16T12:35:33Z

Hi @sneko!

You are right that this is currently possible with Kafka Streams via its Interactive Queries feature, but not yet with KSQL (which is built on top of Kafka Streams).

> I mean, I can't do a "SELECT * FROM users" from the KSQL API even if my KSQL instances have the user table as a state in their datastore.

For the record, for other readers: you can do a SELECT * FROM users query in KSQL. But this is, at the moment, a streaming SELECT query that keeps running forever. What @sneko is interested in is:

> unfortunately while using KSQL we are not able to access its data in KTable and Global KTable at a specific time.

That is, if I understand correctly, the need to do a "traditional" SELECT query (like in MySQL) that returns the appropriate output records and then terminates. Is that correct?

If yes, we are tracking this feature request at KSQL GH-530: Support point-in-time queries for Tables. Please upvote the feature by giving your 👍 on the first post. :-) We are definitely interested in implementing this feature, but other tasks currently have higher priority.

sneko · 2018-08-16T13:26:07Z

Hi @miguno !

Yeah, I was talking about a traditional SELECT. Let me explain why.

After reading a lot about Kafka patterns in microservice systems, it seems to me that the best usage is to have the view (a merge of different topics) stored in a datastore that is local to the microservice.

In my previous message I was talking about having a KSQL instance alongside my microservice and being able to run a traditional SELECT against this datastore, because that would let me make requests like "getUser($id)", "getAllUsers()"...

Since this feature is on your roadmap but not at the highest priority, I need a workaround to get the desired pattern.

So I will explain the workaround I imagined. If you could give me your thoughts about it, it would be great.

Suppose we have two entity types, User and Order. I would like a view stored next to my OrderService where I can fetch my orders together with the first name and last name of the user. The idea is to build a view joining orders and users on the user ID, which is present in both "topic-users" and "topic-orders".

I could ask a KSQL cluster (not one node per microservice, but a separate KSQL cluster) to create a STREAM (backed by a new topic named "topic-view-by-orders") into which it pushes the merge (= view) of new order events and users.
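
To make the shape of that view concrete, here is a minimal sketch in plain Go of the per-order enrichment that such a join performs. It does not use KSQL or any Kafka client. The `User`, `Order`, and `OrderView` types, their fields, and the `joinOrder` helper are all hypothetical names chosen for illustration, and the in-memory map stands in for the latest state of "topic-users".

```go
package main

import "fmt"

// User and Order are hypothetical record shapes for "topic-users" and
// "topic-orders"; the real schemas are not given in this thread.
type User struct {
	ID        string
	FirstName string
	LastName  string
}

type Order struct {
	ID     string
	UserID string
	Total  float64
}

// OrderView is the denormalized record that would land in
// "topic-view-by-orders" (field names are assumptions).
type OrderView struct {
	OrderID   string
	UserID    string
	FirstName string
	LastName  string
	Total     float64
}

// joinOrder enriches one order with the latest known user record.
// It reports false when the user has not been seen yet, so the caller
// can decide whether to drop, buffer, or emit a partial record.
func joinOrder(users map[string]User, o Order) (OrderView, bool) {
	u, ok := users[o.UserID]
	if !ok {
		return OrderView{}, false
	}
	return OrderView{
		OrderID:   o.ID,
		UserID:    u.ID,
		FirstName: u.FirstName,
		LastName:  u.LastName,
		Total:     o.Total,
	}, true
}

func main() {
	// Latest user state, keyed by user ID.
	users := map[string]User{
		"u1": {ID: "u1", FirstName: "Ada", LastName: "Lovelace"},
	}
	view, ok := joinOrder(users, Order{ID: "o1", UserID: "u1", Total: 42.5})
	fmt.Println(view, ok)
}
```

The boolean return makes the "user not seen yet" case explicit, which is the same ordering question that comes up later when deciding whether the view topic is complete enough.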

Then, each time my OrderService starts, it would connect to the "topic-view-by-orders" topic and consume all events from the earliest offset. Each time I receive an event, I can write it to RocksDB or whatever key/value store I have.
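
A rough sketch of that replay step, again in plain Go with no Kafka client: `Event` and `LocalStore` are hypothetical names, the map is an in-memory stand-in for RocksDB, and the offset bookkeeping assumes a single partition. Treating a nil value as a delete follows the usual tombstone convention for compacted topics; whether the view topic is compacted is an assumption here.

```go
package main

import "fmt"

// Event is a hypothetical, already-decoded record from
// "topic-view-by-orders". A nil Value is treated as a tombstone.
type Event struct {
	Key    string
	Value  []byte
	Offset int64
}

// LocalStore is an in-memory stand-in for RocksDB or another
// key/value store.
type LocalStore struct {
	data       map[string][]byte
	lastOffset int64 // offset of the last applied event, -1 if none
}

func NewLocalStore() *LocalStore {
	return &LocalStore{data: make(map[string][]byte), lastOffset: -1}
}

// Apply upserts or deletes one key. Events at or below lastOffset are
// skipped, so re-delivering an already applied event is harmless.
func (s *LocalStore) Apply(e Event) {
	if e.Offset <= s.lastOffset {
		return
	}
	if e.Value == nil {
		delete(s.data, e.Key)
	} else {
		s.data[e.Key] = e.Value
	}
	s.lastOffset = e.Offset
}

// Get is the point lookup that a "getUser($id)"-style request needs.
func (s *LocalStore) Get(key string) ([]byte, bool) {
	v, ok := s.data[key]
	return v, ok
}

func main() {
	store := NewLocalStore()
	events := []Event{
		{Key: "o1", Value: []byte(`{"total":10}`), Offset: 0},
		{Key: "o2", Value: []byte(`{"total":20}`), Offset: 1},
		// A newer value for the same key replaces the old one.
		{Key: "o1", Value: []byte(`{"total":15}`), Offset: 2},
		// Tombstone: removes o2 from the local view.
		{Key: "o2", Value: nil, Offset: 3},
	}
	for _, e := range events {
		store.Apply(e)
	}
	v, ok := store.Get("o1")
	fmt.Println(string(v), ok)
}
```

In a real service the loop body would be fed by the consumer's poll loop, and with several partitions the last applied offset would have to be tracked per partition rather than as a single field.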

Note: I could also instantiate a new connector in Kafka Connect to populate my local datastore (Redis or whatever, except RocksDB since it's a library).

When my OrderService consumer has almost reached the latest offset of the topic, I can consider the service ready to receive HTTP requests from my own API gateway.

Note: I'm still looking for a way to be sure that the KSQL cluster is far enough along in building the "topic-view-by-orders" topic. I mean, before declaring my OrderService ready, I need to be sure that "topic-view-by-orders" contains merged events corresponding to almost the latest events in "topic-users" and "topic-orders".

If my workaround seems bad, what would you advise for keeping a view stored locally with my microservice? I'm not an expert in the Kafka ecosystem, but since I read everywhere about local datastores holding a minimal view of the required data, I don't understand why there are not more examples of it 😢 . Maybe I missed something 😀 ?

Thanks,

PS: I had already upvoted the issue ;)

miguno · 2018-08-20T09:09:16Z

Thanks for sharing all this context information, @sneko -- this is very helpful! 👍

miguno · 2018-08-20T09:11:33Z

And what you outlined for your DIY approach ("workaround") seems good to me.

maeglindeveloper · 2018-08-21T07:58:40Z

Hi everyone ! :)

It seems this "issue" is also related to the following one, right?

confluentinc/ksql#1751

Like @sneko, I'm also wondering how we can tell that our microservice consuming Kafka topics can be marked "Ready", given that it needs to consume a lot of events from our topic and build its local store (RocksDB or whatever).

:)

miguno · 2018-08-21T08:12:19Z

One way to determine Readiness is to check/monitor the offsets of the consumed topics ("has my microservice read all / almost all the available offsets = messages yet?").
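
One possible shape for such a check, as a sketch in plain Go: compare each partition's consumer position with its high watermark and call the service ready once the total lag falls under a threshold. `PartitionProgress`, `totalLag`, `isReady`, and the `maxLag` threshold are hypothetical names for illustration. How the two offsets are obtained depends on the client; with confluent-kafka-go I believe `Consumer.Position` and `Consumer.QueryWatermarkOffsets` provide them, but treat that as an assumption to verify against the client documentation.

```go
package main

import "fmt"

// PartitionProgress holds, for one partition, the next offset the
// consumer will read (Position) and the high watermark (the offset the
// next produced message will receive).
type PartitionProgress struct {
	Position      int64
	HighWatermark int64
}

// totalLag sums the number of messages not yet consumed across all
// partitions. Negative differences are ignored.
func totalLag(parts []PartitionProgress) int64 {
	var lag int64
	for _, p := range parts {
		if d := p.HighWatermark - p.Position; d > 0 {
			lag += d
		}
	}
	return lag
}

// isReady reports whether the consumer is within maxLag messages of the
// end of the topic. maxLag is a tuning knob: 0 means fully caught up.
func isReady(parts []PartitionProgress, maxLag int64) bool {
	return totalLag(parts) <= maxLag
}

func main() {
	parts := []PartitionProgress{
		{Position: 990, HighWatermark: 1000},
		{Position: 500, HighWatermark: 500},
	}
	fmt.Println(totalLag(parts), isReady(parts, 100), isReady(parts, 0))
}
```

Because the high watermark keeps moving while producers are active, one option is to snapshot it once at startup and flip to ready when the position passes that snapshot, rather than chasing a lag of exactly zero. Note that this only measures progress on the consumed topic itself; it says nothing about how far an upstream KSQL query has progressed in producing that topic.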

miguno · 2018-08-21T08:20:02Z

For the metrics that KSQL exposes please take a look at:

https://docs.confluent.io/current/ksql/docs/operations.html#monitoring-and-metrics
https://docs.confluent.io/current/streams/monitoring.html#streams-monitoring (KSQL also exposes its underlying Kafka Streams metrics)

maeglindeveloper · 2018-08-21T08:33:10Z

@miguno, thanks for your answers, that's really helpful!

But what about a topic created by KSQL by joining two topics (users and orders) that hold a lot of data (as mentioned in the following issue: confluentinc/ksql#1751)?

When creating our stream using KSQL CREATE STREAM..., we can then ask for the status of the STREAM. But maybe it is still processing the data :s and the topic that backs that stream is not fully up to date?

Like @sneko, I'm still confused about that :s.

sneko · 2018-08-22T09:51:06Z

Hi @miguno, when you say that this approach seems good to you, what does it imply?

I mean, is there any official path to follow advised by the Confluent team to have a local view for each microservice?

Even if you “validate” the approach I described, I still feel uncertain about it... it would be great to know which data flow the Confluent team has in mind when writing articles about locally optimized views in microservices.

Thank you 🙏

edenhill added the enhancement label May 9, 2018

sneko mentioned this issue Aug 22, 2018

Being able to know how advanced the stream processing is confluentinc/ksql#1751

Open

edenhill added the wontfix label Oct 17, 2019

lenimartin mentioned this issue Aug 25, 2020

Kafka stream #512

Closed
