Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return 1 or 0 instead of -1 when detecting an invalid offset in Kafka #2621

Merged
merged 9 commits into from
Apr 7, 2022

Conversation

RamCohen
Copy link
Contributor

@RamCohen RamCohen commented Feb 10, 2022

When getting the consumer offset of a Kafka topic for which no offset
was committed yet, the scaler returns with -1 instead of 0, which
causes to scale to the maximum number of replicas.

This fix changes the behavior to return in that case either '1' (the default) or '0', depending on a new scaler parameter scaleToZeroOnInvalidOffset which default to 'false'

Also fixed some typos and used interim variables for error strings.

Fixes #2612 #2033

@RamCohen
Copy link
Contributor Author

/run-e2e kafka*

@JorTurFer
Copy link
Member

/run-e2e kafka*

Hi @RamCohen , only people with write permission can trigger this.
Now we have 2 execution in progress but I'll trigger this after those finish

@JorTurFer
Copy link
Member

JorTurFer commented Feb 10, 2022

/run-e2e kafka*
Update: You can check the progres here

@RamCohen RamCohen changed the title Return 0 instead of -1 when detecting an invalid offset in Kafka Return 1 or 0 instead of -1 when detecting an invalid offset in Kafka Feb 17, 2022
@JorTurFer
Copy link
Member

Does this make sense @kedacore/keda-core-contributors ?

Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, thanks! Could you please open documentation PR?

@zroubalik
Copy link
Member

@bpinske @PaulLiang1 PTAL

@PaulLiang1
Copy link
Contributor

@bpinske @PaulLiang1 PTAL

looks good.
also replied on #2612 (comment)

@zroubalik
Copy link
Member

I wonder whether we can extend the e2e test to cover this scenario? I think that creating a topic without any message should cause the invalid offset problem, right?

@zroubalik
Copy link
Member

I wonder whether we can extend the e2e test to cover this scenario? I think that creating a topic without any message should cause the invalid offset problem, right?

@RamCohen do you think you can contribute this?

@RamCohen
Copy link
Contributor Author

I'll have a look, but I'm not sure where have we landed with the proposed solution vis-a-vis having a configuration parameter for returning either 0 or 1 and what would be its default

@rrmcclymont
Copy link

thanks for this pull request! we are having problems in production due to this issue :) do we have an estimate when the new version will be released?

@JorTurFer
Copy link
Member

JorTurFer commented Mar 14, 2022

thanks for this pull request! we are having problems in production due to this issue :) do we have an estimate when the new version will be released?

hi @rrmcclymont ,
Our plan is to release v2.7 on May, hopefully on May 9th. I guess that this PR will be merged before that date so it should be included on v2.7.
BTW, if you are interested, you can check the content for the next release in our backlog

@rrmcclymont
Copy link

thanks @JorTurFer glad to see that we will include this fix in the following release :)

@RamCohen RamCohen force-pushed the kafka_invalid_offset branch 2 times, most recently from b1042f0 to 35b592a Compare March 30, 2022 09:51
@RamCohen
Copy link
Contributor Author

Added tests

@zroubalik
Copy link
Member

zroubalik commented Apr 1, 2022

/run-e2e kafka*
Update: You can check the progres here

@zroubalik
Copy link
Member

zroubalik commented Apr 5, 2022

/run-e2e kafka*
Update: You can check the progres here

@zroubalik
Copy link
Member

@RamCohen could please open docs PR for this?

@RamCohen
Copy link
Contributor Author

RamCohen commented Apr 5, 2022

kedacore/keda-docs#741

Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM,

@RamCohen so the last bit is Changelog

RamCohen added 5 commits April 5, 2022 14:51
When getting the consumer offset of a Kafka topic for which no offset
was committed yet, the scaler returns with -1 instead of 0, which
causes to scale to the maximum number of replicas.

Also fixed some typos and used interim variables for error strings.

Fixes kedacore#2612

Signed-off-by: Ram Cohen <[email protected]>
Signed-off-by: Ram Cohen <[email protected]>
Signed-off-by: Ram Cohen <[email protected]>
Update to use strimzi operator to v0.23.0 and Kafka 2.6.0 in order to properly
work on Kubernetes 1.21 and up due to deprecation of beta CRD api
Also refactor common deploy and status checks to use internal methods

Signed-off-by: Ram Cohen <[email protected]>
@RamCohen RamCohen force-pushed the kafka_invalid_offset branch from 34713b8 to e9b8e01 Compare April 5, 2022 11:56
Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot @RamCohen!

@zroubalik
Copy link
Member

zroubalik commented Apr 7, 2022

/run-e2e kafka*
Update: You can check the progres here

@akshay201
Copy link

akshay201 commented Jun 8, 2023

@zroubalik What is the expected behaviour now.
If we do not have any consumer group created for a topic and new messages have arrived in topic, in this case I suppose invalid offset will be returned and keda will return 0 or scale to 0 if scaleToZeroOnInvalidOffset is set to true. Doubt is then if there are no consumers present, there would be no consumer group and offset will be invalid always. How is the problem related to this is solved with this so that consumers are scaled up or keda is activated on invalid offset but messages present in topic?

We want to scale consumers to 0 when there are no messages in topic and scale back consumers if there any new messages.

Apologies if I did not understand or if I am missing something here. Thanks.

@pnorth1
Copy link

pnorth1 commented Jun 8, 2023

@akshay201 We faced the same problem as you. I can share the main approaches we considered. I'm also curious to hear from others on how they've worked around this.

  1. One option we considered was to configure the ScaledObject to use a custom metrics endpoint. One could implement logic to connect to the kafka brokers and measure consumer group lag and/or available messages and return a metric on the http endpoint that the ScaledObject is pointing to.

  2. The option we ultimately decided to implement was to have a separate process connect to the Kafka brokers and check if a topic had a valid consumer group lag. If there was not a valid consumer group lag (the consumers have not yet committed any offsets), then the process would publish a set of dummy-messages to the topic. Consumers knew how to digest and ignore these no-op messages, but it still allowed the consumers to commit offsets. In this case, we left scaleToZeroOnInvalidOffset to false so that an initial consumer would exist to commit offsets for the dummy messages.

During exploration we wondered if it was possible to manually commit offsets from the consumer's Kafka client, but I believe the broker's rejected the commit since it was trivial (committing to offset 0 for all partitions). We never really determined if this was a product of the kafka client we were using, or a broker setting, or totally unavoidable. So, that could be something quick to try as well.

@akshay201
Copy link

akshay201 commented Jun 9, 2023

@pnorth1 It seems like issue is resolved from the documentation of latest version
https://keda.sh/docs/2.0/scalers/apache-kafka/#new-consumers-and-offset-reset-policy

Did you try new version?

Update is there from 2.x+ version.

If anyone else tried the same I would like to hear if it worked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kafka scaler reports -2 and scale to maxReplicas
7 participants