-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: added prober package to generalize readiness probes #1242
Conversation
cb9b27f
to
a58e510
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks for splitting. This would much much faster then in review!
Some initial comments, need to run home (:
pkg/component/component.go
Outdated
@@ -6,6 +6,11 @@ import ( | |||
"github.com/improbable-eng/thanos/pkg/store/storepb" | |||
) | |||
|
|||
// Component is a generic component interface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing trailing period
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, should be added
pkg/prober/prober.go
Outdated
} | ||
|
||
// SetLogger sets logger used by the Prober. | ||
func (p *Prober) SetLogger(logger log.Logger) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it was needed formerly when there was mutex for logger as well but you suggested to drop it and I just didn't notice there is no need for the getter/setter now without it.
Thanks for pointing out, will remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
pkg/prober/prober.go
Outdated
p.logger = logger | ||
} | ||
|
||
func (p *Prober) getLogger() log.Logger { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed, more info above
pkg/prober/prober.go
Outdated
func NewProber(component component.Component, logger log.Logger) *Prober { | ||
initialErr := fmt.Errorf(initialErrorText, component) | ||
prober := &Prober{} | ||
prober.SetComponent(component) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prober should not use public method TBH (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just create prober with filled struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely, just out of curiosity why should be this avoided? Is that generic thing in Golang not to use own public methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed also calls to SetNotReady
and SetNotHealthy
which caused that the mutex locking is gone but since it's the instantiation of the struct it should be ok I hope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because there is no need for this. Literally we will never in our code require dynamic reload of component. We don't have components that transforms into another components right?
Every bit of genercity is introducing complexity. Since we don't need that now (YAGNI rule), we can avoid that complexity, right? (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thanks for for the explanation :)
b054f62
to
c2961f1
Compare
@bwplotka thanks fot the comments! Should be all resolved now I hope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better (:
Still some comments! Let's aim for perfection.
pkg/prober/prober.go
Outdated
// NewProber returns Prober representing readiness and healthiness of given component. | ||
func NewProber(component component.Component, logger log.Logger) *Prober { | ||
initialErr := fmt.Errorf(initialErrorFmt, component) | ||
prober := &Prober{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need this variable, IMO just return &Prober
pkg/prober/prober.go
Outdated
return prober | ||
} | ||
|
||
// RegisterInRouter registers readiness and liveness probes to mux. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// RegisterInRouter registers readiness and liveness probes to mux. | |
// RegisterInRouter registers readiness and liveness probes to router. |
pkg/prober/prober.go
Outdated
|
||
// IsReady returns error if component is not ready and nil if it is. | ||
func (p *Prober) IsReady() error { | ||
p.readyMtx.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's Move to sync.RWMutex
and do Rlock
and RUnlock
here (: There will be more readers than writerrs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks I didn't know there is also RWMutex, TIL :)
pkg/prober/prober.go
Outdated
} | ||
} | ||
|
||
// IsReady returns error if component is not ready and nil if it is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// IsReady returns error if component is not ready and nil if it is. | |
// IsReady returns error if component is not ready and nil otherwise |
pkg/prober/prober.go
Outdated
|
||
// SetReady sets components status to ready. | ||
func (p *Prober) SetReady() { | ||
if p.IsReady() != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong I think as those two methods are locked so you can have a race. IF you need this log line, please add this to SetNotReady
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, just so I understand where the race is. The IsReady
locks but once finished it again unlocks it. And only after it returns and it does not return nil
the SetNotReady
is called which again locks and unlocks. Which part could go wrong here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, now I get it. Meanwhile I unlock it the status of prober could change.
pkg/prober/prober.go
Outdated
|
||
// SetNotReady sets components status to not ready with given error as a cause. | ||
func (p *Prober) SetNotReady(err error) { | ||
if err != nil && p.IsReady() == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is wrong from the same reason above, there can be a race.
Please lock here and then check p.readiness
instead to check what to log. What do you think? (:
pkg/prober/prober.go
Outdated
|
||
// SetHealthy sets components status to healthy. | ||
func (p *Prober) SetHealthy() { | ||
if p.IsHealthy() != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly the same issue as for Readiness.
Let's move to RWLock
and log under lock as well (it's only on write so should be fine)
pkg/prober/prober.go
Outdated
// HandleIfReady if probe is ready calls the function otherwise returns 503. | ||
func (p *Prober) HandleIfReady(f http.HandlerFunc) http.HandlerFunc { | ||
return func(w http.ResponseWriter, r *http.Request) { | ||
p.readyMtx.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not reusing p.IsReady
? (: In this case it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially that we don't know how long f
is and what it does from this function perspecitve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. This could cause serious issues.
c2961f1
to
f4c6453
Compare
@bwplotka thank you for all the valuable comments/suggestions and for answering all my stupid questions so I can learn from it, I really appreciate it. Is it ok with you this way? |
No worries, just tell if you are rdy for next review iteration (: I think it's rdy now? |
Yes, it should be ready I hope. |
f4c6453
to
06ed8e1
Compare
@bwplotka if you find a while and could take a look at this? It's rebased and I added metrics for observing the Prober state if that is ok with you
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FUSAKLA 🎉 Amazing code right now 👍
Super clean and with attention to details - I love it.
I have some more suggestions let me know what you think. Otherwise LGTM! Thanks.
pkg/prober/prober.go
Outdated
const ( | ||
healthyEndpointPath = "/-/healthy" | ||
readyEndpointPath = "/-/ready" | ||
okProbeFmt = "thanos %v is %v" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we inline all constants that are used only once? (: There is no point of those if they are not references more than once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, inlined them probably will be easier to read 👍
pkg/prober/prober.go
Outdated
} | ||
|
||
func (p *Prober) writeResponse(w http.ResponseWriter, probeFunc func() error, probeType string) { | ||
err := probeFunc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can do if err := probeFn(); err != nil
here (: Also we usually do Fn
suffix for function vars so probeFn
would be nice? (: What do you think?
pkg/prober/prober.go
Outdated
http.Error(w, fmt.Sprintf(errorProbeFmt, p.component, probeType, err), probeErrorHTTPStatus) | ||
return | ||
} | ||
if _, e := io.WriteString(w, fmt.Sprintf(okProbeFmt, p.component, probeType)); e == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please let's keep err
as error variable name! (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Becuse of introducing the new name there is a bug here -> you log "err", err
not "err", e
as it should be (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for noticing, changed to err
pkg/prober/prober.go
Outdated
healthiness: initialErr, | ||
readiness: initialErr, | ||
readyStateMetric: prometheus.NewGauge(prometheus.GaugeOpts{ | ||
Name: fmt.Sprintf("thanos_%s_ready", component), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH instead of that I would just do `prober_ready", and then:
- have
label
component. This will allow for better, potential aggregations in future. - you can have just
prober_ready
and do not specifythanos
prefix. This allows to reuse this package outside of Thanos even (why not)!
Thanos can then on caller side do prometheus.WrapRegistererWithPrefix("thanos"
(:
What do you think?
same for prober_healthy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about the naming a lot bud than looked to the rest of packages and found everywhere hardcoded the thanos prefix so I wanted to stay consistent but I agree your suggestion is more variable.
Applied as suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we did not think about that in initial implementation, but now I would aim for no thanos_
prefix inside packages.
Ideally only main adds those
BTW nice idea with metric 👍 |
@bwplotka thanks for those comments! All should be resolved as you suggested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thanks for this 👍
This PR adds prober package/component which will unify readiness and health checks of all thanos components.
For more info please refer to the PR #656