query: Misleading message when there are no metrics during the announced timerange of a store #4478

Closed
amine250 opened this issue Jul 23, 2021 · 7 comments

@amine250

Thanos, Prometheus and Golang version used:

Item Version
Thanos 0.22.0
Revision 3656e29
GoVersion go1.16.6

Object Storage Provider:

AWS S3

What happened:

Wanting to time shard the store component, I've configured --max-time=-7d and --min-time=-30d in one of the stores. No metrics have been scraped during this timerange.

While the date and the time range are valid, the Querier shows that the max time is invalid.

[image: screenshot of the stores page in the Querier]

What you expected to happen:

Show the Max Time that has been configured in the command line.

How to reproduce it (as minimally and precisely as possible):

  1. Set up a store component with a --max-time and --min-time during which there are no metrics in the object storage.
  2. Set up a querier with a --store pointing to this newly created store component.
  3. Check the stores page in the Querier.

Full logs to relevant components:

Querier logs

level=info ts=2021-07-23T15:01:08.74084361Z caller=http.go:63 service=http/server component=query msg="listening for requests and metrics" address=0.0.0.0:10902
level=info ts=2021-07-23T15:01:08.740911872Z caller=intrumentation.go:48 msg="changing probe status" status=ready
level=info ts=2021-07-23T15:01:08.74108337Z caller=tls_config.go:191 service=http/server component=query msg="TLS is disabled." http2=false
level=info ts=2021-07-23T15:01:08.741155382Z caller=grpc.go:123 service=gRPC/server component=query msg="listening for serving gRPC" address=0.0.0.0:10901
level=info ts=2021-07-23T15:01:13.744004411Z caller=storeset.go:463 component=storeset msg="adding new storeAPI to query storeset" address=10.170.52.40:10901 extLset=    

Store logs

level=info ts=2021-07-23T14:58:58.570319278Z caller=factory.go:46 msg="loading bucket configuration"
level=info ts=2021-07-23T14:58:58.570639964Z caller=inmemory.go:172 msg="created in-memory index cache" maxItemSizeBytes=131072000 maxSizeBytes=262144000 maxItems=maxInt 
level=info ts=2021-07-23T14:58:58.571011047Z caller=options.go:24 protocol=gRPC msg="disabled TLS, key and cert must be set to enable"
level=info ts=2021-07-23T14:58:58.57198871Z caller=store.go:428 msg="starting store node"
level=info ts=2021-07-23T14:58:58.572154776Z caller=intrumentation.go:60 msg="changing probe status" status=healthy
level=info ts=2021-07-23T14:58:58.572185718Z caller=http.go:63 service=http/server component=store msg="listening for requests and metrics" address=0.0.0.0:10902
level=info ts=2021-07-23T14:58:58.572451601Z caller=tls_config.go:191 service=http/server component=store msg="TLS is disabled." http2=false
level=info ts=2021-07-23T14:58:58.572504159Z caller=store.go:363 msg="initializing bucket store"
level=info ts=2021-07-23T14:58:59.029327473Z caller=fetcher.go:476 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=456.792995ms cached=81 returned=0 partial=0
level=info ts=2021-07-23T14:58:59.029454706Z caller=store.go:369 msg="bucket store ready" init_duration=456.927721ms
level=info ts=2021-07-23T14:58:59.029738907Z caller=intrumentation.go:48 msg="changing probe status" status=ready
level=info ts=2021-07-23T14:58:59.02980071Z caller=grpc.go:123 service=gRPC/server component=store msg="listening for serving gRPC" address=0.0.0.0:10901
level=info ts=2021-07-23T14:58:59.095072798Z caller=fetcher.go:476 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=65.597957ms cached=81 returned=0 partial=0
level=info ts=2021-07-23T15:01:59.236357669Z caller=fetcher.go:476 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=206.760051ms cached=83 returned=0 partial=0
level=info ts=2021-07-23T15:04:59.182740606Z caller=fetcher.go:476 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=152.725687ms cached=83 returned=0 partial=0
level=info ts=2021-07-23T15:07:59.179752613Z caller=fetcher.go:476 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=149.906207ms cached=83 returned=0 partial=0

@bill3tt
Contributor

bill3tt commented Aug 9, 2021

@amine250 that indeed sounds like a bug in query - help wanted.

@jmichalek132
Contributor

I did some digging into this. The issue is that the frontend gets the min/max time it shows from the InfoResponse returned by the store:

type InfoResponse struct {

which is set to:

mint = math.MaxInt64
maxt = math.MinInt64

in this function

func (s *BucketStore) TimeRange() (mint, maxt int64) {

when no data is available in the bucket during the specified --min-time and --max-time.
So the solution could be to pass the values of those args as part of the InfoResponse?
WDYT @ianbillett?

@bwplotka
Member

bwplotka commented Sep 30, 2021

To me, this info was always supposed to be the actual expected time range of ANY data from this component.

This means that we should probably add that as a comment to the StoreAPI protocol too. To describe it, let's look at the following examples:

  1. --min-time=-10 --max-time=-5 and zero actual data. To me, Info should give min: -Inf, max: -Inf or some other "sentinel" value that the frontend can describe as "well, there is no data there". An additional reason this is useful is that the Querier will then never query this store at all, which is what we want, no?
  2. --min-time=-10 --max-time=-5 and lots of data between -20 and 10: Info should give min: -10, max: -5.
  3. --min-time=-10 --max-time=-5 but data only between -10 and -7: Info should give min: -10, max: -7, because there is no point in the Querier asking for data at -5, for example, if there is nothing there.

@bwplotka
Member

Anyway, this issue is valid. IMO Action Items:

  1. Clarify this contract in the Info API
  2. Fix all implementations to apply it

@stale

stale bot commented Jan 9, 2022

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Jan 9, 2022
@stale

stale bot commented Mar 2, 2022

Closing for now as promised, let us know if you need this to be reopened! 🤗

@stale stale bot closed this as completed Mar 2, 2022
@matej-g
Collaborator

matej-g commented Mar 3, 2022

This has been addressed in #4908 as well

5 participants