store: fails with "unexpected fault address" #2155

Closed
anas-aso opened this issue Feb 19, 2020 · 1 comment

Comments

@anas-aso
Contributor

Thanos, Prometheus and Golang version used:
Thanos: thanosio/thanos:master-2020-02-18-a354bfba
Prometheus: prom/prometheus:v2.15.1

Object Storage Provider:
GCP

What happened:
Thanos Store crashed with:

fatal error: fault
   unexpected fault address ....

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):
I switched Thanos Store in our testing environment to the latest master Docker build, master-2020-02-18-a354bfba, and enabled the new experimental flag --experimental.enable-index-header.

I am not sure how to reproduce it, since Thanos Store was under low load. Switching to the latest master with the experimental flag was the only change I made.

PS: It has happened only once in the last 24 hours.

Full logs to relevant components:

Logs

fatal error: fault
unexpected fault address 0x7fa0afc0fba2
  /usr/local/go/src/runtime/alg.go:171 +0xb5 fp=0xc00e2a5ad0 sp=0xc00e2a5a88 pc=0x403325
  /usr/local/go/src/runtime/map.go:1177 +0x3f1 fp=0xc00e2a5b88 sp=0xc00e2a5ad0 pc=0x40fd41
github.com/thanos-io/thanos/pkg/store/cache.(*InMemoryIndexCache).set(0xc0009e7480, 0x1e83a03, 0x8, 0xb93e6dc93b4c7001, 0x32d9891877bdb237, 0x1c5e080, 0xc008d4d5a0, 0xc0108809b0, 0x30, 0xd9450)
  /go/src/github.com/thanos-io/thanos/pkg/store/cache/inmemory.go:224 +0x257 fp=0xc00e2a5d78 sp=0xc00e2a5c98 pc=0xed7097
  /usr/local/go/src/runtime/map.go:1117 +0x94 fp=0xc00e2a5bb0 sp=0xc00e2a5b88 pc=0x40f934
runtime.goexit()
  /go/pkg/mod/github.com/hashicorp/[email protected]/simplelru/lru.go:62 +0x26d fp=0xc00e2a5c98 sp=0xc00e2a5c38 pc=0xed372d
  /go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:54 +0x66
goroutine 34 [select]:
created by go.opencensus.io/stats/view.init.0

  /usr/local/go/src/runtime/sigqueue.go:147 +0x9c
goroutine 39 [syscall, 988 minutes]:
goroutine 24 [chan receive]:
created by os/signal.init.0
  /go/pkg/mod/[email protected]/stats/view/worker.go:32 +0x57

aeshashbody()
[...]

I cropped the logs because the full output is more than 3000 lines, most of which come from the SIGTERM/SIGKILL that Kubernetes sent because Thanos Store was unresponsive.

Anything else we need to know:
Thanos is running on a GKE cluster.

@squat
Member

squat commented Feb 19, 2020

Hi @anas-aso, this issue looks like a duplicate of #2147. A fix has just been merged via #2151. Please try master to verify that your issue is resolved :)

@squat closed this as completed Feb 19, 2020