receiver: Error happens when Thanos receiver uploads data to object storage #4325

Closed
btdan opened this issue Jun 10, 2021 · 4 comments

btdan commented Jun 10, 2021

Thanos receiver receives data from Prometheus successfully and also saves it to local disk, but it intermittently fails to upload the data to object storage: some uploads succeed, while others fail.
The relevant log output from Thanos receiver is as follows:

level=info ts=2021-06-10T01:51:26.221171953Z caller=shipper.go:337 component=receive component=multi-tsdb tenant=default-tenant msg="upload new block" id=01F7R8VR019A876FPJF03NETNK

http://10.0.90.203/api/v1/list/bucket/chaosuan/?delimiter=%2F&prefix=01F7R8VR019A876FPJF03NETNK
level=warn ts=2021-06-10T01:51:27.423358333Z caller=receive.go:567 component=receive component=uploader msg="upload failed" elapsed=1.232621448s err="upload: upload 01F7R8VR019A876FPJF03NETNK: failed to clean block after upload issue. Partial block in system. Err: upload chunks: upload file /aiops/prometheusData/metricsData/default-tenant/thanos/upload/01F7R8VR019A876FPJF03NETNK/chunks/000001 as 01F7R8VR019A876FPJF03NETNK/chunks/000001: failed to upload(multipart) 01F7R8VR019A876FPJF03NETNK/chunks/000001: UploadOneChunk failed,父路径(01F7R8VR019A876FPJF03NETNK/chunks)不存在,或路径有误: upload chunks: upload file /aiops/prometheusData/metricsData/default-tenant/thanos/upload/01F7R8VR019A876FPJF03NETNK/chunks/000001 as 01F7R8VR019A876FPJF03NETNK/chunks/000001: failed to upload(multipart) 01F7R8VR019A876FPJF03NETNK/chunks/000001: UploadOneChunk failed,父路径(01F7R8VR019A876FPJF03NETNK/chunks)不存在,或路径有误"
level=warn ts=2021-06-10T01:51:27.423464116Z caller=receive.go:621 component=receive component=uploader msg="recurring upload failed" err="upload: upload 01F7R8VR019A876FPJF03NETNK: failed to clean block after upload issue. Partial block in system. Err: upload chunks: upload file /aiops/prometheusData/metricsData/default-tenant/thanos/upload/01F7R8VR019A876FPJF03NETNK/chunks/000001 as 01F7R8VR019A876FPJF03NETNK/chunks/000001: failed to upload(multipart) 01F7R8VR019A876FPJF03NETNK/chunks/000001: UploadOneChunk failed,父路径(01F7R8VR019A876FPJF03NETNK/chunks)不存在,或路径有误: upload chunks: upload file /aiops/prometheusData/metricsData/default-tenant/thanos/upload/01F7R8VR019A876FPJF03NETNK/chunks/000001 as 01F7R8VR019A876FPJF03NETNK/chunks/000001: failed to upload(multipart) 01F7R8VR019A876FPJF03NETNK/chunks/000001: UploadOneChunk failed,父路径(01F7R8VR019A876FPJF03NETNK/chunks)不存在,或路径有误"

In the log above, 父路径 means "parent path".
不存在,或路径有误 means "does not exist, or the path is wrong".
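
One possible reading of this error is that the store enforces directory semantics, so an object such as `01F7R8VR019A876FPJF03NETNK/chunks/000001` can only be written once `01F7R8VR019A876FPJF03NETNK/chunks` exists. The Go sketch below only illustrates that idea. `memStore`, `MakeDir`, `Put`, `parentPrefixes` and `putWithParents` are hypothetical names introduced here; they are not part of Thanos or of the real API of the store in this report, whose behavior is assumed rather than known.

```go
package main

import (
	"fmt"
	"strings"
)

// memStore is a hypothetical in-memory stand-in for a store with directory
// semantics: Put fails unless the parent directory was created first.
type memStore struct {
	dirs    map[string]bool
	objects map[string][]byte
}

func newMemStore() *memStore {
	return &memStore{dirs: map[string]bool{}, objects: map[string][]byte{}}
}

// MakeDir records dir as existing. It is idempotent.
func (s *memStore) MakeDir(dir string) error {
	s.dirs[dir] = true
	return nil
}

// Put stores data under key, rejecting keys whose parent directory is missing.
func (s *memStore) Put(key string, data []byte) error {
	if i := strings.LastIndex(key, "/"); i >= 0 && !s.dirs[key[:i]] {
		return fmt.Errorf("parent path (%s) does not exist", key[:i])
	}
	s.objects[key] = data
	return nil
}

// parentPrefixes returns every ancestor "directory" of key, shortest first.
func parentPrefixes(key string) []string {
	parts := strings.Split(strings.Trim(key, "/"), "/")
	var parents []string
	for i := 1; i < len(parts); i++ {
		parents = append(parents, strings.Join(parts[:i], "/"))
	}
	return parents
}

// putWithParents creates all ancestors of key before writing the object.
func putWithParents(s *memStore, key string, data []byte) error {
	for _, dir := range parentPrefixes(key) {
		if err := s.MakeDir(dir); err != nil {
			return err
		}
	}
	return s.Put(key, data)
}
```

With this stand-in, a direct `Put` of `01F7R8VR019A876FPJF03NETNK/chunks/000001` fails with a "parent path does not exist" error, while `putWithParents` succeeds. If the real store behaves similarly, an intermittent failure could point at a race between directory creation and the chunk upload, but that is a guess.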

The command used to start Thanos receiver is as follows:
thanos receive \
--tsdb.path "/aiops/prometheusData/metricsData" \
--grpc-address 0.0.0.0:10907 \
--http-address 0.0.0.0:10909 \
--receive.replication-factor 1 \
--label "receive_replica=\"0\"" \
--label "receive_cluster=\"aiopssection\"" \
--receive.local-endpoint 10.0.90.202:10907 \
--receive.hashrings-file /aiops/hashring.json \
--remote-write.address 0.0.0.0:10908 \
--objstore.config-file "/aiops/bucket.yml" > thanos_receive.log 2>&1 &

Thanos, Prometheus and Golang version used:

Output of "thanos --version":
thanos, version 0.21.0-dev (branch: iharbor, revision: d4a0751 943fbfc78add079)
build user: root@Thanos
build date: 20210514-02:07:40
go version: go1.16.1
platform: linux/amd64

Object Storage Provider: a custom object storage service, similar to S3

What happened: Thanos receiver fails to upload data to object storage sometimes.

What you expected to happen: Thanos receiver should always upload data to object storage successfully.

How to reproduce it (as minimally and precisely as possible):

Full logs to relevant components:

Environment:

  • OS (e.g. from /etc/os-release):
    NAME="CentOS Stream"
    VERSION="8"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="8"
    PLATFORM_ID="platform:el8"
    PRETTY_NAME="CentOS Stream 8"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:8"
    HOME_URL="https://centos.org/"
    BUG_REPORT_URL="https://bugzilla.redhat.com/"
    REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
    REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

  • Kernel (e.g. uname -a):
    Linux Thanos 4.18.0-294.el8.x86_64 #1 SMP Mon Mar 15 22:38:42 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Author

btdan commented Jun 10, 2021

Visiting the URL http://10.0.90.203/api/v1/list/bucket/chaosuan/?delimiter=%2F&prefix=01F7R8VR019A876FPJF03NETNK
returns this response:

Api Root List Bucket Object Instance

List Bucket Object Instance
List objects and directories in the bucket

GET /api/v1/list/bucket/chaosuan/?delimiter=%2F&prefix=01F7R8VR019A876FPJF03NETNK
HTTP 404 Not Found
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept

{
"code": "NoSuchKey",
"message": "无效的参数prefix,对象或目录不存在"
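}

The message 无效的参数prefix,对象或目录不存在 means "invalid parameter prefix; the object or directory does not exist".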
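
For comparison, Amazon S3 answers a list request for a prefix with no matching objects with HTTP 200 and an empty result, so a client written against S3 semantics may not expect a 404 here. The Go sketch below shows one way a client for this store could read the `NoSuchKey` response above as "nothing under this prefix yet" instead of a hard failure. `apiError` and `listIsEmpty` are hypothetical names, and the sketch assumes the store always returns the JSON error shape shown above; it is not how Thanos or the store's actual client is implemented.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// apiError mirrors the JSON error body shown in the response above.
type apiError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// listIsEmpty reports whether a list response should be read as
// "no objects under this prefix". Any other non-200 response is an error.
func listIsEmpty(status int, body []byte) (bool, error) {
	switch status {
	case http.StatusOK:
		// A normal listing; the caller parses the entries itself.
		return false, nil
	case http.StatusNotFound:
		var e apiError
		if err := json.Unmarshal(body, &e); err == nil && e.Code == "NoSuchKey" {
			// Assumed to mean the prefix simply has no objects yet.
			return true, nil
		}
	}
	return false, fmt.Errorf("list failed: HTTP %d: %s", status, body)
}
```

A caller would pass the status code and body of the list response to `listIsEmpty` and continue with the upload when it returns `true`. Whether that is safe depends on how the store reports a genuinely wrong bucket or path, which this report does not show.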

Contributor

yeya24 commented Jun 10, 2021

What kind of object storage are you using? Does "custom object store" mean that it was developed by your company and is S3-compatible?

If that is the case, I suspect this is caused by bugs in your custom object store.


stale bot commented Aug 10, 2021

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Aug 10, 2021

stale bot commented Aug 24, 2021

Closing for now as promised, let us know if you need this to be reopened! 🤗

@stale stale bot closed this as completed Aug 24, 2021