Add thanos objmeta component for large workload. (#6468) #6553
base: main
scripts/quickstart.sh
Outdated
@@ -307,17 +346,37 @@ done

sleep 0.5

-if [ -n "${GCS_BUCKET}" -o -n "${S3_ENDPOINT}" ]; then
+if [ -n "${GCS_BUCKET}" -o -n "${S3_ENDPOINT}" -o -n "${LOCAL_BUCKET_ENABLED}" ]; then
SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined.
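A minimal sketch of the form SC2166 suggests, splitting the `-o` chain into separate `[ ]` tests joined with `||`. The variable names come from the diff above; the `bucket_configured` helper is hypothetical and not part of quickstart.sh.

```shell
#!/bin/sh
# bucket_configured is a hypothetical helper, not part of quickstart.sh.
# It succeeds when at least one object-storage variable is non-empty.
# Each condition gets its own [ ] test, joined with ||, instead of -o.
bucket_configured() {
  [ -n "${GCS_BUCKET:-}" ] || [ -n "${S3_ENDPOINT:-}" ] || [ -n "${LOCAL_BUCKET_ENABLED:-}" ]
}

if bucket_configured; then
  echo "object storage configured"
fi
```

The `:-` defaults keep the tests safe if the script is ever run with `set -u`.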
@hanjm Thanks for the work. Do you have any benchmark/data to share about how much this new component helps?
@yeya24 Hi, I use it in the compactor and Thanos store, and it eliminates all the operation='exist' requests against object storage. The compactor runs faster and the Thanos store syncs faster. The benchmark results were measured locally.
With this change #6474, we should have less
It does not solve this problem for large workloads. Using objstore.WithRecursiveIter will lead to less when the number of blocks grows; the iteration result is cached as a single value, which is large and cannot be set successfully in the remote cache backend.
That's really interesting feedback. Do you have some rough numbers about how big the result of a recursive
Yeah, this seems a bit overkill. We've heard feedback multiple times that Thanos is already too hard to understand. Having a component like this adds significant complexity. Perhaps we could fix caching instead? Redis/memcached/etc. seem perfectly fitted for such a use case IMHO. For example, we could use Redis lists to store this data.
A block ID is 26 bytes long, so the IDs of 1 million blocks, for example, take about 26 MB (just under 25 MiB). That is too large for a single value in a remote cache.
For large workloads it is not, so this is an optional component needed only for large workloads. thanos/docs/components/objmeta.md Line 3 in 7c31615
Maybe, but that does not solve it completely.
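The size estimate in this reply can be reproduced with a few lines of shell arithmetic. This is only a sketch: the 26-byte ID length and the 1 million block count come from the comment, while the 1 MiB per-item cache limit is an assumption chosen for illustration (it matches memcached's default, but your backend may be configured differently).

```shell
#!/bin/sh
BLOCK_ID_BYTES=26               # length of one block ID, from the comment
NUM_BLOCKS=1000000              # example block count, from the comment
CACHE_ITEM_LIMIT_BYTES=1048576  # assumption: 1 MiB per cached item

# Size of the raw ID listing, ignoring separators and encoding overhead.
TOTAL_BYTES=$((BLOCK_ID_BYTES * NUM_BLOCKS))
# Integer division, so the MiB figure is rounded down.
TOTAL_MIB=$((TOTAL_BYTES / 1048576))

echo "block ID listing: ${TOTAL_BYTES} bytes (~${TOTAL_MIB} MiB, rounded down)"
if [ "${TOTAL_BYTES}" -gt "${CACHE_ITEM_LIMIT_BYTES}" ]; then
  echo "listing exceeds the assumed per-item cache limit"
fi
```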
quickstart.sh support thanos objmeta support delete-marker and no-compact-marker, multi backend make backend as interface Signed-off-by: Jimmie Han <[email protected]>
change redis client fix bucket construct Signed-off-by: Jimmie Han <[email protected]>
8db7276 to 1692dc4
1692dc4 to c4b94cc
Following #6468 and the discussion on Slack, we think it is necessary to manage block metadata in a more efficient and cheaper way when the number of Thanos blocks grows very large.
Possibly related issues from the past: #3712 #3150
thanos/pkg/shipper/shipper.go
Lines 195 to 196 in cdba35b
thanos/pkg/block/fetcher.go
Lines 320 to 321 in cdba35b
thanos/pkg/block/fetcher.go
Lines 857 to 859 in cdba35b
thanos/pkg/compact/compact.go
Lines 1537 to 1540 in cdba35b
thanos/pkg/store/cache/caching_bucket.go
Lines 155 to 165 in 40e1b2d
Changes
This PR adds an optional component, objmeta, which allows fast iteration over all blocks and fast retrieval of block metadata, saving object storage costs.
NOTE: This component is only needed when your Thanos workload is very large. How large is large? When the number of blocks in object storage exceeds 100,000, or when HEAD requests to object storage exceed 10k per second.
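As a rough illustration of the rule of thumb in the note, the check below compares a deployment's numbers against the two thresholds. The `needs_objmeta` helper and the sample inputs are hypothetical; nothing like it ships with Thanos, and you would supply the block count and HEAD request rate from your own monitoring.

```shell
#!/bin/sh
# Thresholds quoted from the note above.
BLOCK_COUNT_THRESHOLD=100000
HEAD_RPS_THRESHOLD=10000

# needs_objmeta is a hypothetical helper, not part of Thanos.
# $1: number of blocks in object storage
# $2: HEAD requests per second against object storage
needs_objmeta() {
  [ "$1" -gt "${BLOCK_COUNT_THRESHOLD}" ] || [ "$2" -gt "${HEAD_RPS_THRESHOLD}" ]
}

# Sample inputs, chosen only for illustration.
if needs_objmeta 250000 500; then
  echo "objmeta is worth considering"
else
  echo "objmeta is probably unnecessary"
fi
```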
Verification
Benchmark