Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

objstore : implement Aliyun OSS #1573

Merged
merged 9 commits into from
Oct 24, 2019
Merged

objstore : implement Aliyun OSS #1573

merged 9 commits into from
Oct 24, 2019

Conversation

wujinhu
Copy link
Contributor

@wujinhu wujinhu commented Sep 26, 2019

Open this PR as discussed in #1234 , will update soon.

@wujinhu
Copy link
Contributor Author

wujinhu commented Sep 26, 2019

@bwplotka It seems a little difficult to make commits signed off, there is a lot of conflicts when I run the rebase command. I think it will be easier if I revert all those commits and submit a new one. And I will reference #1234 in the new one. What do you think of this?

@woodliu
Copy link

woodliu commented Oct 14, 2019

@wujinhu What is the status now? This guy shaulboozhiao has deleted the code he had...
We are going to use Ali oss to support thanos, but can't find the supported code.

@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 14, 2019

@woodliu Sorry for the delay, will update this week.

@woodliu
Copy link

woodliu commented Oct 14, 2019

@wujinhu What is the status now? This guy shaulboozhiao has deleted the code he had...
We are going to use Ali oss to support thanos, but can't find the supported code.

I have try and merge the history code from your repo, it works, there's not much files changed, it seems that merge manually is a better way.

@woodliu Sorry for the delay, will update this week.

I have try and merge the history code from your repo, it works, there's not much files changed, it seems that merge manually is a better way.

@bwplotka
Copy link
Member

Any luck? (: Feel free to squash the original commits into one.

@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 18, 2019

revert commits and apply patch!

@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 18, 2019

Sth is wrong with the patch, I am fixing!

@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 18, 2019

@bwplotka @jojohappy Please help review this PR, thanks!😆

pkg/objstore/oss/oss.go Outdated Show resolved Hide resolved
pkg/objstore/oss/oss.go Show resolved Hide resolved
bucket *alioss.Bucket
}

func NewTestBucket(t testing.TB) (objstore.Bucket, func(), error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code should be in *_test.go files

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this is the idiom for now. This is because _test.go files even exported methods are not exported between packages.

On the other hand this method is really the same across objectstores so we might want to refactor it to one (:

return NewTestBucketFromConfig(t, c, false)
}

func calculateChunks(name string, r io.Reader) (int, int64, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't do this type castin, let's just take an interface which has io.Reader & Size() methods instead of supporting only os.File & strings.Reader types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little ambiguous about this. We take S3 as an example, could you please give me comments in more details. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do this in another PR. As this would require quite a lot of refactoring.

If we change the interface in https://github.com/thanos-io/thanos/blob/master/pkg/objstore/objstore.go#L26
from:

Upload(ctx context.Context, name string, r io.Reader) error

to

type ReadSizer interface{
   io.Reader
   Size() int64
} 

Upload(ctx context.Context, name string, r ReadSizer) error

We would avoid this hack entirely.

@bwplotka what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is no way to have this working without knowing the size beforehand? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could change interface, but the S3 example is bad - we should never rely on size as it block streaming case. However the truth is there is no case we need streaming currently for write (:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other option is to actually read all the bytes into memory and then we would know the size, which sucks for memory requirements.

I would prefer to have ReadSizer, it actually helps upload case, as the uploader client can split the object into chunks, make it parallel, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's the thing, for streaming purposes you buffer let's say 128KB then if you buffer more you know you need multipart. While you traverse the file you stream parts until end of the reader (: I think that's what GCS is doing for example. Essentially you can implement this without knowing size beforehand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact this will be required if we would like to have this: #1673

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you are right. Actually in this case we shouldn't need a Size() method, we can just do what you said and upload parts the way you described.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move this discussion here: #678

pkg/objstore/oss/oss.go Show resolved Hide resolved
pkg/objstore/oss/oss.go Outdated Show resolved Hide resolved
pkg/objstore/oss/oss.go Outdated Show resolved Hide resolved
pkg/objstore/oss/oss.go Show resolved Hide resolved
pkg/objstore/oss/oss.go Outdated Show resolved Hide resolved
return nil
}

func NewTestBucketFromConfig(t testing.TB, c Config, reuseBucket bool) (objstore.Bucket, func(), error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code should be in *_test.go files

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@povilasv could we refactor obj testing in another PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

pkg/objstore/oss/oss.go Outdated Show resolved Hide resolved
Signed-off-by: wujinhu <[email protected]>
Signed-off-by: wujinhu <[email protected]>
Signed-off-by: wujinhu <[email protected]>
Signed-off-by: wujinhu <[email protected]>
Signed-off-by: wujinhu <[email protected]>
Signed-off-by: wujinhu <[email protected]>
@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 21, 2019

@povilasv @jojohappy Thanks for your comments, they are very helpful, and I updated the PR.

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, generally LGTM, some minor comments only. Very clean code, thanks!

It's sad OSS SDK does not have multipart logic without file size. It indeed gives another reason to change our interface and add size to it. Not for this PR for sure though.

Can you confirm the acceptance e2e is successfully working with this provider? (

func TestObjStore_AcceptanceTest_e2e(t *testing.T) {
)

bucket *alioss.Bucket
}

func NewTestBucket(t testing.TB) (objstore.Bucket, func(), error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this is the idiom for now. This is because _test.go files even exported methods are not exported between packages.

On the other hand this method is really the same across objectstores so we might want to refactor it to one (:

Makefile Outdated Show resolved Hide resolved
@@ -60,6 +60,7 @@ Current object storage client implementations:
| [Azure Storage Account](./storage.md#azure) | Stable (production usage) | yes | @vglafirov |
| [OpenStack Swift](./storage.md#openstack-swift) | Beta (working PoCs, testing usage) | no | @sudhi-vm |
| [Tencent COS](./storage.md#tencent-cos) | Beta (testing usage) | no | @jojohappy |
| [AliYun OSS](./storage.md#aliyun-oss) | Beta (testing usage) | no | @shaulboozhiao,@wujinhu |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -336,3 +337,20 @@ config:
```

Set the flags `--objstore.config-file` to reference to the configuration file.

## AliYun OSS Configuration
In order to use AliYun OSS object storage, you should first create a bucket with proper Storage Class , ACLs and get the access key on the AliYun cloud. Go to [https://www.alibabacloud.com/product/oss](https://www.alibabacloud.com/product/oss) for more detail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What proper Storage Class means?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the same meaning with S3 https://aws.amazon.com/s3/storage-classes/?nc1=h_ls
It defines the storage type of data.

pkg/objstore/oss/oss.go Outdated Show resolved Hide resolved
return NewTestBucketFromConfig(t, c, false)
}

func calculateChunks(name string, r io.Reader) (int, int64, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg/objstore/oss/oss.go Outdated Show resolved Hide resolved
pkg/objstore/oss/oss.go Show resolved Hide resolved
Signed-off-by: wujinhu <[email protected]>
Signed-off-by: wujinhu <[email protected]>
@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 23, 2019

=== RUN TestObjStore_AcceptanceTest_e2e
--- PASS: TestObjStore_AcceptanceTest_e2e (1.68s)
=== RUN TestObjStore_AcceptanceTest_e2e/inmem
--- PASS: TestObjStore_AcceptanceTest_e2e/inmem (0.00s)
foreach.go:49: THANOS_SKIP_GCS_TESTS envvar present. Skipping test against GCS.
foreach.go:70: THANOS_SKIP_S3_AWS_TESTS envvar present. Skipping test against S3 AWS.
foreach.go:86: THANOS_SKIP_AZURE_TESTS envvar present. Skipping test against Azure.
foreach.go:102: THANOS_SKIP_SWIFT_TESTS envvar present. Skipping test against swift.
foreach.go:118: THANOS_SKIP_TENCENT_COS_TESTS envvar present. Skipping test against Tencent COS.
=== RUN TestObjStore_AcceptanceTest_e2e/AliYun_oss
--- PASS: TestObjStore_AcceptanceTest_e2e/AliYun_oss (0.63s)
PASS

Process finished with exit code 0

@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 23, 2019

=== RUN TestBucketStore_e2e
--- PASS: TestBucketStore_e2e (26.03s)
=== RUN TestBucketStore_e2e/inmem
--- PASS: TestBucketStore_e2e/inmem (10.17s)
bucket_e2e_test.go:398: Test with no index cache
bucket_e2e_test.go:372: Run 0
bucket_e2e_test.go:372: Run 1
bucket_e2e_test.go:372: Run 2
bucket_e2e_test.go:372: Run 3
bucket_e2e_test.go:372: Run 4
bucket_e2e_test.go:372: Run 5
bucket_e2e_test.go:372: Run 6
bucket_e2e_test.go:372: Run 7
bucket_e2e_test.go:372: Run 8
bucket_e2e_test.go:372: Run 9
bucket_e2e_test.go:372: Run 10
bucket_e2e_test.go:402: Test with large, sufficient index cache
bucket_e2e_test.go:372: Run 0
bucket_e2e_test.go:372: Run 1
bucket_e2e_test.go:372: Run 2
bucket_e2e_test.go:372: Run 3
bucket_e2e_test.go:372: Run 4
bucket_e2e_test.go:372: Run 5
bucket_e2e_test.go:372: Run 6
bucket_e2e_test.go:372: Run 7
bucket_e2e_test.go:372: Run 8
bucket_e2e_test.go:372: Run 9
bucket_e2e_test.go:372: Run 10
bucket_e2e_test.go:411: Test with small index cache
bucket_e2e_test.go:372: Run 0
bucket_e2e_test.go:372: Run 1
bucket_e2e_test.go:372: Run 2
bucket_e2e_test.go:372: Run 3
bucket_e2e_test.go:372: Run 4
bucket_e2e_test.go:372: Run 5
bucket_e2e_test.go:372: Run 6
bucket_e2e_test.go:372: Run 7
bucket_e2e_test.go:372: Run 8
bucket_e2e_test.go:372: Run 9
bucket_e2e_test.go:372: Run 10
foreach.go:49: THANOS_SKIP_GCS_TESTS envvar present. Skipping test against GCS.
foreach.go:70: THANOS_SKIP_S3_AWS_TESTS envvar present. Skipping test against S3 AWS.
foreach.go:86: THANOS_SKIP_AZURE_TESTS envvar present. Skipping test against Azure.
foreach.go:102: THANOS_SKIP_SWIFT_TESTS envvar present. Skipping test against swift.
foreach.go:118: THANOS_SKIP_TENCENT_COS_TESTS envvar present. Skipping test against Tencent COS.
=== RUN TestBucketStore_e2e/AliYun_oss
--- PASS: TestBucketStore_e2e/AliYun_oss (14.41s)
bucket_e2e_test.go:398: Test with no index cache
bucket_e2e_test.go:372: Run 0
bucket_e2e_test.go:372: Run 1
bucket_e2e_test.go:372: Run 2
bucket_e2e_test.go:372: Run 3
bucket_e2e_test.go:372: Run 4
bucket_e2e_test.go:372: Run 5
bucket_e2e_test.go:372: Run 6
bucket_e2e_test.go:372: Run 7
bucket_e2e_test.go:372: Run 8
bucket_e2e_test.go:372: Run 9
bucket_e2e_test.go:372: Run 10
bucket_e2e_test.go:402: Test with large, sufficient index cache
bucket_e2e_test.go:372: Run 0
bucket_e2e_test.go:372: Run 1
bucket_e2e_test.go:372: Run 2
bucket_e2e_test.go:372: Run 3
bucket_e2e_test.go:372: Run 4
bucket_e2e_test.go:372: Run 5
bucket_e2e_test.go:372: Run 6
bucket_e2e_test.go:372: Run 7
bucket_e2e_test.go:372: Run 8
bucket_e2e_test.go:372: Run 9
bucket_e2e_test.go:372: Run 10
bucket_e2e_test.go:411: Test with small index cache
bucket_e2e_test.go:372: Run 0
bucket_e2e_test.go:372: Run 1
bucket_e2e_test.go:372: Run 2
bucket_e2e_test.go:372: Run 3
bucket_e2e_test.go:372: Run 4
bucket_e2e_test.go:372: Run 5
bucket_e2e_test.go:372: Run 6
bucket_e2e_test.go:372: Run 7
bucket_e2e_test.go:372: Run 8
bucket_e2e_test.go:372: Run 9
bucket_e2e_test.go:372: Run 10
=== RUN TestBucketStore_ManyParts_e2e
--- PASS: TestBucketStore_ManyParts_e2e (17.94s)
=== RUN TestBucketStore_ManyParts_e2e/inmem
--- PASS: TestBucketStore_ManyParts_e2e/inmem (6.57s)
bucket_e2e_test.go:372: Run 0
bucket_e2e_test.go:372: Run 1
bucket_e2e_test.go:372: Run 2
bucket_e2e_test.go:372: Run 3
bucket_e2e_test.go:372: Run 4
bucket_e2e_test.go:372: Run 5
bucket_e2e_test.go:372: Run 6
bucket_e2e_test.go:372: Run 7
bucket_e2e_test.go:372: Run 8
bucket_e2e_test.go:372: Run 9
bucket_e2e_test.go:372: Run 10
foreach.go:49: THANOS_SKIP_GCS_TESTS envvar present. Skipping test against GCS.
foreach.go:70: THANOS_SKIP_S3_AWS_TESTS envvar present. Skipping test against S3 AWS.
foreach.go:86: THANOS_SKIP_AZURE_TESTS envvar present. Skipping test against Azure.
foreach.go:102: THANOS_SKIP_SWIFT_TESTS envvar present. Skipping test against swift.
foreach.go:118: THANOS_SKIP_TENCENT_COS_TESTS envvar present. Skipping test against Tencent COS.
=== RUN TestBucketStore_ManyParts_e2e/AliYun_oss
--- PASS: TestBucketStore_ManyParts_e2e/AliYun_oss (9.57s)
bucket_e2e_test.go:372: Run 0
bucket_e2e_test.go:372: Run 1
bucket_e2e_test.go:372: Run 2
bucket_e2e_test.go:372: Run 3
bucket_e2e_test.go:372: Run 4
bucket_e2e_test.go:372: Run 5
bucket_e2e_test.go:372: Run 6
bucket_e2e_test.go:372: Run 7
bucket_e2e_test.go:372: Run 8
bucket_e2e_test.go:372: Run 9
bucket_e2e_test.go:372: Run 10
=== RUN TestBucketStore_TimePartitioning_e2e
--- PASS: TestBucketStore_TimePartitioning_e2e (6.19s)
bucket_e2e_test.go:520: Run 0
PASS

Process finished with exit code 0

    bucket_e2e_test.go:372: Run  2
    bucket_e2e_test.go:372: Run  3
    bucket_e2e_test.go:372: Run  4
    bucket_e2e_test.go:372: Run  5
    bucket_e2e_test.go:372: Run  6
    bucket_e2e_test.go:372: Run  7
    bucket_e2e_test.go:372: Run  8
    bucket_e2e_test.go:372: Run  9
    bucket_e2e_test.go:372: Run  10

=== RUN TestBucketStore_TimePartitioning_e2e
--- PASS: TestBucketStore_TimePartitioning_e2e (6.19s)
bucket_e2e_test.go:520: Run 0
PASS

Process finished with exit code 0

@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 23, 2019

@bwplotka acceptance_e2e_test.go and bucket_e2e_test.go both are ok.

Copy link
Member

@jojohappy jojohappy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! We can wait @povilasv to review the next round.

return nil, err
}

if _, err := resp.Read(nil); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we reading nil bytes here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I ran oss and s3 tests and there is no need here because oss sdk is not same with s3 sdk.

// NotFoundObject error is revealed only after first Read. This does the initial GetRequest. Prefetch this here
// for convenience.
if _, err := r.Read(nil); err != nil {
	runutil.CloseWithLogOnErr(b.logger, r, "s3 get range obj close")

	// First GET Object request error.
	return nil, err
}

Copy link
Member

@povilasv povilasv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last comment other than that LGTM 👍

Signed-off-by: wujinhu <[email protected]>
@wujinhu
Copy link
Contributor Author

wujinhu commented Oct 23, 2019

@bwplotka @povilasv @jojohappy I think each comment has been addressed. 😆

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Nice work, thanks!

return NewTestBucketFromConfig(t, c, false)
}

func calculateChunks(name string, r io.Reader) (int, int64, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move this discussion here: #678

@bwplotka bwplotka merged commit 66b3d21 into thanos-io:master Oct 24, 2019
GiedriusS pushed a commit that referenced this pull request Oct 28, 2019
* add oss support

Signed-off-by: wujinhu <[email protected]>

* fix docs

Signed-off-by: wujinhu <[email protected]>

* fix Makefile

Signed-off-by: wujinhu <[email protected]>

* review comments

Signed-off-by: wujinhu <[email protected]>

* fix style

Signed-off-by: wujinhu <[email protected]>

* review comments

Signed-off-by: wujinhu <[email protected]>

* review comments

Signed-off-by: wujinhu <[email protected]>

* review comments

Signed-off-by: wujinhu <[email protected]>

* review comments

Signed-off-by: wujinhu <[email protected]>
Signed-off-by: Giedrius Statkevičius <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants