-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Alibaba Cloud backend OSS with lock #16927
Conversation
75d5061
to
4a12685
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for putting the work in to creating the OSS remote state. There's a few issues that need to be addressed before we can evaluate this further.
Since this is an entirely new backend, basing it on s3 isn't entirely appropriate. You may want to look at some newer backends for inspiration too, like manta, or etcdv3. Specifically the handling of the default workspace doesn't need to be a special case, since there are no existing oss remote states to maintain compatibility with. Removing that special case will simplify a lot of this code.
Regarding locks, if the storage layer is fully consistent, we still need a way to conditionally write the lock. Checking for a locks existence then writing the value doesn't provide any safety guarantees on its own.
backend/remote-state/oss/backend.go
Outdated
Default: "", | ||
}, | ||
|
||
"workspace_key_prefix": &schema.Schema{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you used the S3 backend as a template for this. That implementation contains a lot of backwards compatibility shims like workspace_key_prefix
, and probably shouldn't be used as a reference. Since this is an entirely new implementation, it's easier to just treat the default
workspace the same as every other rather than special-case it.
backend/remote-state/oss/backend.go
Outdated
Description: "The name of the OSS bucket", | ||
}, | ||
|
||
"key": &schema.Schema{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You most likely want the configuration to be a path or a prefix, not a specific key, as that can't really work for named states (workspaces).
|
||
// extract the object name from the OSS key | ||
func (b *Backend) keyEnv(key string) string { | ||
// we have 3 parts, the workspace key prefix, the workspace name, and the state key name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was copied from an older version of the s3 backend that still had prefix related bugs.
Removing the special casing of the default workspace and having a single path/prefix removes the need for this.
backend/remote-state/oss/client.go
Outdated
return "", fmt.Errorf("Error getting bucket: %#v", err) | ||
} | ||
|
||
log.Printf("Lock info:%#v", info) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a log level. We haven't transitioned to fully structured logging yet, so you can continue to use the text prefixes, like [DEBUG]
backend/remote-state/oss/client.go
Outdated
} | ||
|
||
var hashChannel = make(chan []byte, 1) | ||
sum := md5.Sum(buf.Bytes()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need for a channel here
backend/remote-state/oss/client.go
Outdated
} | ||
|
||
func (c *RemoteClient) Unlock(id string) error { | ||
log.Printf("UnLock info %s", id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs log level
@xiaozhu36 This feature is absolutely what I want integrated with Alicloud. Just wondering when can we have this pull request merged into the master. 👍 |
fa1096c
to
7cf5584
Compare
Any update on this? Really want to this feature support. 👍 |
HI @georgedriver I have updated it but the ci test happened a problem. Please be patient. Thanks a lot. |
Update on when this might be ready for inclusion in Terraform? Thanks in advance. |
HI @heatmiser Thanks for your attention. I am trying my best for it. Please be patient. HI @jbardin Is there any progress for the PR? Looking forward to your new review. Thanks a lot. |
Hi @xiaozhu36, Sorry about the delay here. The main branch of terraform is frozen at the moment, while we work on some in-depth refactoring for the 0.12 release. Once the 0.12 development reaches a more stable state I can take a look at this PR again ASAP. |
Any update on this? This backend is getting increasingly necessary here, @jbardin |
Any update for this? is there any plan for merge? |
0826485
to
8bb6ebc
Compare
Hi Everyone, Sorry about the delay here. Unfortunately the updated PR missed the freeze we had in core in preparing for the major work being done for 0.12. This PR will be reviewed again in due time, but since there are no further planned 0.11 releases in which to include this, we are waiting until we can provide a stable branch to merge into. Thank you for your patience while we work towards 0.12! |
66c7c72
to
7e6bb98
Compare
@xiaozhu36 @jbardin should I wait or look for workaround, please advise. Thanks |
HI @toamitkumar I think this pr is ok, and if @jbardin still has other suggestion, I can fix it in the next version. Thanks a lot. |
backend/remote-state/oss/client.go
Outdated
|
||
info.Path = c.lockFile | ||
if exist, err := bucket.IsObjectExist(info.Path); err != nil { | ||
return "", fmt.Errorf("Estimating object %s is exist got an error: %#v", info.Path, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error text needs to be corrected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
backend/remote-state/oss/client.go
Outdated
if exist, err := bucket.IsObjectExist(info.Path); err != nil { | ||
return "", fmt.Errorf("Estimating object %s is exist got an error: %#v", info.Path, err) | ||
} else if !exist { | ||
if err := c.putObj(info.Path, infoJson); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the existence then putting the lock object creates a race condition.
Is there some way to write the object and have it rejected the request if it exists? (this is one of the reasons the s3 backend has to delegate locking to DynamoDB). Basically, will putObj
fail if the object already exists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sorry for that now there is no better way, like s3 DynamoDB, to reject the request . At present, the method putObj
will overwrite the existing object, see details. I think this question is not blocking issue for this PR, and this PR can be merged firstly. In the next, I will continue to find the better way like s3 DynamoDB to avoid some request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbardin Do you have other ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to advertise the storage as safe for concurrent use if the locking mechanism isn't actually safe. For now it's probably better to remove the locking altogether so there is no confusion about what this backend can provide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @jbardin Thanks for your suggestion. I want to make the locking mechanism safe by using alibaba cloud tablestore. But, when I tried to make a test using a custom terraform package, I happened a issue: Error: Failed to instantiate provider "alicloud" to obtain schema: Incompatible API version with plugin. Plugin version: 4, Client versions: [5]
. Can you give me some help? Thanks in advance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xiaozhu36, the provider needs to be updated to work with terraform 0.12 (the master branch).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @jbardin Thanks for your feedback. I have updated but the error still exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the error says Plugin version: 4
, that means the plugin you are executing was compiled with an old version of the terraform packages. The only way to fix that is to compile and execute the provider plugin using the latest version of the core package (which for now is master
), and we suggest using go modules to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @jbardin I have add tablestore config to store the state lock. Please review it.
backend/remote-state/oss/client.go
Outdated
return nil, fmt.Errorf("Error getting bucket: %#v", err) | ||
} | ||
|
||
if exist, err := bucket.IsObjectExist(key); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to check existence first?
This of course introduces the possibility of the object being deleted before the GetObject call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. You are right. Even if the object is not exist, DeleteObject will not return a error.
002eb2d
to
7fcbfb1
Compare
HI @jbardin Is there any new comments for this PR? |
Thanks for the updates @xiaozhu36. Before we merge I would ask that you make a separate commit for the module and vendor updates, and remove the stringer changes that you have in here as well. If the ACC tests look OK with the new locking enabled (please add the results here too), we can move on with this PR 👍 |
HI @jbardin Thanks for your review. The following is the PR ACC tests results and please check it:
|
HI @jbardin This is just a reminding that if there is no any questions, please merge it. Thanks a lot. |
Hi @xiaozhu36, @jbardin is currently on vacation, and will review this again when he returns. Because the release process for Terraform v0.12.0 is already in progress, this cannot be merged until after v0.12.0 final is released and any urgent bugs have been addressed in minor releases, but once @jbardin has approved it we'll merge it for a v0.12.x minor release. The "v0.12.1" milestone is a marker for us to revisit this once v0.12.0 is stable; it may actually be a later minor release if we need to make urgent minor releases to address bugs in v0.12.0, but either way we'll use that milestone to mark this for inclusion once the major release is stable. |
HI @apparentlymart Thanks for you feedback. There are many customers are waiting for this PR. I hope this PR can be merged and published asap. If there is a minor release can support it. I think it also can be accepted. |
Are there any news about merging this MR ? cc @jbardin |
This is included in 0.12.2, but the documentation for OSS backend seems not live on https://www.terraform.io/docs/backends/types/index.html |
this is great, thanks for the push guys :) |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
The PR provides an new cloud backend oss that terraform can store state remotely in OSS and lock that state with OSS.
The result of running test case as follows:
TF_ACC=1 go test -v ./backend/remote-state/oss/ -run=TestBackend -timeout 120m
=== RUN TestBackend_impl
--- PASS: TestBackend_impl (0.00s)
=== RUN TestBackendConfig
2017/12/15 18:09:59 [DEBUG] Instantiate OSS client using endpoint: "oss-cn-beijing.aliyuncs.com"
--- PASS: TestBackendConfig (0.26s)
=== RUN TestBackendConfig_invalidKey
--- PASS: TestBackendConfig_invalidKey (0.00s)
=== RUN TestBackend
2017/12/15 18:09:59 [DEBUG] Instantiate OSS client using endpoint: "oss-cn-beijing.aliyuncs.com"
(some logs)
2017/12/15 18:10:03 Delete Object workspaces/foo/testTFState successfully.
--- PASS: TestBackend (4.35s)
backend_test.go:253: creating OSS bucket terraform-remote-oss-test-5a339f77 in http://oss-cn-beijing.aliyuncs.com
=== RUN TestBackendLocked
2017/12/15 18:10:03 [DEBUG] Instantiate OSS client using endpoint: "oss-cn-beijing.aliyuncs.com"
(some logs)
2017/12/15 18:10:08 Delete Object tfTest/mystate.tflock successfully.
--- PASS: TestBackendLocked (5.40s)
backend_test.go:253: creating OSS bucket terraform-remote-oss-test-5a339f7b in http://oss-cn-beijing.aliyuncs.com
backend_test.go:122: TestBackend: testing state locking for *oss.Backend
=== RUN TestBackendExtraPaths
2017/12/15 18:10:09 [DEBUG] Instantiate OSS client using endpoint: "oss-cn-beijing.aliyuncs.com"
(some logs)
2017/12/15 18:10:10 [TRACE] Preserving existing state lineage "ff0bbc96-824c-418b-9f88-866d32f6b7ea"
--- PASS: TestBackendExtraPaths (2.07s)
backend_test.go:253: creating OSS bucket terraform-remote-oss-test-5a339f81 in http://oss-cn-beijing.aliyuncs.com
PASS
ok github.com/hashicorp/terraform/backend/remote-state/oss 12.120s
TF_ACC=1 go test -v ./backend/remote-state/oss/ -run=TestRemoteClient -timeout 120m
=== RUN TestRemoteClient_impl
--- PASS: TestRemoteClient_impl (0.00s)
=== RUN TestRemoteClient
2017/12/15 18:04:32 [DEBUG] Instantiate OSS client using endpoint: "oss-cn-beijing.aliyuncs.com"
(some logs)
2017/12/15 18:04:33 State testState has no data.
--- PASS: TestRemoteClient (1.44s)
backend_test.go:253: creating OSS bucket terraform-remote-oss-test-5a339e30 in http://oss-cn-beijing.aliyuncs.com
=== RUN TestRemoteClientLocks
2017/12/15 18:04:33 [DEBUG] Instantiate OSS client using endpoint: "oss-cn-beijing.aliyuncs.com"
(some logs)
2017/12/15 18:04:36 Delete Object testState.tflock successfully.
--- PASS: TestRemoteClientLocks (3.09s)
backend_test.go:253: creating OSS bucket terraform-remote-oss-test-5a339e31 in http://oss-cn-beijing.aliyuncs.com
PASS
ok github.com/hashicorp/terraform/backend/remote-state/oss 4.566s