
Write secret files atomically #440

Merged
merged 1 commit into from
Mar 2, 2022

Conversation

szh
Contributor

@szh szh commented Feb 16, 2022

Desired Outcome

This functionality is described here:

https://github.com/cyberark/secrets-provider-for-k8s/blob/secrets-rotation-design/design/secret_rotation_design.md#atomic-file-writes-for-push-to-file-mode

Implemented Changes

Files are written to a temp file first. If that succeeds, the temp file is moved to the destination in an atomic operation.

Connected Issue/Story

CyberArk internal issue link: ONYX-16710

Definition of Done

Changelog

  • The CHANGELOG has been updated, or
  • This PR does not include user-facing changes and doesn't require a
    CHANGELOG update

Test coverage

  • This PR includes new unit and integration tests to go with the code
    changes, or
  • The changes in this PR do not require tests

Documentation

  • Docs (e.g. READMEs) were updated in this PR
  • A follow-up issue to update official docs has been filed here: insert issue ID
  • This PR does not require updating any documentation

Behavior

  • This PR changes product behavior and has been reviewed by a PO, or
  • These changes are part of a larger initiative that will be reviewed later, or
  • No behavior was changed with this PR

Security

  • Security architect has reviewed the changes in this PR,
  • These changes are part of a larger initiative with a separate security review, or
  • There are no security aspects to these changes

@szh szh self-assigned this Feb 16, 2022
@szh szh force-pushed the atomic-file-writes branch 2 times, most recently from 36ddae1 to 3c2dd02 Compare February 16, 2022 21:15
@szh szh added the enhancement New feature or request label Feb 17, 2022
Contributor

@doodlesbykumbi doodlesbykumbi left a comment


Aside from the comment, I was wondering about testing and so did some reading on why we might need atomic writes. I'm still not sure how we can simulate a write that gets messed up mid-way. The only thing that comes to mind is using a buffer to write then having the buffer error midway... still thinking about this one.

NOTICES.txt (outdated review thread, resolved)
@szh szh mentioned this pull request Feb 17, 2022
@szh szh force-pushed the atomic-file-writes branch from 9ac63ec to e2c2ff1 Compare February 22, 2022 22:05
@szh szh force-pushed the atomic-file-writes branch 4 times, most recently from 960122c to 4f57b0a Compare February 23, 2022 15:00
@szh szh marked this pull request as ready for review February 23, 2022 15:02
@szh szh requested a review from a team as a code owner February 23, 2022 15:02
@szh szh force-pushed the atomic-file-writes branch 5 times, most recently from f189415 to a68a5e4 Compare February 23, 2022 18:00
pkg/atomicwriter/atomic_writer.go (outdated review thread, resolved)
Comment on lines 53 to 76
defer func() {
go w.Cleanup()
}()
Contributor


  1. Why does this need retry logic?
  2. Deferring a call to a goroutine feels like trouble

Contributor Author


  1. See this thread
  2. Why? Is there a preferred way to do this? My assumption is that a regular deferred function call will block execution, which is a problem if we're doing timed retries.

Contributor


  1. I'm not sure about the appropriateness of retry here. I think this is an instance of an application assuming the reliability responsibilities of something like the OS/file system, and we ought to be able to safely assume reliability here. I understand that deletion might fail, but a single attempt plus error logging should be enough. An alternative might be to attempt the delete and, if it doesn't work, truncate the file to 0 bytes, which is highly likely to succeed if done after successful writes.
  2. It's sort of a code smell, at least in my eyes, to defer the creation of goroutines that aren't tracked by anything. An alternative, if we really decide to keep an eye on those tmp files and make sure they are cleaned up, might be to have some sort of temp file manager that keeps track of a collection (maybe a queue) of temp files and, in a single background goroutine shared between all the files, carries out the regular task of cleaning them up.

Contributor Author


@andytinkham What do you think?

Contributor


It feels like truncate to 0 solves the problem of possibly leaking secrets which is my main concern. If that's easier, I'm ok with it. For #2, I don't think I have any specific thoughts - seems OK.

Contributor Author

@szh szh Feb 24, 2022


OK. So instead of the whole retry logic, we just

  1. Attempt to delete the file
  2. If that fails, attempt to truncate it (and maybe log it)
  3. If that fails, log the error and exit

Is that right?

Contributor

@diverdane diverdane left a comment


@szh I've added some comments re. one way to mock the os.Remove function for testing to your query on the about-golang Teams channel (along with the os.Chmod and os.Rename functions).

@szh szh force-pushed the atomic-file-writes branch from 8a95610 to f493ff3 Compare February 24, 2022 14:25
@szh szh requested a review from andytinkham February 24, 2022 14:25
Contributor Author

szh commented Feb 24, 2022

@szh I've added some comments re. one way to mock the os.Remove function for testing to your query on the about-golang Teams channel (along with the os.Chmod and os.Rename functions).

Thanks, @diverdane! I've added some unit tests using the ideas you posted. That brought code coverage up into the 90s.

@szh szh force-pushed the atomic-file-writes branch from f493ff3 to 5cd38c3 Compare February 24, 2022 14:30
@szh szh force-pushed the atomic-file-writes branch 2 times, most recently from a4864f9 to 8b68411 Compare February 25, 2022 14:45
@szh szh removed the enhancement New feature or request label Feb 25, 2022
@szh szh requested a review from a team February 28, 2022 14:25
Contributor

@diverdane diverdane left a comment


Looks terrific!
One minor comment/question, otherwise good-to-go!!!

pkg/atomicwriter/atomic_writer_test.go (review thread resolved)
@szh szh force-pushed the atomic-file-writes branch from 8b68411 to 20ab1b3 Compare February 28, 2022 16:29
Contributor Author

szh commented Feb 28, 2022

Requesting security review from @andytinkham

Contributor

@diverdane diverdane left a comment


LGTM!!!

Contributor

@andytinkham andytinkham left a comment


LGTM!

@szh szh force-pushed the atomic-file-writes branch 2 times, most recently from 56105e8 to 311c61f Compare March 1, 2022 17:24
@szh szh force-pushed the atomic-file-writes branch from 311c61f to ed7fb5f Compare March 2, 2022 14:16

codeclimate bot commented Mar 2, 2022

Code Climate has analyzed commit ed7fb5f and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 92.1% (50% is the threshold).

This pull request will bring the total coverage in the repository to 93.2% (0.1% change).

View more on Code Climate.

Contributor

@doodlesbykumbi doodlesbykumbi left a comment


Left a couple of comments. Otherwise the PR is looking good.

Comment on lines +246 to +255
func mockErrorLog() *bytes.Buffer {
buf := &bytes.Buffer{}
log.ErrorLogger.SetOutput(buf)
return buf
}

// Unmocks the logger output, sending it back to os.Stderr
func unmockErrorLog() {
log.ErrorLogger.SetOutput(os.Stderr)
}
Contributor


In this case it doesn't seem that we have to worry about running tests in parallel, but if we did, I suspect the mocking and un-mocking might break down. To make testing the logs simpler I would have dependency-injected the error logger function, much like you did the os funcs, since the atomic writer only uses log.Error.

@@ -143,7 +144,7 @@ func TestSecretProvider(t *testing.T) {
description: "sidecar container, happy path",
mode: "sidecar",
interval: time.Duration(10) * time.Millisecond,
testTime: time.Duration(65) * time.Millisecond,
testTime: time.Duration(65+atomicWriteDelayMsecs) * time.Millisecond,
Contributor


What's all this about? (Just curious)

Contributor Author


In the case where we're doing atomic writes, the extra write adds a very small delay, which throws off the retry-timing tests (though only on Jenkins, not when running locally). This fixes that.
