-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Base64 encode the file names in the 'file' physical backend #2203
Conversation
I think in case of |
c576fce
to
080bc92
Compare
I see a race-y situation in |
|
||
// For backwards compatibility, try to delete the file without base64 URL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think probably this should actually be the other way around; we should default to doing the faster thing for new installations and fall back if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done here and for other places as well.
basePath, fileName := b.path(k) | ||
fullPath := filepath.Join(basePath, "_"+fileName) | ||
|
||
// If non-encoded file name is a valid storage entry, read it out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment (and in the other functions)
// Otherwise, encode the file name and look for the storage entry. | ||
f, err := os.Open(fullPath) | ||
if err != nil && os.IsNotExist(err) { | ||
fullPath = filepath.Join(basePath, "_"+base64.URLEncoding.EncodeToString([]byte(fileName))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than do the munging with base64-encoding and _
everywhere, I'd make the path
function return three values instead of two: the two it originally returned, and a third value with _
plus the base64 path. This removes a lot of boilerplate elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. Done.
// New storage entries will have their file names base64 URL encoded. If a | ||
// file with a non-encoded file name exists, it indicates that this is an | ||
// update operation. To avoid duplication of storage entries, delete the | ||
// old entry first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the old entry shouldn't be deleted first; if there is an error writing the new entry then you've just had data loss.
If we prefer the base64'd version, having an old version lying around isn't really a problem unless you delete it and the old version isn't deleted.
My suggestion is that this deletion should actually be in a deferred function that is invoked iff the stat against a previous non-base64'd value is successful. This would also have to take place with the lock, so it would need to be after the lock acquisition (and also, defers are LIFO so the defer statement itself must be after the deferred unlock).
In Delete, you could help avoid any issues by also stating the non-base64 version, and deleting whichever version(s) are found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done for most of it. I am not sure about the advantages of stating both versions and deleting it. What could go wrong if we try to delete the encoded one and if that fails, try deleting the old one? Is it that we'll not be able to detect the failure of deletion properly?
080bc92
to
b6d32e5
Compare
I restored the branch so it can be used as the base for a new set of changes as we needed to roll this back. |
Fixes https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/vault-tool/D2GStAl1f1k/AoURH_uUEAAJ in a generic manner.
This will deprecate #2198.
This PR will base64 URL encodes the file name part for all the paths stored at the
file
backend and decodes when appropriate with backwards compatibility. This will ensure that file names are compatible with Windows platform.Note: However, if Windows disallowed characters are present as part of the
basePath
, then it will not work. If we choose to encode the entire path, the directory hierarchy will flatten, which may not be desired.Tested on Windows manually to verify the reported bug in PKI.