Add sanitise-artifact-names action
Sanitise filenames for GitHub Actions artifacts.
markgoddard committed Sep 4, 2024
1 parent c97c844 commit e35e4eb
Showing 4 changed files with 115 additions and 1 deletion.
10 changes: 9 additions & 1 deletion README.md
@@ -7,7 +7,7 @@ Reusable GitHub workflows and actions for StackHPC OpenStack.
The following reusable workflows are provided in the `.github/workflows/`
directory.

-## `multinode.yml`
+### `multinode.yml`

The `multinode.yml` workflow can be used to create a multinode test cluster and
run tests and/or operations against it.
@@ -17,3 +17,11 @@ Features:
* Inject an SSH key to access the cluster
* Break (pause) the workflow on failure
* Upgrade from one OpenStack release to another

## Actions

The following actions are provided in the top-level directory.

### `sanitise-artifact-filenames`

Sanitise filenames for GitHub Actions artifacts.
26 changes: 26 additions & 0 deletions sanitise-artifact-filenames/README.md
@@ -0,0 +1,26 @@
# Sanitise Artifact Filenames

This action sanitises directory and file names for GitHub Actions artifacts.
The upload-artifact action rejects paths that contain certain characters and
fails with an error such as:

> Error: The path for one of the files in artifact is not valid:
> /tempest-artifacts.2024-08-29T18:18+00:00/docker.log. Contains the following
> character: Colon :
>
> Invalid characters include: Double quote ", Colon :, Less than <, Greater than
> >, Vertical bar |, Asterisk *, Question mark ?, Carriage return \r, Line feed
> \n
>
> The following characters are not allowed in files that are uploaded due to
> limitations with certain file systems such as NTFS. To maintain file system
> agnostic behavior, these characters are intentionally not allowed to prevent
> potential problems with downloads on different file systems.
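
The rule quoted above can be checked locally before uploading. The following is a minimal sketch and not part of this action: the helper name `find_invalid_characters` is hypothetical, and the character set is copied from the error message rather than from the upload-artifact source.

```python
# Characters rejected by upload-artifact, as listed in the error message above.
DISALLOWED = "\":<>|*?\r\n"


def find_invalid_characters(name: str) -> list:
    """Return the disallowed characters present in name, in order of first appearance."""
    found = []
    for char in name:
        if char in DISALLOWED and char not in found:
            found.append(char)
    return found


# The directory name from the example error contains colons.
print(find_invalid_characters("tempest-artifacts.2024-08-29T18:18+00:00"))  # [':']
# A plain file name passes.
print(find_invalid_characters("docker.log"))  # []
```

An empty result means the name should be accepted as far as this character list is concerned.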

## Usage

```yaml
- name: Sanitise filenames for artifacts
  uses: stackhpc/stackhpc-openstack-gh-workflows/sanitise-artifact-filenames@main
  with:
    path: path/to/artifact/
```
14 changes: 14 additions & 0 deletions sanitise-artifact-filenames/action.yml
@@ -0,0 +1,14 @@
---
name: Sanitise filenames for GitHub Actions artifacts
description: >
  Renames files and directories to be accepted by the GitHub Actions
  upload-artifact action.
inputs:
  path:
    description: The directory containing files to be sanitised
    required: true
runs:
  using: composite
  steps:
    - name: Sanitise filenames for GitHub Actions artifacts
      # Composite run steps must declare a shell, and the script lives in the
      # action's own directory rather than the caller's working directory.
      shell: bash
      run: python3 "${{ github.action_path }}/sanitise-artifact-filenames.py" "${{ inputs.path }}"
66 changes: 66 additions & 0 deletions sanitise-artifact-filenames/sanitise-artifact-filenames.py
@@ -0,0 +1,66 @@
#!/usr/bin/python3

r"""
This script sanitises directory and file names for GitHub Actions artifacts.

The upload-artifact action rejects paths that contain certain characters and
fails with an error such as:

    Error: The path for one of the files in artifact is not valid:
    /tempest-artifacts.2024-08-29T18:18+00:00/docker.log. Contains the following
    character: Colon :

    Invalid characters include: Double quote ", Colon :, Less than <, Greater than
    >, Vertical bar |, Asterisk *, Question mark ?, Carriage return \r, Line feed
    \n

    The following characters are not allowed in files that are uploaded due to
    limitations with certain file systems such as NTFS. To maintain file system
    agnostic behavior, these characters are intentionally not allowed to prevent
    potential problems with downloads on different file systems.
"""

import os
import sys
import typing as t


def main() -> None:
    if len(sys.argv) != 2:
        usage()
        sys.exit(1)

    sanitise(sys.argv[1])


def usage() -> None:
    print(f"Usage: {sys.argv[0]} <path>")


def sanitise(path: str) -> None:
    # Recursively walk a directory, sanitising subdirectories and files as we go.
    # Walk bottom-up to avoid directory renames breaking subsequent paths.
    table = translation_table()
    for dirpath, dirnames, filenames in os.walk(path, topdown=False, followlinks=False):
        for filename in filenames:
            sanitise_file_or_dir(filename, table, dirpath)
        for dirname in dirnames:
            sanitise_file_or_dir(dirname, table, dirpath)


def translation_table() -> t.Dict:
    # Return a translation table that translates all disallowed characters to a dash.
    disallowed = "\":<>|*?\r\n"
    return str.maketrans(disallowed, "-" * len(disallowed))


def sanitise_file_or_dir(path: str, table: t.Dict, dirpath: str) -> None:
    # Sanitise a single file or directory.
    sanitised = path.translate(table)
    if path != sanitised:
        print(f"Sanitising {path} as {sanitised} in {dirpath}")
        # Rename within the same parent directory.
        src = os.path.join(dirpath, path)
        dst = os.path.join(dirpath, sanitised)
        os.rename(src, dst)


if __name__ == "__main__":
    main()
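
The bottom-up walk and the translation table can be exercised on a throwaway directory. The following is a sketch under stated assumptions, not the script itself: `sanitise_tree` is a hypothetical condensed version of the logic above, it assumes a POSIX filesystem where colons are legal in names, and the collision check is an addition that the committed script does not have. On POSIX, `os.rename` silently replaces an existing file at the destination, so two names that sanitise to the same result (for example `a:b` and `a?b`) could otherwise overwrite one another.

```python
import os
import tempfile

DISALLOWED = "\":<>|*?\r\n"
TABLE = str.maketrans(DISALLOWED, "-" * len(DISALLOWED))


def sanitise_tree(root: str) -> None:
    """Rename entries under root bottom-up, replacing disallowed characters with dashes."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            sanitised = name.translate(TABLE)
            if sanitised == name:
                continue
            target = os.path.join(dirpath, sanitised)
            if os.path.lexists(target):
                # Hypothetical safeguard, not in the committed script:
                # refuse to overwrite an existing entry.
                raise FileExistsError(target)
            os.rename(os.path.join(dirpath, name), target)


with tempfile.TemporaryDirectory() as root:
    # Build a small tree whose directory and file names both contain colons.
    nested = os.path.join(root, "tempest-artifacts.2024-08-29T18:18+00:00")
    os.makedirs(nested)
    open(os.path.join(nested, "docker:1.log"), "w").close()

    sanitise_tree(root)

    # Collect the resulting file paths relative to the temporary root.
    result = sorted(
        os.path.relpath(os.path.join(dirpath, name), root)
        for dirpath, _, filenames in os.walk(root)
        for name in filenames
    )
    print(result)
```

Because the walk is bottom-up, the file is renamed while its parent directory still has its original name, and the directory is renamed only afterwards.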
