Clear the data/log for jobs in Gitea Actions that are too old #24256

Open
wolfogre opened this issue Apr 21, 2023 · 13 comments
Labels
topic/gitea-actions related to the actions of Gitea type/feature Completely new functionality. Can only be merged if feature freeze is not active. type/proposal The new feature has not been accepted yet but needs to be discussed first.

Comments

@wolfogre
Member

wolfogre commented Apr 21, 2023

Feature Description

Jobs that finished long ago no longer need their data and logs retained, so these should be cleaned up regularly.

Details

#26275 (comment) : Some logic which could help:

  • What you want to delete is a run, so you need to delete a row of ActionRun.
  • A run could include multiple jobs, so you need to delete multiple rows of ActionRunJob.
  • When a job has been picked up by a runner to execute, there will be a task. However, a job could be rerun many times, so you need to delete multiple rows of ActionTask.
  • Every task could include
    • multiple steps, so ActionTaskStep
    • multiple outputs, so ActionTaskOutput
  • The log of the task is stored as a file (in DBFS or storage), so you need to delete it according to ActionTask.LogFilename or ActionTask.LogInStorage.
  • In addition, there may be some artifacts uploaded to the run, so you need to delete multiple rows of ActionArtifact.
    • The files of artifacts are stored in storage, so you need to remove them.
  • Finally, update NumActionRuns and NumClosedActionRuns of Repository.
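
The deletion order in the list above can be sketched as follows. This is only a minimal in-memory illustration: the struct names mirror the tables mentioned above, but the fields, the `Store` type, and the `DeleteRun` method are hypothetical and are not Gitea's actual models or API.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the tables named in the list above.
type ActionRunJob struct{ ID, RunID int64 }

type ActionTask struct {
	ID, JobID   int64
	LogFilename string
}

type ActionArtifact struct {
	ID, RunID   int64
	StoragePath string
}

// Store is an in-memory stand-in for the database plus file storage.
type Store struct {
	Runs      map[int64]bool // run ID -> whether the run is closed
	Jobs      []ActionRunJob
	Tasks     []ActionTask
	Steps     map[int64]int // task ID -> number of ActionTaskStep rows
	Outputs   map[int64]int // task ID -> number of ActionTaskOutput rows
	Artifacts []ActionArtifact
	Files     map[string]bool // log and artifact files that exist in storage

	NumActionRuns, NumClosedActionRuns int // counters kept on the repository
}

// DeleteRun removes a run and everything that hangs off it, following the
// order of the list above.
func (s *Store) DeleteRun(runID int64) error {
	closed, ok := s.Runs[runID]
	if !ok {
		return fmt.Errorf("run %d not found", runID)
	}
	// Jobs of the run.
	jobIDs := make(map[int64]bool)
	var keptJobs []ActionRunJob
	for _, j := range s.Jobs {
		if j.RunID == runID {
			jobIDs[j.ID] = true
		} else {
			keptJobs = append(keptJobs, j)
		}
	}
	// Tasks of those jobs, with their steps, outputs and log files.
	var keptTasks []ActionTask
	for _, t := range s.Tasks {
		if !jobIDs[t.JobID] {
			keptTasks = append(keptTasks, t)
			continue
		}
		delete(s.Steps, t.ID)
		delete(s.Outputs, t.ID)
		delete(s.Files, t.LogFilename)
	}
	// Artifacts of the run and their files.
	var keptArtifacts []ActionArtifact
	for _, a := range s.Artifacts {
		if a.RunID != runID {
			keptArtifacts = append(keptArtifacts, a)
			continue
		}
		delete(s.Files, a.StoragePath)
	}
	s.Jobs, s.Tasks, s.Artifacts = keptJobs, keptTasks, keptArtifacts
	// The run itself, then the repository counters.
	delete(s.Runs, runID)
	s.NumActionRuns--
	if closed {
		s.NumClosedActionRuns--
	}
	return nil
}

func main() {
	s := &Store{
		Runs:          map[int64]bool{1: true},
		Jobs:          []ActionRunJob{{ID: 10, RunID: 1}},
		Tasks:         []ActionTask{{ID: 100, JobID: 10, LogFilename: "logs/100.log"}},
		Steps:         map[int64]int{100: 3},
		Outputs:       map[int64]int{100: 1},
		Artifacts:     []ActionArtifact{{ID: 1, RunID: 1, StoragePath: "artifacts/1.zip"}},
		Files:         map[string]bool{"logs/100.log": true, "artifacts/1.zip": true},
		NumActionRuns: 1, NumClosedActionRuns: 1,
	}
	if err := s.DeleteRun(1); err != nil {
		fmt.Println("delete failed:", err)
		return
	}
	fmt.Println(len(s.Jobs), len(s.Tasks), len(s.Artifacts), len(s.Files), s.NumActionRuns)
}
```

In a real implementation the row deletions would presumably run inside one database transaction, with file removal handled separately, since storage cannot be rolled back.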
@wolfogre wolfogre added type/proposal The new feature has not been accepted yet but needs to be discussed first. type/feature Completely new functionality. Can only be merged if feature freeze is not active. topic/gitea-actions related to the actions of Gitea labels Apr 21, 2023
@yp05327
Contributor

yp05327 commented Apr 25, 2023

Maybe a related question?
To save storage, is it worth compressing this data/log?

@ghnp5

ghnp5 commented Jul 13, 2023

I was now looking for this 😊
Thanks!!

@lunny
Member

lunny commented Jul 13, 2023

Maybe we can compress the logs as zip, so that when they are displayed via HTTP, they can be unzipped by the web browser.

@ghnp5

ghnp5 commented Jul 13, 2023

Maybe, but I think the most important thing is to be able to purge too, with some retention setting (e.g. older than 1 month, keep last 3, etc.), so that the storage doesn't grow forever.
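
A retention rule like the one suggested above could be sketched as below. This is a hypothetical illustration, not an existing Gitea setting: the `Run` type, `SelectExpired`, `demoRuns`, and the choice to combine "older than" with "keep last N" in one rule are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// month approximates "1 month" as 30 days for this sketch.
const month = 30 * 24 * time.Hour

// Run is a hypothetical, minimal view of a finished workflow run.
type Run struct {
	ID       int64
	Finished time.Time
}

// SelectExpired returns the runs that may be purged: those that finished
// more than maxAge before now, except the keepLast most recent runs,
// which are always retained. The result is ordered newest first.
func SelectExpired(runs []Run, now time.Time, maxAge time.Duration, keepLast int) []Run {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]Run(nil), runs...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Finished.After(sorted[j].Finished)
	})
	cutoff := now.Add(-maxAge)
	var expired []Run
	for i, r := range sorted {
		if i < keepLast {
			continue // always keep the most recent runs
		}
		if r.Finished.Before(cutoff) {
			expired = append(expired, r)
		}
	}
	return expired
}

// demoRuns builds six runs that finished 1, 10, 40, 50, 60 and 70 days
// before a fixed reference time.
func demoRuns() ([]Run, time.Time) {
	now := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	var runs []Run
	for i, days := range []int{1, 10, 40, 50, 60, 70} {
		age := time.Duration(days) * 24 * time.Hour
		runs = append(runs, Run{ID: int64(i + 1), Finished: now.Add(-age)})
	}
	return runs, now
}

func main() {
	runs, now := demoRuns()
	for _, r := range SelectExpired(runs, now, month, 3) {
		fmt.Println("would purge run", r.ID)
	}
}
```

A per-repository interval, as requested later in this thread, would only change where `maxAge` and `keepLast` come from.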

@nicorac

nicorac commented Jul 20, 2023

+1 for purge ability; this is my use case:

I'm managing an internal Gitea server, and I'm going to make Actions available to my co-worker devs.

Since there's no other way to test push Actions than... pushing, I've played with them by "trial and error" 😉.
My tests caused the repo to actually contain 150+ completely useless job executions, so I'd like to purge them all before going to "production".

To make it worse, I suppose that once Actions are made available to everyone on my server, my co-workers will also start testing them the same way... with the same result.

And finally, once in production, there's no need for us to keep any repo's Actions logs longer than 2 weeks.

@nodiscc
Contributor

nodiscc commented Nov 21, 2023

Related #26219 (Delete workflow runs) (manually)

@diamante0018

About the artifacts: would it be possible to add a little rubbish icon next to the file name so they can be deleted manually? (I'm talking about artifacts created by actions and uploaded via actions/upload-artifact.)

@bilogic
Contributor

bilogic commented Mar 24, 2024

For me, 7 days is more than enough

@f1mishutka

> For me, 7 days is more than enough

It would be much better to have this interval configurable per repository.

@bilogic
Contributor

bilogic commented Mar 24, 2024

image

After deleting the old action logs, the spinners don't stop and there are no messages to indicate the log no longer exists.

@diamante0018

> After deleting the old action logs, the spinners don't stop and there are no messages to indicate the log no longer exists.

This happened to me as well when the runner died because it could not download an action.
So the wheels keep spinning forever.

@wolfogre
Member Author

> After deleting the old action logs, the spinners don't stop and there are no messages to indicate the log no longer exists.

@bilogic How did you clear the old action logs? Did you just remove them from disk/storage? If so, I would say it's safe to do that to free up space. However, the old jobs cannot be displayed properly in the UI, since there's no code to handle the inconsistency.

@bilogic
Contributor

bilogic commented Mar 25, 2024

@wolfogre Yes, I just removed them from storage, but only because I had a backup. And yes, my image was meant to illustrate the need to report the situation when the logs have been deleted.

lunny pushed a commit that referenced this issue Aug 2, 2024
Part of #24256.

Clear up old action logs to free up storage space.

Users will see a message indicating that the log has been cleared if
they view old tasks.

<img width="1361" alt="image"
src="https://github.com/user-attachments/assets/9f0f3a3a-bc5a-402f-90ca-49282d196c22">

Docs: https://gitea.com/gitea/docs/pulls/40

---------

Co-authored-by: silverwind <[email protected]>
wolfogre added a commit that referenced this issue Aug 9, 2024
Support compression for Actions logs to save storage space and
bandwidth. Inspired by
#24256 (comment)

The biggest challenge is that the compression format should support
[seekable](https://github.com/facebook/zstd/blob/dev/contrib/seekable_format/zstd_seekable_compression_format.md).
So when users are viewing a part of the log lines, Gitea doesn't need to
download the whole compressed file and decompress it.

That means gzip cannot help here. I did some research, and there aren't
many choices (bgzip and xz, for example), but I think zstd is the most popular
one. It has an implementation in Golang with
[zstd](https://github.com/klauspost/compress/tree/master/zstd) and
[zstd-seekable-format-go](https://github.com/SaveTheRbtz/zstd-seekable-format-go),
and what is better is that it has good compatibility: a seekable format
zstd file can be read by a regular zstd reader.
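
The idea behind a seekable format can be sketched with the Go standard library alone. Note the swap: this example uses independently compressed `compress/flate` chunks plus an offset index, which is the same general idea used by bgzip and the zstd seekable format, but it is not the zstd seekable format and not the API of the packages linked above. The names `chunkIndex`, `compressChunks`, and `readChunk` are hypothetical.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

// chunkIndex records where one independently compressed chunk lives in
// the blob. The slice of these entries plays the role of a seek table.
type chunkIndex struct {
	offset, length int
}

// compressChunks compresses every chunk as its own flate stream and
// returns the concatenated blob together with the index.
func compressChunks(chunks [][]byte) ([]byte, []chunkIndex, error) {
	var blob bytes.Buffer
	index := make([]chunkIndex, 0, len(chunks))
	for _, c := range chunks {
		start := blob.Len()
		w, err := flate.NewWriter(&blob, flate.DefaultCompression)
		if err != nil {
			return nil, nil, err
		}
		if _, err := w.Write(c); err != nil {
			return nil, nil, err
		}
		if err := w.Close(); err != nil { // Close flushes the final block
			return nil, nil, err
		}
		index = append(index, chunkIndex{offset: start, length: blob.Len() - start})
	}
	return blob.Bytes(), index, nil
}

// readChunk decompresses only chunk i; no other part of the blob is read.
func readChunk(blob []byte, index []chunkIndex, i int) ([]byte, error) {
	if i < 0 || i >= len(index) {
		return nil, fmt.Errorf("chunk %d out of range", i)
	}
	e := index[i]
	r := flate.NewReader(bytes.NewReader(blob[e.offset : e.offset+e.length]))
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	chunks := [][]byte{
		[]byte("step 1: checkout\n"),
		[]byte("step 2: build\n"),
		[]byte("step 3: test\n"),
	}
	blob, index, err := compressChunks(chunks)
	if err != nil {
		fmt.Println("compress failed:", err)
		return
	}
	part, err := readChunk(blob, index, 1)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("%s", part)
}
```

With a remote storage backend, the byte range given by the index entry is all that would need to be fetched, which is the property the commit message asks for.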

This PR introduces a new package `zstd` to combine and wrap the two
packages, to provide a unified and easy-to-use API.

A new setting `LOG_COMPRESSION` is added to the config. Although I
don't see any reason not to use compression, I think it's a good
idea to keep the default as `none` to be consistent with old versions.

`LOG_COMPRESSION` takes effect only for new log files. It adds `.zst` as
an extension to the file name, so Gitea can determine if it needs
decompression according to the file name when reading. Old files will
keep the format since it's not worth converting them, as they will be
cleared after #31735.
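
The file-name convention described above can be sketched like this. It is a minimal illustration, not the code from the PR: `storedLogName` and `needsDecompression` are hypothetical helpers, and the setting value `"zstd"` is an assumption (only `none` is named above).

```go
package main

import (
	"fmt"
	"strings"
)

// zstdExt is the extension added to compressed log files.
const zstdExt = ".zst"

// storedLogName returns the name a new log file would be stored under for
// a given compression setting. The value "zstd" is assumed here; any other
// value, including "none", leaves the name unchanged.
func storedLogName(name, compression string) string {
	if compression == "zstd" {
		return name + zstdExt
	}
	return name
}

// needsDecompression decides from the file name alone whether a log must
// be decompressed when read, so old uncompressed files keep working.
func needsDecompression(name string) bool {
	return strings.HasSuffix(name, zstdExt)
}

func main() {
	for _, setting := range []string{"none", "zstd"} {
		name := storedLogName("task-42.log", setting)
		fmt.Println(setting, name, needsDecompression(name))
	}
}
```

Because the decision depends only on the stored name, changing the setting later does not affect how existing files are read.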

<img width="541" alt="image"
src="https://github.com/user-attachments/assets/e9598764-a4e0-4b68-8c2b-f769265183c9">
DennisRasey pushed a commit to DennisRasey/forgejo that referenced this issue Aug 13, 2024
(cherry picked from commit 33cc5837a655ad544b936d4d040ca36d74092588)

Conflicts:
	assets/go-licenses.json
	go.mod
	go.sum
  resolved with make tidy

9 participants