feat(vacuum): add vacuum management for cleaning expired packages #3442

Open
wants to merge 73 commits into
base: main

@NikitaCOEUR NikitaCOEUR commented Jan 7, 2025

Check List

Introduce Vacuum Feature for Expired Packages

Addresses #2942 and Discussion #3086.

This pull request introduces a new feature for vacuuming expired packages.

Workflow

  1. Environment Variable Requirement
    The feature activates only when the environment variable AQUA_VACUUM_DAYS is set to a positive integer.

  2. Implementation in InstallPackage

    • During the execution of commands through exec or install (but not aqua update or aqua cp, which currently use provideNilVacuumController in their controller initialization), package-related information is added or updated in a BoltDB (vacuum.db) located in ROOT_DIR.
    • The stored information includes:
      • Package Type
      • Package Name
      • Version
      • Timestamp (LastUsage)
      • PkgPaths
  3. Expiration Criteria

    • Packages that are not used do not have their timestamp updated.
    • A package is considered expired if its last usage exceeds the number of days defined in AQUA_VACUUM_DAYS. Expired packages become eligible for "vacuum."
  4. Execution of aqua vacuum Command

    • The vacuum command performs the following operations:
      • Deletes the package and its version from ROOT_DIR/pkgs/type.../pkg/version.
      • Removes the corresponding entry from the BoltDB.
  5. Optional - Not Implemented: Automatic Vacuum Execution

  • Currently, the vacuum process must be executed manually. However, it is possible to implement a step during the aqua install command to trigger vacuum expiration automatically, enabling "auto-cleaning."

Visualizing Managed Packages

The aqua vacuum command includes two flags, -list and -expired:

  • -list lists all packages.
  • -expired lists only expired packages.

These flags enable the use of a fuzzy finder in the console to explore the vacuum.db database and retrieve package-related information:
Fuzzy Finder Example

Coverage

cmdx c pkg/controller/vacuum
+ bash scripts/coverage.sh pkg/controller/vacuum
ok      github.com/aquaproj/aqua/v2/pkg/controller/vacuum       9.178s  coverage: 80.6% of statements

The test run shows high latency because some tests exercise failing access to vacuum.db.

Performance Considerations

Significant effort has been made to minimize the overhead introduced by this feature on Aqua commands. Benchmark tests in the exec controller attest to this.

goos: linux
goarch: amd64
pkg: github.com/aquaproj/aqua/v2/pkg/controller/exec
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
Benchmark_controller_Exec/normal-20         	    2608	    482220 ns/op	  706238 B/op	     732 allocs/op
Benchmark_controller_Exec/vacuumEnabled-20  	    2740	    462216 ns/op	  706194 B/op	     731 allocs/op
PASS
ok  	github.com/aquaproj/aqua/v2/pkg/controller/exec	3.639s

Three methods for adding packages to the database were implemented (storePackage, storesPackages, and asyncStorePackage). However, the asynchronous method (asyncStorePackage) shows the least overhead and is preferred. I think I'll remove the others.

Benchmark with adding only one package
=== RUN   BenchmarkVacuum_OnlyOneStorePackage
BenchmarkVacuum_OnlyOneStorePackage
=== RUN   BenchmarkVacuum_OnlyOneStorePackage/Sync
BenchmarkVacuum_OnlyOneStorePackage/Sync
BenchmarkVacuum_OnlyOneStorePackage/Sync-20                  398           3323185 ns/op           26262 B/op        120 allocs/op
=== RUN   BenchmarkVacuum_OnlyOneStorePackage/SyncMultipleSameTime
BenchmarkVacuum_OnlyOneStorePackage/SyncMultipleSameTime
BenchmarkVacuum_OnlyOneStorePackage/SyncMultipleSameTime-20                  375           3227867 ns/op           26312 B/op         120 allocs/op
=== RUN   BenchmarkVacuum_OnlyOneStorePackage/Async
BenchmarkVacuum_OnlyOneStorePackage/Async
BenchmarkVacuum_OnlyOneStorePackage/Async-20                              103395             15436 ns/op             360 B/op           7 allocs/op
=== RUN   BenchmarkVacuum_OnlyOneStorePackage/AsyncMultiple
BenchmarkVacuum_OnlyOneStorePackage/AsyncMultiple
BenchmarkVacuum_OnlyOneStorePackage/AsyncMultiple-20                       76678             15373 ns/op             360 B/op           7 allocs/op
PASS
ok      github.com/aquaproj/aqua/v2/pkg/controller/vacuum       6.106s

Benchmark when adding 100 packages at the same time:

=== RUN   BenchmarkVacuum_StorePackages
BenchmarkVacuum_StorePackages
=== RUN   BenchmarkVacuum_StorePackages/Sync
BenchmarkVacuum_StorePackages/Sync
BenchmarkVacuum_StorePackages/Sync-20                  3         348535122 ns/op         3619378 B/op      13535 allocs/op
=== RUN   BenchmarkVacuum_StorePackages/SyncMultipleSameTime
BenchmarkVacuum_StorePackages/SyncMultipleSameTime
BenchmarkVacuum_StorePackages/SyncMultipleSameTime-20                386           4835653 ns/op           62134 B/op         714 allocs/op
=== RUN   BenchmarkVacuum_StorePackages/Async
BenchmarkVacuum_StorePackages/Async
BenchmarkVacuum_StorePackages/Async-20                               254           3973761 ns/op           50124 B/op         750 allocs/op
=== RUN   BenchmarkVacuum_StorePackages/AsyncMultiple
BenchmarkVacuum_StorePackages/AsyncMultiple
BenchmarkVacuum_StorePackages/AsyncMultiple-20                      1050           1135216 ns/op           36244 B/op         602 allocs/op
PASS

The asynchronous method is the best option in all cases, especially when adding one package at a time (the typical use case for Aqua).

Benchmark of the exec command via hyperfine:

❯ AQUA_VACUUM_DAYS=5 hyperfine -N --warmup 3 '/home/nikitac/Github/aqua/dist/aqua exec -- cmdx -v' '/home/nikitac/.local/share/aquaproj-aqua/bin/aqua exec -- cmdx -v'
Benchmark 1: /home/nikitac/Github/aqua/dist/aqua exec -- cmdx -v
  Time (mean ± σ):      47.3 ms ±   2.8 ms    [User: 38.9 ms, System: 10.2 ms]
  Range (min … max):    42.5 ms …  57.9 ms    67 runs
 
Benchmark 2: /home/nikitac/.local/share/aquaproj-aqua/bin/aqua exec -- cmdx -v
  Time (mean ± σ):      46.0 ms ±   3.2 ms    [User: 40.2 ms, System: 7.3 ms]
  Range (min … max):    41.9 ms …  60.2 ms    49 runs
 
Summary
  /home/nikitac/.local/share/aquaproj-aqua/bin/aqua exec -- cmdx -v ran
    1.03 ± 0.09 times faster than /home/nikitac/Github/aqua/dist/aqua exec -- cmdx -v

The main cost of storing packages comes from the PackageInfo.PkgPaths method, but I think it is the best choice because it returns all paths associated with each package. These paths are used to remove the package from the system during the vacuum operation.

@NikitaCOEUR NikitaCOEUR marked this pull request as ready for review January 8, 2025 07:25
@suzuki-shunsuke
Member

Thank you for your contribution!

@suzuki-shunsuke
Member

suzuki-shunsuke commented Jan 8, 2025

Thank you for your great work!
The detailed explanation, high test coverage, and benchmark tests are very helpful.

According to the benchmark test, it looks like the overhead is really small.

Command interface

I don't think --list and --expired options are intuitive.
These options change the behavior completely.
Without these options, the command removes packages, but with these options the command shows the fuzzy finder.
I think it's better to separate the commands.

e.g.

aqua vacuum show [--expired]
aqua vacuum run

The command aqua vacuum run is a bit long, but we don't run this command often, so I think we can accept it.

DB schema

  • key: {pkg.Type},{pkg.Name}@{pkg.Version}
  • value:
{
  "LastUsageTime": "2025 ...",
  "PkgPath": [""]
}

The key is not good because package names can be changed freely.
We should use the package install path as the key instead, because install paths must be unique and do not change even if a package is renamed.

I think PkgPath should be a string, not an array.

I think we can add metadata such as type, name, etc to the value.
Do you have any reason not to add them?
For performance or file size?
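
To make the proposed schema concrete, here is a minimal sketch of an entry keyed by the install path, with the metadata stored in the value. It assumes a JSON-encoded value and uses a plain map as a stand-in for a BoltDB bucket. The entry type, the put and get helpers, the JSON field names, and the example path are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// entry is a hypothetical value layout: metadata plus the last usage time.
type entry struct {
	Type          string    `json:"type"`
	Name          string    `json:"name"`
	Version       string    `json:"version"`
	LastUsageTime time.Time `json:"last_usage_time"`
}

// put stores e under the package install path.
func put(db map[string][]byte, pkgPath string, e entry) error {
	b, err := json.Marshal(e)
	if err != nil {
		return fmt.Errorf("marshal the package entry: %w", err)
	}
	db[pkgPath] = b
	return nil
}

// get loads the entry stored under the package install path.
func get(db map[string][]byte, pkgPath string) (entry, error) {
	var e entry
	b, ok := db[pkgPath]
	if !ok {
		return e, fmt.Errorf("package entry is not found: %s", pkgPath)
	}
	if err := json.Unmarshal(b, &e); err != nil {
		return e, fmt.Errorf("unmarshal the package entry: %w", err)
	}
	return e, nil
}

func main() {
	db := map[string][]byte{} // stand-in for a BoltDB bucket
	pkgPath := "github_release/github.com/cli/cli/v1.0.0" // illustrative path only
	e := entry{
		Type:          "github_release",
		Name:          "cli/cli",
		Version:       "v1.0.0",
		LastUsageTime: time.Date(2025, 1, 8, 0, 0, 0, 0, time.UTC),
	}
	if err := put(db, pkgPath, e); err != nil {
		fmt.Println(err)
		return
	}
	got, err := get(db, pkgPath)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(got.Name, got.Version, got.LastUsageTime.Equal(e.LastUsageTime))
}
```

Because the key is the install path, renaming a package in the registry would not orphan its entry in this layout.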

Code review

  • I think VacuumDays should be an integer, not a pointer.
  • I think the vacuum method should be separated by method, not by mode.
  • The variable name should be plural if it is an array. (e.g. vacuumPkg -> vacuumPkgs)
  • The log field should be snake case. (e.g. PkgPath -> package_path)
    • name -> package_name
    • version -> package_version
  • You should use logerr.WithError when outputting logs
  • failed to is unnecessary in the log message

When we return an error or output an error or warning log, some operation has clearly failed, so failed to adds no information.
In Go, errors are wrapped many times. If every layer adds failed to, the message becomes
failed to ...: failed to ...: failed to ...: ...,
which is noisy.
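
The effect is easy to see with two wrapping layers. The sketch below uses made-up function names and a made-up sentinel error; only fmt.Errorf with %w is the real mechanism.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is an illustrative sentinel error.
var errNotFound = errors.New("file not found")

// Concise style: each layer states only what it was doing.
func openDB() error       { return fmt.Errorf("open vacuum.db: %w", errNotFound) }
func listPackages() error { return fmt.Errorf("list packages: %w", openDB()) }

// Noisy style: each layer repeats "failed to".
func noisyOpenDB() error       { return fmt.Errorf("failed to open vacuum.db: %w", errNotFound) }
func noisyListPackages() error { return fmt.Errorf("failed to list packages: %w", noisyOpenDB()) }

func main() {
	fmt.Println(listPackages())
	// list packages: open vacuum.db: file not found
	fmt.Println(noisyListPackages())
	// failed to list packages: failed to open vacuum.db: file not found
}
```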

  • The log should not start with a capital letter
  • Variable names in functions should not start with a capital letter
    • ConfigPackageToRemove -> configPackageToRemove
  • Other packages should not depend on vacuum.Controller directly. It should depend on an interface.
    • Then we can replace it with a mock when vacuum is disabled.
  • Returning a list of errors is not common and looks strange.
  • errCh looks like a channel, but it is actually an array.
  • The function returns a list of errors, but it only checks the length and does not look at them.
  • If any error occurs, no data is removed from DB. This is not good.
    • Even if any error occurs, packages that can be removed from file system should be removed from DB.

Check

We should confirm that locale handling causes no problem here.

func (vc *Controller) isPackageExpired(pkg *PackageVacuumEntry) bool {
	const secondsInADay = 24 * 60 * 60
	threshold := int64(*vc.Param.VacuumDays) * secondsInADay
	return time.Since(pkg.PackageEntry.LastUsageTime).Seconds() > float64(threshold)
}

@NikitaCOEUR
Author

I’ll revisit my draft and get back to you, taking your feedback into account!

Refactored code to exclusively use `StorePackage` as asynchronous functions. It is important to note that from now on, **you must ensure all tasks are finalized by calling the `Close` functions after invoking `StorePackage`.**

Replaced the generated key with `pkgPath` to ensure uniqueness.

Changed `pkgPath` type from an array of strings to a single string.
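
The Close requirement follows from how an asynchronous store typically works: StorePackage only enqueues, and a background goroutine does the writing. The sketch below is a simplified, hypothetical version of that pattern (the store type and its fields are not the PR's StoreQueue) in which Close drains the queue before returning.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical asynchronous writer with a single worker goroutine.
type store struct {
	queue chan string
	wg    sync.WaitGroup
	saved []string // written only by the worker, read only after Close waits
}

func newStore(buffer int) *store {
	s := &store{queue: make(chan string, buffer)}
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		for pkgPath := range s.queue {
			// A real implementation would write to the database here.
			s.saved = append(s.saved, pkgPath)
		}
	}()
	return s
}

// StorePackage enqueues pkgPath and returns without waiting for the write.
func (s *store) StorePackage(pkgPath string) { s.queue <- pkgPath }

// Close stops accepting work, waits for pending writes, and returns what was saved.
// It must be called exactly once, after the last StorePackage call.
func (s *store) Close() []string {
	close(s.queue)
	s.wg.Wait()
	return s.saved
}

func main() {
	s := newStore(8)
	s.StorePackage("pkgs/a")
	s.StorePackage("pkgs/b")
	fmt.Println(s.Close()) // [pkgs/a pkgs/b]
}
```

If the process exits without calling Close, entries still sitting in the queue are simply lost.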
@NikitaCOEUR
Author

NikitaCOEUR commented Jan 10, 2025

The rewrite is complete; please take a look, @suzuki-shunsuke. I think you'll prefer this implementation.

Command interface

Commands have been reworked:

aqua vacuum run
aqua vacuum list
aqua vacuum list --expired

DB schema

PkgPath is now the key of the package in the vacuum database.

We can add/remove metadata of corresponding value if needed.
Values are, for now :

type Package struct {
	Type    string // Type of package (e.g. "github_release")
	Name    string // Name of package (e.g. "cli/cli")
	Version string // Version of package (e.g. "v1.0.0")
	PkgPath string // Path to the install path without the rootDir/pkgs/ prefix
}

Check

We should confirm that locale handling causes no problem here.

func (vc *Controller) isPackageExpired(pkg *PackageVacuumEntry) bool {
	const secondsInADay = 24 * 60 * 60
	threshold := int64(*vc.Param.VacuumDays) * secondsInADay
	return time.Since(pkg.PackageEntry.LastUsageTime).Seconds() > float64(threshold)
}

Changed to:

func (vc *Controller) isPackageExpired(pkg *PackageVacuumEntry) bool {
	const secondsInADay = 24 * 60 * 60
	threshold := vc.Param.VacuumDays * secondsInADay

	lastUsageTime := pkg.PackageEntry.LastUsageTime
	if lastUsageTime.Location() != time.UTC {
		lastUsageTime = lastUsageTime.In(time.UTC)
	}

	timeSinceLastUsage := time.Since(lastUsageTime).Seconds()
	return timeSinceLastUsage > float64(threshold)
}

Added a conversion to UTC to try to resolve the location issue.

Visualizing Managed Packages

The fuzzy finder looks like this:
(screenshot)

Coverage

❯ cmdx c pkg/controller/vacuum
+ bash scripts/coverage.sh pkg/controller/vacuum
ok      github.com/aquaproj/aqua/v2/pkg/controller/vacuum       15.537s coverage: 80.4% of statements

The test run shows high latency because some tests exercise failing access to vacuum.db.

Performance:

According to the hyperfine results, the rewrite has a similar impact to the previous implementation. However, the Go benchmark shows a significant improvement in the average operation duration and in the number of allocations per operation.

❯ AQUA_VACUUM_DAYS=5 hyperfine -r 100 --warmup 3 '/home/nikitac/Github/aqua/dist/aqua exec -- cmdx -v'   '/home/nikitac/.local/share/aquaproj-aqua/bin/aqua exec -- cmdx -v'
Benchmark 1: /home/nikitac/Github/aqua/dist/aqua exec -- cmdx -v
  Time (mean ± σ):      46.1 ms ±   1.9 ms    [User: 39.3 ms, System: 9.7 ms]
  Range (min … max):    43.0 ms …  53.7 ms    100 runs
 
Benchmark 2: /home/nikitac/.local/share/aquaproj-aqua/bin/aqua exec -- cmdx -v
  Time (mean ± σ):      41.8 ms ±   1.9 ms    [User: 38.7 ms, System: 9.3 ms]
  Range (min … max):    39.2 ms …  52.2 ms    100 runs
 
Summary
  /home/nikitac/.local/share/aquaproj-aqua/bin/aqua exec -- cmdx -v ran
    1.10 ± 0.07 times faster than /home/nikitac/Github/aqua/dist/aqua exec -- cmdx -v

Running tool: /usr/local/go/bin/go test -benchmem -run=^$ -bench ^BenchmarkVacuum_OnlyOneStorePackage$ github.com/aquaproj/aqua/v2/pkg/controller/vacuum

goos: linux
goarch: amd64
pkg: github.com/aquaproj/aqua/v2/pkg/controller/vacuum
cpu: 13th Gen Intel(R) Core(TM) i7-13700H
=== RUN   BenchmarkVacuum_OnlyOneStorePackage
BenchmarkVacuum_OnlyOneStorePackage
=== RUN   BenchmarkVacuum_OnlyOneStorePackage/Sync
BenchmarkVacuum_OnlyOneStorePackage/Sync
BenchmarkVacuum_OnlyOneStorePackage/Sync-20             11378278               101.4 ns/op           144 B/op            3 allocs/op
PASS
ok      github.com/aquaproj/aqua/v2/pkg/controller/vacuum       1.270s

@suzuki-shunsuke
Member

Apologies for keeping you waiting.

@NikitaCOEUR
Author

Don't worry, it's your tool, and I know you take care of it, so take the time you need to make sure everything's ok if you think it's necessary! It's an interesting feature in my opinion, but it's not going to revolutionize the use of this tool. It can still wait without any problems on my side. (It's just that if we could avoid having to redo all this work if there were ever any conflicts...)

if err != nil {
return fmt.Errorf("get the package install path: %w", err)
}

if err := is.vacuum.StorePackage(logE, pkg, pkgPath); err != nil {
Member

InstallPackage skips installing a package if the package has already been installed.
In that case, I think we should not update the timestamp.
Timestamp should be updated only when the package is actually downloaded and installed.

Author

I understand the principle, but as the install function is systematically called in the exec and install processes, I thought it would be simpler to position the timestamp update process here.
But yes, the risk would be to modify this behavior and break the operation without realizing it. It would be more coherent to carry the timestamp update action further upstream, during installation and execution.

@suzuki-shunsuke
Member

I'm considering whether we can simplify the asynchronous process.
StoreQueue is a bit complicated, which makes it hard to understand and maintain.
If it's difficult to simplify, we can keep the current implementation.

@@ -54,5 +56,6 @@ func (i *command) action(c *cli.Context) error {
if err != nil {
return fmt.Errorf("parse args: %w", err)
}
return ctrl.Exec(c.Context, i.r.LogE, param, exeName, args...) //nolint:wrapcheck
defer ctrl.CloseVacuum(logE)
Author

Note that I don't think this works here, because aqua exec can use: https://github.com/aquaproj/aqua/blob/main/pkg/controller/exec/exec.go#L126.
On UNIX, this uses the Exec function of syscall_unix, which replaces the calling process with the called process.
This cuts short the rest of the code, so deferred functions are not run. Asynchronous processes are not necessarily finished either...

Member

@suzuki-shunsuke suzuki-shunsuke Jan 21, 2025

Oh, I see. Thank you.
I've confirmed that the deferred function isn't executed.

package main

import (
	"log"
	"os"

	"golang.org/x/sys/unix"
)

func main() {
	if err := core(); err != nil {
		log.Fatal(err)
	}
}

func core() error {
	defer func() {
		log.Println("defer")
		if err := os.WriteFile("test.txt", []byte(`hello`), 0o644); err != nil {
			log.Print(err)
		}
	}()
	return exe()
}

func exe() error {
	return unix.Exec("/usr/bin/curl", []string{"curl", "--version"}, os.Environ())
}

$ go run main.go
curl 8.7.1 (x86_64-apple-darwin24.0) libcurl/8.7.1 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.63.0
Release-Date: 2024-03-27
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns ldap ldaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS GSS-API HSTS HTTP2 HTTPS-proxy IPv6 Kerberos Largefile libz MultiSSL NTLM SPNEGO SSL threadsafe UnixSockets

$ ls # test.txt isn't created.
go.mod  go.sum  main.go

https://pkg.go.dev/golang.org/x/sys/unix#Exec

@suzuki-shunsuke
Member

suzuki-shunsuke commented Jan 21, 2025

$ AQUA_VACUUM_DAYS=5 hyperfine -N --warmup 3 '/Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v' '/Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v'
Benchmark 1: /Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v
  Time (mean ± σ):      50.5 ms ±   1.4 ms    [User: 4.2 ms, System: 2.0 ms]
  Range (min … max):    47.7 ms …  56.3 ms    60 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: /Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v
  Time (mean ± σ):      40.4 ms ±   1.4 ms    [User: 6.4 ms, System: 1.8 ms]
  Range (min … max):    38.7 ms …  44.6 ms    73 runs
 
Summary
  /Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v ran
    1.25 ± 0.05 times faster than /Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v

1.25 is a bit slow.

Without AQUA_VACUUM_DAYS:

$ hyperfine -N --warmup 3 '/Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v' '/Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v'

Benchmark 1: /Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v
  Time (mean ± σ):      38.7 ms ±   1.2 ms    [User: 5.1 ms, System: 1.8 ms]
  Range (min … max):    36.9 ms …  42.1 ms    76 runs
 
Benchmark 2: /Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v
  Time (mean ± σ):      40.7 ms ±   1.3 ms    [User: 6.1 ms, System: 1.8 ms]
  Range (min … max):    39.2 ms …  45.7 ms    72 runs
 
Summary
  /Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v ran
    1.05 ± 0.05 times faster than /Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v

abd0bf4

$ git checkout abd0bf400a42f9a313324c0024fb6dec92b6ddcc
$ AQUA_VACUUM_DAYS=5 hyperfine -N --warmup 3 '/Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v' '/Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v'
Benchmark 1: /Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v
  Time (mean ± σ):      56.3 ms ±   8.6 ms    [User: 5.1 ms, System: 2.3 ms]
  Range (min … max):    49.5 ms … 101.7 ms    55 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: /Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v
  Time (mean ± σ):      44.3 ms ±   1.5 ms    [User: 7.6 ms, System: 2.1 ms]
  Range (min … max):    42.0 ms …  48.1 ms    66 runs
 
Summary
  /Users/shunsukesuzuki/.local/share/aquaproj-aqua/internal/pkgs/github_release/github.com/aquaproj/aqua/v2.42.2/aqua_darwin_arm64.tar.gz/aqua exec -- cmdx -v ran
    1.27 ± 0.20 times faster than /Users/shunsukesuzuki/go/bin/aqua exec -- cmdx -v

@suzuki-shunsuke
Member

suzuki-shunsuke commented Jan 22, 2025

I ran the performance test against some revisions on my M3 Pro, but the performance is worse than the result you shared. #3442 (comment)
I also tested the revision abd0bf4, to which I haven't added any changes, so my recent commits aren't the cause.
I'm not sure whether this is due to the difference in environment.
I should have run the performance test before refactoring... This is my fault.

Idea of new architecture

Maybe we need to change the architecture.
We can't accept a performance regression in aqua exec, but we can accept some regression in commands such as aqua vacuum and aqua i.
When packages are actually installed (downloaded), a little overhead is not much of a problem.

I came up with an idea.

  1. Write the package data and timestamp to a random file in $(aqua root-dir)/timestamps/<random file name>.csv. To avoid any conflict, the filename must be random. I expect this to be very fast.
  2. When vacuuming packages, aqua reads all files in $(aqua root-dir)/timestamps, inserts the data into a DB (boltdb or another DB such as SQLite), and removes the files in $(aqua root-dir)/timestamps. This is slower than the current architecture, but I guess we can accept that.

This is just an idea.
If we can resolve the performance issue using the current architecture, the new architecture is unnecessary.

@suzuki-shunsuke
Member

$(aqua root-dir)/timestamps/<random file name>.csv. To avoid any conflict, the filename must be random.

Probably $(aqua root-dir)/timestamps/<pkgPath>/timestamp.csv is better.
This carries a risk of corrupting a file, but that rarely happens. And even if a file is corrupted, it's not critical.
$(aqua root-dir)/timestamps/<random file name>.csv has the drawback that too many files are created.

@suzuki-shunsuke
Member

Oh, this approach doesn't need a database.
And I mentioned this idea before. 😅 https://github.com/orgs/aquaproj/discussions/3086#discussioncomment-10643619

Create a file per the pair of package and version

Labels
enhancement New feature or request

2 participants