Skip to content
This repository has been archived by the owner on Nov 19, 2024. It is now read-only.

ArchiveAsync interface changes #368

Closed
ncw opened this issue Feb 2, 2023 · 3 comments · Fixed by #369
Closed

ArchiveAsync interface changes #368

ncw opened this issue Feb 2, 2023 · 3 comments · Fixed by #369

Comments

@ncw
Copy link
Contributor

ncw commented Feb 2, 2023

What would you like to have changed?

I've been trying to use the ArchiveAsync interface to make an rclone backend.

It is good, but I'm having a problem with it: I need to know when the archiver has finished with the File I've passed in.

The problem is that at the time I create the File for the Archiver I have an open handle to pass to the archiver. However if the archiver go routine takes the handle after my function has returned, the open file handle will have been closed by layers above.

I think what is needed is instead of passing a File to the archiver, pass a struct like this

type AsyncArchiveJob struct {
	File   File
	Result chan<- error
}

I think the archiver loops should then look something like this (here is the zip one for no particular reason)

func (z Zip) ArchiveAsync(ctx context.Context, output io.Writer, jobs <-chan AsyncArchiveJob) error {
	zw := zip.NewWriter(output)
	defer zw.Close()

	var i int
	for job := range jobs {
		err := z.archiveOneFile(ctx, zw, i, job.File)
		job.Result <- err
		if err != nil {
			if z.ContinueOnError && ctx.Err() == nil { // context errors should always abort
				log.Printf("[ERROR] %v", err)
				continue
			}
			return err
		}
		i++
	}

	return nil
}

It may be that it doesn't actually need the continue on error at all here since we are now returning the errors to the user so this might be better:

func (z Zip) ArchiveAsync(ctx context.Context, output io.Writer, jobs <-chan AsyncArchiveJob) error {
	zw := zip.NewWriter(output)
	defer zw.Close()

	var i int
	for job := range jobs {
		job.Result <- z.archiveOneFile(ctx, zw, i, job.File)
		i++
	}

	return nil
}

Thoughts?

If you think this is a good idea I'd be happy to make a PR.

This would make a backwards incompatible change :-(

We could

  1. Do the change anyway and not update the /v4 since there are few/no users of ArchiveAsync and it is marked as being in alpha
  2. Go to /v5
  3. Work around by making a new ArchiveAsyncError interface (or something like that)
@mholt
Copy link
Owner

mholt commented Feb 3, 2023

Oh, interesting point!

This would make a backwards incompatible change :-(

That's OK, since archiver v4 is still in alpha. We can just make the change without having to worry too much about it (we'll document it in release notes and stuff).

I'm traveling today, but I don't want to be a blocker. I think your initial proposal is along the right lines. Want to submit a PR? Even if it needs an iteration or two I would be happy to review one!

ncw added a commit to ncw/archiver that referenced this issue Feb 3, 2023
After experience with the current ArchiveAsync interface while
creating an rclone backend it became clear that it wasn't quite right.

With the old interface, callers did not know when the File had been
archived and thus did not know when to release resources associated
with that file.

This patch changes the interface so that it returns the error
archiving each File. This means the caller can know when the File has
been archived and when to release resources. It returns the error for
each file archived which is useful too.

Fixes mholt#368
@ncw
Copy link
Contributor Author

ncw commented Feb 3, 2023

I have sent a couple of PRs - one for this and one for some other stuff I needed :-)

rclone is working with archiver now. It can write any of the archive formats (.zip, .tar, .tar.gz etc).

It can read from the .zip format, but there is a problem with .tar which I'll make another issue about as I'm not sure of the solution.

@mholt
Copy link
Owner

mholt commented Feb 6, 2023

Very cool. Thanks for the PRs, I finally got around to reviewing them.

ncw added a commit to ncw/archiver that referenced this issue Feb 7, 2023
After experience with the current ArchiveAsync interface while
creating an rclone backend it became clear that it wasn't quite right.

With the old interface, callers did not know when the File had been
archived and thus did not know when to release resources associated
with that file.

This patch changes the interface so that it returns the error
archiving each File. This means the caller can know when the File has
been archived and when to release resources. It returns the error for
each file archived which is useful too.

Fixes mholt#368
@mholt mholt closed this as completed in #369 Feb 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants