Refactor Rename #222

Draft
wants to merge 3 commits into
base: 3.0.x

ramchale
Contributor

@ramchale ramchale commented Jan 8, 2025

Q              A
Documentation  no
Bugfix         no
BC Break       yes
New Feature    no
RFC            no
QA             no

Description

Refactoring for #177

I had a couple of questions around the intent for v3:

  • I'm assuming removal of getFile and setFile, but wanted to check if we're also removing addFile
  • Currently the input for addFile can be a string, or array of target, etc, or an array of arrays of target, etc. Which of these do we want to support going forward?

@gsteel
Member

gsteel commented Jan 8, 2025

I'm not entirely sure because I haven't had a decent look at it myself, but:

  • Make use of Laminas\Filter\File\FileInformation value object which exists to handle the 3 possible argument types of string|PHP $_FILES|PSRUpload
  • I think ideally, expect a single input, i.e. 1 path, or 1 PSR Upload or 1 PHP Upload and return a string representing the new, modified file path (Or un-filtered input)
  • We may have to revise that, i.e. we might need to accept multiple inputs for PSR/$_FILES, but I'd rather avoid that if possible because we then have awkward return type expectations. IMO, successful filtering should === (string) $newFilePath
  • If there is a public method other than filter or __invoke, kill it with 🔥

My reasoning here is that we shouldn't need to worry too much about compat with input filter and form, because they are next in line for major release, and, if you want to deal with multiple uploads, then, from input filters perspective, we should be looking at an 'ArrayInput' wrapping n 'FileInput's rather than all this type juggling.

Options:

  • I think target should exclusively mean "Target Directory"
  • source does not make sense and I think it should be removed - the "source" is the filter input
  • overwrite and randomize seem sensible enough

Further reference:

cc @froschdesign @weierophinney

@froschdesign
Member

froschdesign commented Jan 8, 2025

@gsteel

  • source does not make sense and I think it should be removed - the "source" is the filter input

The source option is relevant if several files are to be filtered, see the example in the documentation:

$filter = new \Laminas\Filter\File\Rename([
    [
        'source'    => 'fileA.txt',
        'target'    => '/dest1/newfileA.txt',
        'overwrite' => true,
    ],
    [
        'source'    => 'fileB.txt',
        'target'    => '/dest2/newfileB.txt',
        'randomize' => true,
    ],
]);

But I would follow your idea and the source option is no longer needed:

I think ideally, expect a single input, i.e. 1 path, or 1 PSR Upload or 1 PHP Upload and return a string representing the new, modified file path (Or un-filtered input)

I also think that is the right way to go:

if you want to deal with multiple uploads, then, from input filters perspective, we should be looking at an 'ArrayInput' wrapping n 'FileInput's rather than all this type juggling.


If several values are to be filtered, a decorator can still be created for this purpose.
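
A decorator of that kind could be sketched roughly as follows. `EachValue` is a hypothetical name, not an existing laminas-filter class, and the wrapped filter is modelled as a plain callable instead of `FilterInterface` so that the example stays self-contained.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical decorator: applies a single-value filter to every element of an array.
 */
final class EachValue
{
    /** @var callable(mixed): mixed */
    private $filter;

    public function __construct(callable $filter)
    {
        $this->filter = $filter;
    }

    public function __invoke(mixed $value): mixed
    {
        // Non-array input is passed straight to the wrapped filter.
        if (! is_array($value)) {
            return ($this->filter)($value);
        }

        // array_map() with a single array preserves the original keys.
        return array_map($this->filter, $value);
    }
}

// Stand-in for a single-input Rename filter: it only rewrites the path string.
$rename = static fn (mixed $path): mixed => is_string($path)
    ? '/dest/' . basename($path)
    : $path;

$filterAll = new EachValue($rename);
$result    = $filterAll(['/tmp/fileA.txt', '/tmp/fileB.txt']);
```

The filter itself keeps the "one input, one string out" contract, and the list handling lives entirely in the wrapper.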

@gsteel gsteel mentioned this pull request Jan 8, 2025
Signed-off-by: ramchale <[email protected]>
@ramchale
Contributor Author

ramchale commented Jan 10, 2025

Thanks both.

Started moving things in that direction. Just a few more questions:

  • It seems like 'source' is really only useful for restricting what the filter input can be, which would be fairly key if we're looping through an array of these filters. With that in mind, do we want to rename it to make that more obvious? (Something like 'match_input')
  • Currently exceptions are thrown for issues actually renaming the file. This is inconsistent with the other filters, but also this one is slightly unusual in that it modifies things on the server, so double checking that we want to remove these exceptions and return the unfiltered input before I change it?
  • Should the file extension be taken from the source file/filter input if not in the target?

@gsteel
Member

gsteel commented Jan 10, 2025

OK, so branching configuration based on matching the input to an option set has merit.

Perhaps this can be made more explicit with a match option instead of source and use fnmatch under the bonnet on the basename of the input.

i.e.

[
    [
        'match' => '*.txt',
        'target' => '/text/files',
    ],
    [
        'match' => '*.pdf',
        'target' => '/pdf/files',
    ],
];
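
As a rough illustration of that idea, selecting an option set with `fnmatch()` on the basename might look like the sketch below. `findOptionSet()` is a hypothetical helper, not part of the PR, and treating a missing `match` key as `'*'` is an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Return the first option set whose 'match' pattern fits the basename of $path,
 * or null when nothing matches. A missing 'match' key is treated as '*'.
 *
 * @param list<array{match?: string, target?: string}> $optionSets
 * @return array{match?: string, target?: string}|null
 */
function findOptionSet(string $path, array $optionSets): ?array
{
    $name = basename($path);

    foreach ($optionSets as $options) {
        if (fnmatch($options['match'] ?? '*', $name)) {
            return $options;
        }
    }

    return null;
}

$optionSets = [
    ['match' => '*.txt', 'target' => '/text/files'],
    ['match' => '*.pdf', 'target' => '/pdf/files'],
];

$pdf  = findOptionSet('/uploads/report.pdf', $optionSets);
$none = findOptionSet('/uploads/photo.jpg', $optionSets);
```

Note that `fnmatch()` is case-sensitive by default, so `REPORT.PDF` would not match `*.pdf` in this sketch.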

We should be receiving a file path (Or PSR Upload, or PHP Upload). All of those 3 evaluate to a single file path input, so the filter's job should be, (assuming input matches an options set):

  • Move the file to a target directory with the name unchanged
  • Move the file to a target directory and rename it
  • Rename the file in-place

In which case, I think that the options to configure that behaviour should be:

  • ['target' => '/some/dir', 'renameTo' => '*']
  • ['target' => '/some/dir', 'renameTo' => 'someName.ext']
  • ['target' => '*', 'renameTo' => 'someName.ext']

I think that target should be validated with is_dir && is_writable in the constructor (Unless it's *).
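
The validation could be sketched like this. `assertValidTarget()` is a hypothetical helper, and the choice of `InvalidArgumentException` is an assumption; the real filter would presumably throw one of the library's own exception types from its constructor.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: check a 'target' option as described above.
 * '*' means "rename in place" and is accepted without touching the file system.
 *
 * @throws InvalidArgumentException when the directory is missing or not writable
 */
function assertValidTarget(string $target): string
{
    if ($target === '*') {
        return $target;
    }

    if (! is_dir($target) || ! is_writable($target)) {
        throw new InvalidArgumentException(sprintf(
            'The target "%s" must be an existing, writable directory',
            $target
        ));
    }

    return $target;
}

$inPlace = assertValidTarget('*');
$tmpDir  = assertValidTarget(sys_get_temp_dir());

try {
    assertValidTarget(sys_get_temp_dir() . '/no-such-directory-' . uniqid());
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```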

From there, randomize/overwrite is trivial, as are the expected outcomes and return value.

The above behaviour makes more sense to me and I can see the filter being useful when processing an HTTP upload, or some other input such as, say, a list of file paths on a local disk.

So, options proposal:

/**
 * @psalm-type OptionsSet = array{
 *   match?: non-empty-string, // default '*'
 *   target?: non-empty-string, // default '*'
 *   renameTo?: non-empty-string, // default '*'
 *   overwrite?: bool, // default false
 *   randomize?: bool, // default false
 * }
 * @psalm-type Options = OptionsSet|list<OptionsSet>
 */
  • match is '*', i.e. match any file name, or, something fnmatch can handle
  • target is '*' i.e. re-name in-place or an existing target directory
  • renameTo is '*' i.e. don't rename, or, something like *.txt, prefix_* where * is replaced with pathinfo()['filename'] or pathinfo()['basename'] depending on the presence of a . in renameTo
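
The renameTo substitution described in the last bullet can be sketched as below. `resolveNewName()` is a hypothetical helper written only to make the proposed rule concrete; it assumes PHP 8 for `str_contains()`.

```php
<?php

declare(strict_types=1);

/**
 * Resolve the new file name for $sourcePath from a 'renameTo' pattern.
 *
 * If the pattern contains a '.', the '*' placeholder stands for the name without
 * its extension (pathinfo 'filename'); otherwise it stands for the full name
 * (pathinfo 'basename'). A pattern without '*' is used verbatim.
 */
function resolveNewName(string $sourcePath, string $renameTo = '*'): string
{
    $info = pathinfo($sourcePath);

    $replacement = str_contains($renameTo, '.')
        ? $info['filename']
        : $info['basename'];

    return str_replace('*', $replacement, $renameTo);
}

$unchanged = resolveNewName('/uploads/report.pdf', '*');
$newExt    = resolveNewName('/uploads/report.pdf', '*.txt');
$prefixed  = resolveNewName('/uploads/report.pdf', 'prefix_*');
$fixed     = resolveNewName('/uploads/report.pdf', 'someName.ext');
```

Under this rule, a pattern with no extension keeps the extension of the input, which also answers the earlier question about where the extension should come from.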

When match, target and renameTo are all *, then the filter (by default) does absolutely nothing.

This is all just opinion, so feel free to point out glaring stupidity in my suggestions. We have an opportunity to break BC and make it useful and predictable.

  • Currently exceptions are thrown for issues actually renaming the file. This is inconsistent with the other filters, but also this one is slightly unusual in that it modifies things on the server, so double checking that we want to remove these exceptions and return the unfiltered input before I change it?

I think that if the file cannot be successfully moved/renamed, that's an exceptional condition. It means the developer probably hasn't got a writable directory in the right place, so it's more of a configuration error than an exception reacting to user input. If we've been given theoretically filterable input and the filter fails, then it's an exception. For input that cannot be filtered based solely on the input, return the un-filtered value.
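
A minimal sketch of that policy, assuming a plain path as input: `moveFile()` is a hypothetical stand-in for the filter body, and `RuntimeException` stands in for whichever exception class the library would actually use.

```php
<?php

declare(strict_types=1);

/**
 * Input that cannot be filtered is returned untouched, while a failed move of a
 * real file raises an exception.
 */
function moveFile(mixed $value, string $targetDirectory): mixed
{
    // Not a path to an existing file: nothing to filter, return the input as-is.
    if (! is_string($value) || ! is_file($value)) {
        return $value;
    }

    $destination = $targetDirectory . DIRECTORY_SEPARATOR . basename($value);

    // The input was filterable, so a failure here points at misconfiguration.
    if (! @rename($value, $destination)) {
        throw new RuntimeException(sprintf(
            'Unable to move "%s" to "%s"',
            $value,
            $destination
        ));
    }

    return $destination;
}

// Unfilterable input comes back unchanged.
$unfiltered = moveFile(['not', 'a', 'path'], sys_get_temp_dir());

// A real file with a missing target directory is an exceptional condition.
$source = tempnam(sys_get_temp_dir(), 'rename-sketch');

try {
    moveFile($source, sys_get_temp_dir() . '/missing-' . uniqid());
    $threw = false;
} catch (RuntimeException $e) {
    $threw = true;
} finally {
    unlink($source);
}
```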

cc @froschdesign

@ramchale
Contributor Author

@gsteel That all makes sense thanks.

I've put in a single-input version to check that the behavior is as expected, and to ask whether there are any further test scenarios anyone can think of that need covering. (Ignore the static analysis errors in RenameTest, I just haven't got to them yet.)

I've renamed "target" to "target_directory" just to be a bit more explicit.

In terms of passing in an array for configuration for multiple matches, is this something we could do by just adding multiple Rename filters to a field (possibly in a Chain) rather than having an array of arrays for the config? (Happy to add config array handling, but worth checking first)

@gsteel
Member

gsteel commented Jan 21, 2025

In terms of passing in an array for configuration for multiple matches, is this something we could do by just adding multiple Rename filters to a field (possibly in a Chain) rather than having an array of arrays for the config? (Happy to add config array handling, but worth checking first)

I was thinking that the more complex config would be useful for "If it's a word doc, put in /docs, if it's a pdf put in /pdf-files", but you're right that multiple rename filters chained after each other could achieve the same outcome and would also simplify handling configuration internally.

I'm on the fence but leaning towards a single set of options (And multiple filters if required)

What do you think @froschdesign?

@froschdesign
Member

Would it then have to be ensured within the chain that different "matches" are defined so that all filters in the chain do not process all files? Or is this simply a misconfiguration?


I think one thing we must not overlook is that this filter is not intended for files that come from an upload.

@ramchale
Contributor Author

True, it does have the issue that Chains won't stop at the first match. A potential alternative is to have the Rename filter for single cases, and add a RenameList wrapper that can iterate through multiple Rename filters until it finds a match?
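
One possible shape for such a wrapper is sketched below. Everything here is an assumption made for illustration: the wrapper keeps the `fnmatch()` pattern next to each filter so it can stop at the first match without the filters exposing extra public methods, and the filters are plain callables standing in for Rename instances.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical wrapper: runs only the first filter whose pattern matches the
 * basename of the input, unlike a chain, which would run every filter in turn.
 */
final class RenameList
{
    /** @param array<string, callable(string): string> $filtersByPattern */
    public function __construct(private array $filtersByPattern)
    {
    }

    public function __invoke(mixed $value): mixed
    {
        if (! is_string($value)) {
            return $value;
        }

        $name = basename($value);

        foreach ($this->filtersByPattern as $pattern => $filter) {
            if (fnmatch((string) $pattern, $name)) {
                // Stop at the first match; later filters never see the value.
                return $filter($value);
            }
        }

        // No pattern matched: return the un-filtered input.
        return $value;
    }
}

$list = new RenameList([
    '*.txt' => static fn (string $path): string => '/text/files/' . basename($path),
    '*'     => static fn (string $path): string => '/other/files/' . basename($path),
]);

$txt = $list('/uploads/notes.txt');
$jpg = $list('/uploads/photo.jpg');
```

Here `notes.txt` also fits the catch-all `'*'` pattern, but only the first matching filter is applied, which is the behaviour a plain chain cannot provide.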
