Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider making File.join handle absolute paths #5904

Closed
obskyr opened this issue Apr 3, 2018 · 22 comments
Closed

Consider making File.join handle absolute paths #5904

obskyr opened this issue Apr 3, 2018 · 22 comments

Comments

@obskyr
Copy link

obskyr commented Apr 3, 2018

In Crystal 0.24.2, File.join doesn't handle absolute paths properly. Take the following example:

File.join("any/path", "/absolute/path")

That expression currently evaluates to "any/path/absolute/path", when the expected value would be "/absolute/path".

One argument against this is that Ruby's File.join also does not handle absolute paths - but every other high-level programming language I've used does (see Python's behavior, for example). Ruby's behaviour here is somewhat infamously cumbersome, so I'm certain it'd be an overall positive change to make.

@ysbaddaden
Copy link
Contributor

It's funny, because I have the opposite expectation: File.join is joining many paths, not expanding paths. From your example I expect the result to be any/path/absolute/path.

Futhermore, having File.join(a, b) become an absolute path because b starts with a directory separator, but I expected the joined path to always be under a can easily lead to security issues.

I believe the current behavior is a correct, expected and safer behavior by default.

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

File.join is joining many paths, not expanding paths.

I agree - File.join("dir", "../another_dir") would still return "dir/../another_dir". This isn't about expanding paths, though: File.expand_path can't handle this, since it's a question of joining two or more paths, and not operating on an already existing one.

When running cd /x/y/z in your shell of choice, it doesn't take you to ./x/y/z. It takes you to /x/y/z, because joining a path (in that case, your CWD) with an absolute one makes the absolute path take precedent. It's expected behavior in most contexts, and Crystal not having a simple way to handle the normally expected path joining behavior is annoying at best.

@RX14
Copy link
Contributor

RX14 commented Apr 3, 2018

I don't agree. File.join should be dumb and simple to understand. I'm certain you should be using expand here.

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

@RX14 So what's your recommended method to join a path with another, possibly absolute path using expand_path?

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

I should mention, also, that the approach I'm talking about is indeed very simple to understand - it mirrors every cd you've ever done. If you're joining with an absolute path, that absolute path is used, otherwise you get the relative behavior. No fuss.

@ysbaddaden
Copy link
Contributor

We don't provide anything for that. You'll have to come up with your solution. Possibly iterating paths until an absolute one is found and start joining there. For example:

def join_paths(*args)
  (args.size - 1).downto(1) do |i|
    if args[i].starts_with?(File::SEPARATOR)
      return File.join(args.to_a[i..-1])
    end
  end
  File.join(args)
end

p join_paths("dir", "sub/path")                # => "dir/sub/path"
p join_paths("dir", "/absolute", "sub/path")   # => "/absolute/sub/path"
p join_paths("dir", "/whoops", "sub", "/abso") # => "/abso"

If all you need is cd like:

def cd(path)
  if path.starts_with?(File::SEPARATOR)
    Dir.cd(path)
  else
    Dir.cd(File.join(Dir.current, path))
  end
end

I.e. expecting one path to be an absolute path and act upon it in the best manner. This is better than forgetting to sanitize a File.join input that includes a maliciously introduced absolute path and silently accept it.

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

Shouldn't there be a way to join paths intelligently in the stdlib, then?

After all, there's a File.join that only joins strings with File.SEPARATOR, and there's an expand_path that expands special names like .. - to complement those, there should be a way to join paths like paths usually work. To only have ways to handle part of common path resolution behavior (special names but not join semantics) is mighty odd. Picture a separate method named File.coalesce or similar (forgive the silly name idea), which then also handles drive letters when Crystal supports Windows.

@ysbaddaden
Copy link
Contributor

If you can come up with a good, reliable, well named solution, that serves a variety of use cases, then please propose. Yet, from my point of view, this is an edge case, closely tied to use cases (e.g. cd will join the current dir iff the path argument isn't absolute), and I don't think the stdlib should provide a facility for it.

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

It's absolutely not an edge case, seeing how many languages (Python, Rust, C#, C++17, Haskell...) implement it as default. It comes in handy whenever you want the user to be able to provide a path that's either relative (to CWD or another directory, e.g. an include path) or absolute in an intuitive way, which is often in my experience.

@RX14
Copy link
Contributor

RX14 commented Apr 3, 2018

How is expand_path with dir: Dir.current not the same as cd?

$ crystal eval 'p File.expand_path("/foo", dir: Dir.current)'
"/foo"

$ crystal eval 'p File.expand_path("foo", dir: Dir.current)'
"/home/rx14/foo"

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

Oh hey, there's a dir argument that handles absolute paths! That's useful - thanks. However, it also makes the path both absolute and canonical, which isn't always desired.

@RX14
Copy link
Contributor

RX14 commented Apr 3, 2018

@obskyr if you could actually provide a usecase for which expand_path doesn't work it'd help greatly.

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

@RX14 Sure - whenever you want a relative path if possible. Off the top of my head, this can occur in two main categories of cases:

  • You want to display a path to the user. Having an easily readable path is desirable for debug messages, generating config files, generating file lists...
  • Your paths are going to be used on another system. One typical example of this is git repos: if you're generating paths and commiting them, you definitely want paths to in-repo things to be relative.

In my particular use case that spurred this discussion, I'm generating Makefile dependencies which I'd like to be both readable and reusable on another system. While absolute dependencies (in /lib, for example) should be absolute, paths to local dependencies need to be relative. If they weren't, they would be neither readable nor reusable.

@RX14
Copy link
Contributor

RX14 commented Apr 3, 2018

Surely a better fix there would be Path#relative_to once we get #5550.

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

That wouldn't help at all - the entire point is you don't want a relative path if the path you're joining with is absolute.

@RX14
Copy link
Contributor

RX14 commented Apr 3, 2018

Well then you're "stuck" using the 5 line solution @ysbaddaden posted above. It seems a lot nicer than making a complex method to do exactly what you want inside the stdlib.

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

Except it isn't complex. Both readable and reusable paths are common use cases, and the path joining behavior is commonly used and easy to understand. The Python stdlib's implementation is 17 lines, and that's with boilerplate and error handling - that's not complex at all.

I have to stress, once again: Python, Rust, C#, C++17, and Haskell all do this. It's not complex, it's not uncommon, and its definitely not just "exactly what I want".

@ysbaddaden
Copy link
Contributor

Well, we disagree. We don't want to change the current behavior of File.join.

I'm probably stupid and confused but your use cases seem like a perfect fit for File.expand_path. Please try to deal with this constraint in your design for the time being —or reimplement a custom method that suits you.

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

Welp. If I have the time and inclination, I'll put together a proper proposal for a new path joining function, with details on expected behavior and examples where neither File.join nor File.expand_path can be used. I'll close the issue of changing the existing method, though.

@obskyr obskyr closed this as completed Apr 3, 2018
@straight-shoota
Copy link
Member

I don't think it makes sense to change the behaviour of File.join or add an additional method for this. File.join is very simple, it lexically concats individual components to a normalized path. A component starting with a separator doesn't mean it's an absolute path, because it is just interpreted as a component. This is useful for joining paths which start with a separator to designate them as being relative to a project root.

You propose a behaviour similar to calling cd multiple times where an argument as relative path traverses the path from the current directory and an absolute path changes the current directory to that path.

This can be implemented quite easily by calling File.expand_path for each component:

components = ["any/path", "/absolute/path"]
components.reduce(".") { |path, component| File.expand_path(component, dir: path) } # => "/absolute/path"

@obskyr
Copy link
Author

obskyr commented Apr 3, 2018

File.join is very simple, it lexically concats individual components to a normalized path.

That's not true - File.join does no normalization at all beyond eliminating multiple separators between components. Except that one thing, it's just like calling String#join with File.SEPARATOR.

You propose a behaviour similar to calling cd multiple times where an argument as relative path traverses the path from the current directory and an absolute path changes the current directory to that path.

This can be implemented quite easily by calling File.expand_path for each component:

There seems to be a bit of a misunderstanding - that's not a solution, as that'll always result in an absolute path. Joining any/path and relative/path should result in a relative path. If you haven't yet, have a look at the docs for the path joining behavior in any of the languages I mentioned before, and I think it might become clearer.

@straight-shoota
Copy link
Member

That's not true - File.join does no normalization at all beyond eliminating multiple separators between components.

Well, it does normalize then, doesn't it?

Turning a resulting absolute path into a relative one shouldn't be an issue, really. In my above example you can just use a custom, unique path as starting point instead of "." and lchop it off at the end.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants