Skip to content
This repository has been archived by the owner on Jul 13, 2023. It is now read-only.

MIME type detection of Paperclip::MediaTypeSpoofDetector doesn't work with old versions of file #2527

Closed
vakuum opened this issue Jan 11, 2018 · 2 comments

Comments

@vakuum
Copy link

vakuum commented Jan 11, 2018

In old versions of file a space is used to separate the MIME type from the encoding:

$ file --version
file-4.24

$ file -b --mime test.txt
text/html charset=us-ascii

$ cat /etc/SuSE-release 
SUSE Linux Enterprise Server 11 (x86_64)

In new versions of file a semicolon is used to separate the MIME type from the encoding:

$ file --version
file-5.25

$ file -b --mime test.txt
text/html; charset=us-ascii

$ lsb_release -d
Description:    Ubuntu 16.04.3 LTS

In media_type_spoof_detector.rb the split method is used to separate the MIME type from the encoding:

def type_from_file_command
...
	Paperclip.run("file", "-b --mime :file", :file => @file.path).split(/[:;]\s+/).first
...
end

But this doesn't work for old versions of file because the regexp doesn't match the space correctly:

> 'text/html charset=us-ascii'.split(/[:;]\s+/).first
=> "text/html charset=us-ascii"

> 'text/html; charset=us-ascii'.split(/[:;]\s+/).first
=> "text/html"

Moving the whitespace character into the character class solves the problem:

> 'text/html charset=us-ascii'.split(/[:;\s]+/).first
=> "text/html

> 'text/html; charset=us-ascii'.split(/[:;\s]+/).first
=> "text/html"

Monkey patch

Currently I am using the following monkey patch as a workaround:

module Paperclip
  class MediaTypeSpoofDetector
    private

    def type_from_file_command
      begin
        Paperclip.run('file', '-b --mime-type :file', :file => @file.path)
      rescue Cocaine::CommandLineError
        ''
      end
    end
  end
end
@vakuum
Copy link
Author

vakuum commented Jan 11, 2018

Fun fact 1

The file_command_content_type_detector.rb contains another implementation of the type_from_file_command method:

def type_from_file_command
  type = begin
    Paperclip.run("file", "-b --mime :file", :file => @filename)
    ...
  end
  ...
  type.split(/[:;\s]+/)[0]
end

The regexp of this implementation works correctly, because the whitespace character is already part of the character class.

Fun fact 2

The original implementation of Paperclip::MediaTypeSpoofDetector used the --mime-type parameter like the monkey patch above:

vakuum pushed a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
… with old versions of file.

Please see thoughtbot#2527 for details.
vakuum pushed a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
… with old versions of file.

Please see thoughtbot#2527 for details.
vakuum pushed a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
… with old versions of file.

Please see thoughtbot#2527 for details.
vakuum pushed a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
… with old versions of file.

Please see thoughtbot#2527 for details.
vakuum pushed a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
… with old versions of file.

Please see thoughtbot#2527 for details.
vakuum pushed a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
… with old versions of file.

Please see thoughtbot#2527 for details.
vakuum pushed a commit to vakuum/paperclip that referenced this issue Jan 13, 2018
… with old versions of file.

Please see thoughtbot#2527 for details.
@vakuum
Copy link
Author

vakuum commented Jan 15, 2018

I created a pull request for this issue: #2528.

sidraval pushed a commit that referenced this issue Jan 30, 2018
… with old versions of file.

Please see #2527 for details.
sidraval pushed a commit that referenced this issue Jan 30, 2018
… with old versions of file.

Please see #2527 for details.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants