Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to read CSV file because of false postive in mime type detection. #429

Closed
ravibpatel opened this issue Mar 18, 2018 · 14 comments
Closed

Comments

@ravibpatel
Copy link

ravibpatel commented Mar 18, 2018

What is the expected behavior?

CSV file should be readable even if the mime type of the file doesn't match.

What is the current behavior?

When I try to read a CSV file PhpSpreadsheet was unable to read it cause canRead function returns false.
I am getting "text/x-fortran" as a mime type for my CSV file.

What are the steps to reproduce?

You can find CSV file here.

Possible solution to the problem.

$type = mime_content_type($pFilename);

I suggest you should not rely on this function to test the file. I removed the mime type check and it was able to read the file successfully with "," as a Delimiter.

Which versions of PhpSpreadsheet are affected?

PhpSpreadsheet 1.2.0

@xsdhy
Copy link

xsdhy commented Mar 20, 2018

I had the same problem.
Call to undefined function PhpOffice\PhpSpreadsheet\Reader\mime_content_type() in /vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Csv.php:479

@ravibpatel
Copy link
Author

ravibpatel commented Mar 20, 2018

@xsdhy You can fix it by removing following lines from the file.

https://github.com/PHPOffice/PhpSpreadsheet/blob/064052b7280169eedce60dbfee7637bee6d1283e/src/PhpSpreadsheet/Reader/Csv.php#L479-#L486

and replace it with "return true" or if you want to keep the check then you can replace the lines with following.

if(function_exists("mime_content_type")) {
    $type = mime_content_type($pFilename);
    $supportedTypes = [
        'text/csv',
        'text/plain',
        'inode/x-empty',
    ];
    return in_array($type, $supportedTypes, true);
}
return true;

@xsdhy
Copy link

xsdhy commented Mar 20, 2018

@ravibpatel ok,thanks
Hopefully the author can fix this bug as soon as possible.

@xsdhy
Copy link

xsdhy commented Mar 20, 2018

I found that this was not a bug, or that it was not a particularly serious bug.
The reason I got this bug was that a file suffix named excel2007 was changed to XLS.

@abunch
Copy link

abunch commented Mar 20, 2018

+1 - same issue here

@abunch
Copy link

abunch commented Mar 20, 2018

Instead of editing the source, we just locked our version to
dev-master#fb5f8d4763962f1c464b44b2788871903edc43ed

That version successfully recognizes and imports .csv files with no issues - it's the December 23, 2017 release (fb5f8d4) just prior to the 1.0 release. I could probably have used a newer check-in than that but I don't have time at the moment to work backwards through the check-ins to figure out where this broke.

IMO this is a better solution than directly editing the code.

@PowerKiKi
Copy link
Member

PowerKiKi commented Apr 1, 2018

@ravibpatel, we must test somehow that it is some kind of CSV looking file. Otherwise we are back at #167 where CSV reader tried to parse any kind of file, including binary one, and failed in all sort of miserable ways. What we could do though is trust the file extension if it is CSV and hope for the best. And if it is not, then fallback to mime detection.

@xsdhy your issue is that you miss the PHP extension for mime. You should install it, and we should declare it as a dependency.

@abunch, the change was introduced on 1.2.0, https://github.com/PHPOffice/PhpSpreadsheet/releases/tag/1.2.0. So you should be able to safely pin 1.1.0

@stale
Copy link

stale bot commented May 31, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
If this is still an issue for you, please try to help by debugging it further and sharing your results.
Thank you for your contributions.

@topclaudy
Copy link

Any update on this?

@yupmin
Copy link

yupmin commented Nov 17, 2018

'mime_content_type' in 'canRead' is unsafe. I think you should change this. tsv file must passing.

/src/PhpSpreadsheet/Reader/Csv.php

    public function canRead($pFilename)
    {
        ...
        // Trust file extension if any
        if (strtolower(pathinfo($pFilename, PATHINFO_EXTENSION)) === 'csv' ||
            strtolower(pathinfo($pFilename, PATHINFO_EXTENSION)) === 'tsv'
        ) {
            return true;
        }
       ...

@MGParisi
Copy link

Im also having an issue. The problem is not fixed with an early true.

PowerKiKi added a commit that referenced this issue May 26, 2019
guillaume-ro-fr pushed a commit to guillaume-ro-fr/PhpSpreadsheet that referenced this issue Jun 12, 2019
@dreamsavior
Copy link

Strictly speaking this is not PhpSpreadsheet's problem, but your PHP version problem.
mime_content_type was marked as deprecated on PHP 7. And was probably removed on latter version of PHP.

So, to tackle this problem you can just polyfill the mime_content_type with the more recommended way to detect file type, which is using finfo library (make sure finfo library exist and is activated in your php.ini). Then put this script somewhere in your main script before you execute phpSpreadsheet :

if(!function_exists("mime_content_type")) {
	function mime_content_type($path) {
		// Create a FileInfo object
		$finfo = finfo_open(FILEINFO_MIME_TYPE);

		// Get the MIME type of the file
		$mime_type = finfo_file($finfo, $path);

		// Close FileInfo resource
		finfo_close($finfo);

		return $mime_type;
	}
}

As for improvement for the PhpSpreadsheet itself, I suggest to migrate from mime_content_type to finfo library.

@oleibman
Copy link
Collaborator

Function mime_content_type is neither removed nor deprecated.

@oleibman
Copy link
Collaborator

Csv Reader now passes files with csv or tsv extension, which is probably good enough. Unable to download file associated with this issue any longer. However, if someone has a new file with this problem, please open a new issue.

@oleibman oleibman removed the stale label Jul 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

9 participants