Better content type detection for urls. #90

Merged
merged 3 commits into from
Jun 18, 2016


vangorra
Contributor

Uses the content type supplied by the server when

vangorra added 2 commits June 16, 2016 20:07
Uses the content type supplied by the server when
@dbashford
Owner

Changing the signature for fromFileWithPath will likely break the tests and would at the very least require doc updates. And I'd rather not complicate this.

Can you instead just add a typeOverride to the options object and use that?
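To illustrate the suggestion, here is a minimal sketch of an options-based override that leaves the function signature untouched. It is not textract's actual code: `resolveType` and the `EXTENSION_TYPES` table are hypothetical stand-ins, and only the `typeOverride` option name comes from this thread.

```javascript
const path = require('path');

// Hypothetical extension table standing in for a real MIME lookup library.
const EXTENSION_TYPES = {
  '.html': 'text/html',
  '.pdf': 'application/pdf',
  '.txt': 'text/plain'
};

// Hypothetical helper. The (filePath, options) shape stays the same,
// so callers that never pass typeOverride are unaffected.
function resolveType(filePath, options = {}) {
  // An explicit override wins over any guess made from the file name.
  if (options.typeOverride) {
    return options.typeOverride;
  }
  const ext = path.extname(filePath).toLowerCase();
  // Unknown extensions fall back to a generic binary type.
  return EXTENSION_TYPES[ext] || 'application/octet-stream';
}
```

With this shape, `resolveType('summary.aspx', { typeOverride: 'text/html' })` returns the override, while `resolveType('summary.aspx')` falls back to the generic type because `.aspx` is not in the table.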

@vangorra
Contributor Author

Good call! Here you go.

@dbashford
Owner

Could you give me an idea of a file where the mime type wasn't being figured out properly? I can build a test around that later.

@vangorra
Contributor Author

Without this PR, mime.lookup uses the .aspx extension and resolves the MIME type to an octet stream (application/octet-stream), which textract will not parse. The following URL demonstrates this:
http://apps.leg.wa.gov/billinfo/summary.aspx?bill=1276
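The failure mode described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `extensionOf`, `typeFromExtension`, `typeFromHeader`, `detectType`, and the `KNOWN_EXTENSIONS` table are hypothetical names, and the `text/html; charset=utf-8` header value in the usage note is an assumed example rather than the recorded response from that server.

```javascript
// Hypothetical extension table standing in for a real MIME lookup library.
const KNOWN_EXTENSIONS = { '.html': 'text/html', '.pdf': 'application/pdf' };
const FALLBACK_TYPE = 'application/octet-stream';

// Extract the extension of the last path segment, ignoring the query string.
function extensionOf(urlString) {
  const pathname = new URL(urlString).pathname;
  const lastSegment = pathname.slice(pathname.lastIndexOf('/') + 1);
  const dot = lastSegment.lastIndexOf('.');
  return dot > 0 ? lastSegment.slice(dot).toLowerCase() : '';
}

// Extension-only guess: anything not in the table becomes the fallback type.
function typeFromExtension(urlString) {
  return KNOWN_EXTENSIONS[extensionOf(urlString)] || FALLBACK_TYPE;
}

// Take the media type from a Content-Type header, dropping parameters
// such as "; charset=utf-8". Returns null when the header is missing or empty.
function typeFromHeader(contentTypeHeader) {
  if (!contentTypeHeader) {
    return null;
  }
  const type = contentTypeHeader.split(';')[0].trim().toLowerCase();
  return type || null;
}

// Prefer what the server reports; fall back to the extension guess.
function detectType(urlString, contentTypeHeader) {
  return typeFromHeader(contentTypeHeader) || typeFromExtension(urlString);
}
```

For the URL above, `typeFromExtension` yields `application/octet-stream` because `.aspx` is not a known extension, whereas `detectType(url, 'text/html; charset=utf-8')` yields `text/html` if the server sends such a header.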

@dbashford
Owner

👍

I'm deeply hoping to get another chunk of work done towards 2.0 this weekend. I intend to merge this, but I want to do it when I'm ready to write a test or two for it. So bear with me. =)

@dbashford dbashford merged commit 006e5fe into dbashford:master Jun 18, 2016
dbashford pushed a commit that referenced this pull request Jun 18, 2016
@vangorra vangorra deleted the vangorra-header_content_type branch June 18, 2016 13:32
2 participants