Support short meta charset tag #15

Closed
tzi opened this issue Nov 6, 2012 · 2 comments

Comments

@tzi

tzi commented Nov 6, 2012

Hi!

Unless I am mistaken, I think WeasyPrint does not support the short meta charset tag:

<meta charset="utf-8">

But works fine with the complete one:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Most HTML5 websites use the short one, because it is simpler and, I think, sufficient according to the specification.

Thanks for sharing weasyprint,
Thomas.

@SimonSapin
Member

Hi,

Thank you for this report. WeasyPrint just uses libxml2 (through lxml) to parse HTML, and the handling of <meta> elements to detect the character encoding happens there. The good news is that this bug is fixed in version 2.8.0 of libxml2:

http://xmlsoft.org/news.html

Add HTML parser support for HTML5 meta charset encoding declaration (Denis Pauk)

If you can upgrade libxml2 on your system, it should just work.
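
To see whether an upgrade is needed, one can check which libxml2 version lxml is running against. The following is only a minimal sketch: the helper name `supports_short_meta_charset` and the constant `MIN_LIBXML2` are made up for illustration, and the 2.8.0 threshold is taken from the changelog entry quoted above.

```python
# Version whose changelog mentions HTML5 meta charset support (see above).
MIN_LIBXML2 = (2, 8, 0)


def supports_short_meta_charset(libxml_version):
    """Return True if the given libxml2 version tuple is 2.8.0 or later.

    Hypothetical helper: it only compares version numbers and does not
    probe the parser itself.
    """
    return tuple(libxml_version) >= MIN_LIBXML2


try:
    from lxml import etree
except ImportError:
    etree = None  # lxml is not installed; nothing to report

if etree is not None:
    # LIBXML_VERSION is the libxml2 version lxml is using at run time.
    version = etree.LIBXML_VERSION
    print(version, supports_short_meta_charset(version))
```

If the second printed value is `False`, the installed libxml2 predates the fix, and the short `<meta charset>` form is likely to be ignored.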

If you cannot upgrade for some reason, another option is to use the html5lib parser. You should be able to do so with the git version of WeasyPrint and the workaround in #12 (comment)
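
A different stopgap, not the workaround from #12, is to decode the document yourself before parsing, so the parser never has to detect the encoding. The sketch below uses only the standard library; `sniff_meta_charset` and `decode_html` are hypothetical names, and the regular expression is a rough approximation rather than the full HTML5 encoding sniffing algorithm.

```python
import re

# Matches the charset value in both <meta charset="..."> and the
# "charset=..." part of the http-equiv form. Rough approximation only.
META_CHARSET = re.compile(
    rb'<meta\b[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def sniff_meta_charset(html_bytes, default="utf-8"):
    """Return the charset declared in a <meta> tag, or the default."""
    # Only look at the start of the document, where the declaration belongs.
    match = META_CHARSET.search(html_bytes[:1024])
    return match.group(1).decode("ascii") if match else default


def decode_html(html_bytes):
    """Decode raw HTML bytes using the declared (or default) charset.

    Raises LookupError if the declared charset is unknown to Python.
    """
    return html_bytes.decode(sniff_meta_charset(html_bytes), errors="replace")


short_form = b'<meta charset="utf-8">'
long_form = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
print(sniff_meta_charset(short_form), sniff_meta_charset(long_form))
```

The decoded text can then be passed to the parser as an already-decoded string instead of raw bytes, assuming the calling code accepts string input.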

@tzi
Author

tzi commented Nov 6, 2012

Thanks for your answer!
