This is a bug in libxml2’s HTML parser (the one used by lxml.html), which does not yet conform to HTML5 parsing. AFAIK this was just unspecified in HTML4.
>>> print(lxml.html.tostring(lxml.html.parse('http://www.stripey.com/demo/weasyprint/missing_image.html')))
<!DOCTYPE html>
<html><head><title>Missing Image</title><img src="200px-Donkey_cartoon_04.svg.png" alt="[an arbitrary image]"></head><body><p>There should be
<a href="http://commons.wikimedia.org/wiki/File:Donkey_cartoon_04.svg">a cartoon
donkey</a> above this paragraph.
</p></body></html>
As you can see, the parser adds the implied <head> and <body> elements (as expected) but in some cases considers the image to be part of the former instead of the latter.
See #12 about using the html5lib parser instead, which does not have this issue but is tricky to use at this point because of broken namespace handling in cssselect.
>>> print(lxml.html.tostring(lxml.html.html5parser.parse('http://www.stripey.com/demo/weasyprint/missing_image.html')))
<!DOCTYPE html>
<html:html xmlns:html="http://www.w3.org/1999/xhtml"><html:head><html:title>Missing Image</html:title>
</html:head><html:body><html:img src="200px-Donkey_cartoon_04.svg.png" alt="[an arbitrary image]"></html:img>
<html:p>There should be
<html:a href="http://commons.wikimedia.org/wiki/File:Donkey_cartoon_04.svg">a cartoon
donkey</html:a> above this paragraph.
</html:p></html:body></html:html>
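To illustrate why the namespaced output above can trip up tools that look for plain tag names, here is a minimal sketch using only the standard library's `xml.etree.ElementTree`. It is an analogy, not a reproduction of the cssselect problem: the markup is a shortened copy of the html5lib output, and the variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Shortened copy of the html5lib output shown above (illustration only).
markup = (
    '<html:html xmlns:html="http://www.w3.org/1999/xhtml">'
    '<html:head><html:title>Missing Image</html:title></html:head>'
    '<html:body><html:img src="200px-Donkey_cartoon_04.svg.png" '
    'alt="[an arbitrary image]"></html:img></html:body>'
    '</html:html>'
)
root = ET.fromstring(markup)

# Every element now carries the XHTML namespace in its tag name,
# so a lookup by the bare name "img" finds nothing.
plain_match = root.find(".//img")

# The same lookup succeeds once the namespace is spelled out.
qualified_match = root.find(".//{%s}img" % XHTML_NS)

print(root.tag)
print(plain_match)
print(qualified_match.get("src"))
```

Any selector layer that matches on un-namespaced names would miss the image in the same way, even though the image is correctly placed in the body.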
If using html5lib is impractical, I recommend adding an explicit <body> tag. Alternatively, consider asking the libxml2 maintainers to fix this in their parser.
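For anyone who wants to find documents that rely on an implied <body> before rendering them, here is a rough sketch using only Python 3's standard-library `html.parser`. The names `BodyTagFinder` and `has_explicit_body` are hypothetical helpers invented for this example; they are not part of lxml or WeasyPrint. The sketch only detects the missing tag and does not change how libxml2 parses the document.

```python
from html.parser import HTMLParser


class BodyTagFinder(HTMLParser):
    """Hypothetical helper: remembers whether a <body> start tag was seen."""

    def __init__(self):
        super().__init__()
        self.has_body = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <BODY> is matched as well.
        if tag == "body":
            self.has_body = True


def has_explicit_body(markup):
    """Return True if the markup contains a literal <body> start tag."""
    finder = BodyTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.has_body


# A document shaped like the one in this issue: no <body> tag before the image.
implicit = '<!DOCTYPE html><title>Missing Image</title><img src="a.png"><p>Text</p>'
explicit = '<!DOCTYPE html><title>Missing Image</title><body><img src="a.png"><p>Text</p>'

print(has_explicit_body(implicit))
print(has_explicit_body(explicit))
```

Documents for which this returns False are the ones where adding the tag by hand is worth trying.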
Closing. #12 is the one to follow for html5lib support.
This document starts with an image: http://www.stripey.com/demo/weasyprint/missing_image.html
But WeasyPrint doesn't show it: http://www.stripey.com/demo/weasyprint/missing_image.pdf
It can be made to appear by doing any of these:
- <body> tag
- <img> in a block element, such as <div>
But not by:
- <img> in an inline element, such as <span>
- <img> in a <span>