When upgrading from PHP8.1.22 to PHP8.3.4, the output differs between versions, specifically in how blank nodes are handled.
Example
Input
<table><caption>
Cool table
</caption><tfoot><tr><th>I can do so much!</th></tr></tfoot><tr><td style="font-size:16pt; color:#F00;font-family:sans-serif; text-align:center;">Wow</td></tr></table>
PHP8.1.22 output
<table><caption>
Cool table
</caption><tfoot><tr><th>I can do so much!</th></tr></tfoot><tr><td style="font-size:16pt;color:#F00;font-family:sans-serif;text-align:center;">Wow</td></tr></table>
PHP8.3.4 output
<table><caption>
Cool table
</caption><tfoot><tr><th>I can do so much!</th></tr></tfoot><tr><td style="font-size:16pt;color:#F00;font-family:sans-serif;text-align:center;">Wow</td></tr></table>
Impact
A strong case can be made that the PHP8.3.4 output is "more correct", and I wouldn't argue. The issue is that there is a ton of existing code and applications that may be relying on the old behavior in order to "work". Having an optional backwards-compatible solution would ease the transition as many upgrade beyond PHP8.1.
Investigation
These steps have been performed:
verified that both PHP versions used for testing have the same version of libxml (2.9.1)
localized the behavior change to the loadHtml call here
verified that passing the LIBXML_NOBLANKS option fixed the output discrepancy
I think this php-src commit changed the default behavior of "blank" parsing from "don't keep" to "keep".
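To make the investigation above easier to reproduce, here is a minimal sketch that counts whitespace-only text nodes after `DOMDocument::loadHTML`, with and without `LIBXML_NOBLANKS`. The sample markup and the helper name `countWhitespaceTextNodes` are my own assumptions for illustration, not the library's code, and the actual counts will depend on the PHP and libxml versions in use.

```php
<?php
// Illustrative input with whitespace between elements (an assumption, not the
// exact input from the report).
$html = "<table><caption>\nCool table\n</caption>\n<tr>\n<td>Wow</td>\n</tr></table>";

// Hypothetical helper: parse $html with the given libxml options and count
// text nodes that contain only whitespace.
function countWhitespaceTextNodes(string $html, int $options): int
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html, $options);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    return $xpath->query('//text()[normalize-space(.) = ""]')->length;
}

printf("default options: %d blank text node(s)\n", countWhitespaceTextNodes($html, 0));
printf("LIBXML_NOBLANKS: %d blank text node(s)\n", countWhitespaceTextNodes($html, LIBXML_NOBLANKS));
```

Running this under both PHP versions should show whether the default count changes between them while the `LIBXML_NOBLANKS` count stays the same.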
Suggested Fix
Much like LIBXML_PARSEHUGE is an optional configuration value that can be supplied here, I propose adding LIBXML_NOBLANKS as an optional value in order to better handle backwards compatibility as mentioned above without impacting existing use cases.
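A rough sketch of what such an opt-in could look like is below. The function `buildLibxmlOptions` and the config keys `parseHuge` and `noBlanks` are hypothetical names chosen for illustration; the real configuration mechanism would follow whatever the library already uses for `LIBXML_PARSEHUGE`.

```php
<?php
// Hypothetical helper: translate opt-in config flags into a libxml options
// bitmask. The key names are illustrative assumptions.
function buildLibxmlOptions(array $config): int
{
    $options = 0;

    if (!empty($config['parseHuge'])) {
        $options |= LIBXML_PARSEHUGE;
    }

    // Opt-in only, so existing callers keep the current default behavior.
    if (!empty($config['noBlanks'])) {
        $options |= LIBXML_NOBLANKS;
    }

    return $options;
}

$doc = new DOMDocument();
$doc->loadHTML('<p>example</p>', buildLibxmlOptions(['noBlanks' => true]));
```

Because the flag defaults to off, users who already depend on the newer whitespace handling would see no change.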
Similar issues
#237
#269