Perl 5.30.2 replaces some invalid UTF-8 byte sequences in a way that is inconsistent with current best practices.
The Unicode specification says:
An increasing number of implementations are adopting the handling of
ill-formed subsequences as specified in the W3C standard for encoding
to achieve consistent U+FFFD replacements.
See:
ecification -- Conformance page 126, section 3.9.
For example, the hex byte sequence:
<e0 80 7f>
is decoded by Perl to (shown re-encoded as UTF-8):
<ef bf bd 7f>
instead of:
<ef bf bd ef bf bd 7f>
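To make the expected result concrete, here is a minimal Python sketch of a decoder that emits one U+FFFD per maximal ill-formed subpart, which is my reading of the practice quoted above. The function names `decode_utf8_replace` and `_lead_info` are hypothetical, and this is not Perl's or any library's actual implementation.

```python
REPLACEMENT = "\ufffd"

def _lead_info(lead):
    """Return (sequence length, allowed range of the 2nd byte) for a lead byte.

    Returns None when the byte can never start a well-formed sequence.
    The ranges follow the well-formed UTF-8 byte sequence table.
    """
    if 0xC2 <= lead <= 0xDF:
        return 2, 0x80, 0xBF
    if lead == 0xE0:
        return 3, 0xA0, 0xBF          # excludes overlong 3-byte forms
    if 0xE1 <= lead <= 0xEC or 0xEE <= lead <= 0xEF:
        return 3, 0x80, 0xBF
    if lead == 0xED:
        return 3, 0x80, 0x9F          # excludes surrogates
    if lead == 0xF0:
        return 4, 0x90, 0xBF          # excludes overlong 4-byte forms
    if 0xF1 <= lead <= 0xF3:
        return 4, 0x80, 0xBF
    if lead == 0xF4:
        return 4, 0x80, 0x8F          # excludes values above U+10FFFF
    return None

def decode_utf8_replace(data: bytes) -> str:
    """Decode UTF-8, replacing each maximal ill-formed subpart with U+FFFD."""
    out = []
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:                  # ASCII passes through unchanged
            out.append(chr(b))
            i += 1
            continue
        info = _lead_info(b)
        if info is None:              # stray continuation or invalid lead
            out.append(REPLACEMENT)
            i += 1
            continue
        length, lo, hi = info
        cp = b & (0xFF >> (length + 1))
        j = i + 1
        ok = True
        for k in range(1, length):
            low, high = (lo, hi) if k == 1 else (0x80, 0xBF)
            if j >= n or not (low <= data[j] <= high):
                ok = False            # data[j] is NOT consumed; it is retried
                break
            cp = (cp << 6) | (data[j] & 0x3F)
            j += 1
        out.append(chr(cp) if ok else REPLACEMENT)
        i = j
    return "".join(out)

result = decode_utf8_replace(bytes.fromhex("e0 80 7f"))
print(result.encode("utf-8").hex(" "))  # ef bf bd ef bf bd 7f
```

The key design choice is that a byte which breaks a sequence is not swallowed with the bytes before it. In `<e0 80 7f>`, `80` is not a valid second byte after `e0`, so `e0` is replaced on its own and `80` is then examined (and replaced) separately.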
Here are a few more examples:
Perl decode: e0 80 80
expected: ef bf bd ef bf bd ef bf bd
got: ef bf bd
Perl decode: f0 80 80 80
expected: ef bf bd ef bf bd ef bf bd ef bf bd
got: ef bf bd
Perl decode: ed ae 80 ed b0 80
expected: ef bf bd ef bf bd ef bf bd ef bf bd ef bf bd ef bf bd
got: ef bf bd ef bf bd
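For comparison, the expected values above can be checked against another decoder. The sketch below uses CPython 3's built-in `bytes.decode` with `errors="replace"`, on the assumption that it follows the same replacement practice. The `CASES` table and the `redecode` helper are hypothetical names introduced only for this illustration.

```python
# Input bytes (hex) mapped to the expected UTF-8 output (hex) from this report.
CASES = {
    "e0 80 7f": "ef bf bd ef bf bd 7f",
    "e0 80 80": "ef bf bd ef bf bd ef bf bd",
    "f0 80 80 80": "ef bf bd ef bf bd ef bf bd ef bf bd",
    "ed ae 80 ed b0 80": "ef bf bd ef bf bd ef bf bd ef bf bd ef bf bd ef bf bd",
}

def redecode(hex_bytes: str) -> str:
    """Decode with replacement, then re-encode as UTF-8 and return the hex."""
    raw = bytes.fromhex(hex_bytes)
    text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8").hex(" ")  # hex() with a separator needs Python 3.8+

for source, expected in CASES.items():
    got = redecode(source)
    status = "ok" if got == expected else "MISMATCH"
    print(f"decode: {source}\n  expected: {expected}\n  got:      {got}  [{status}]")
```

The same table could be fed to a Perl script to reproduce the differences listed above.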
See https://github.com/flenniken/utf8tests for more information.