tr/abc/de/: Properly handle longer lhs in in-place calc
A tr/// can be done in-place if the target string doesn't contain a
character whose transliterated representation is longer than the
original.  Otherwise, writing the new value would destroy the next
character we need to read.

In general, we can't know if a particular string contains such a
character without keeping a list of the problematic characters, and
scanning it ahead of time for occurrences of those.  Instead, we
determine at compilation time whether, for a given transliteration, there
exists any possible target string that could have an overwriting
problem.  If none exists, we edit in place.  Otherwise, we first make a
copy.
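
The compile-time decision described above can be sketched roughly as follows. This is only an illustration, not the code in op.c: `utf8_len`, `struct tr_pair`, and `can_edit_in_place` are hypothetical names, and the sketch assumes standard UTF-8 lengths up to U+10FFFF, ignoring Perl's extended encoding for larger code points.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to encode a code point in standard UTF-8.
 * Assumption: code points above U+10FFFF are out of scope here. */
static size_t utf8_len(unsigned long cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Hypothetical map entry: a source code point and its replacement. */
struct tr_pair { unsigned long from, to; };

/* Returns 1 if no mapping can make a character longer, so every possible
 * target string can be edited in place; returns 0 if at least one mapping
 * grows, in which case a copy must be made first. */
static int can_edit_in_place(const struct tr_pair *map, size_t n)
{
    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (utf8_len(map[i].to) > utf8_len(map[i].from))
            return 0;   /* writing would overrun the next unread byte */
    }
    return 1;
}
```

For a map of a to x and b to y the check returns 1, since every replacement is one byte. Adding b to U+D0000 makes it return 0, because a one-byte character would be replaced by a four-byte one.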

Prior to this commit, the code failed to account for the case where the
rhs is shorter than the lhs, in which case any unmatched lhs characters
map to the final rhs one.  The reason the code didn't consider this is that
I didn't think of this possibility when writing it.
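
To illustrate the overlooked case, here is a minimal sketch of how an lhs entry beyond the end of a shorter rhs gets its replacement. It is not the S_pmtrans logic: `tr_lookup` and `NO_MAP` are hypothetical, `NO_MAP` only loosely plays the role of TR_SPECIAL_HANDLING, and the empty-rhs case is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "no replacement, delete the character". */
#define NO_MAP ((long)-1)

/* Returns the replacement for cp under tr/lhs/rhs/.
 * The first occurrence of cp in lhs wins.  If its position is past the end
 * of rhs, it maps to the final rhs entry, or is deleted when del (the /d
 * flag) is set.  Code points not in lhs are returned unchanged.
 * Assumption: rhs is non-empty. */
static long tr_lookup(long cp, const long *lhs, size_t lhs_len,
                      const long *rhs, size_t rhs_len, int del)
{
    assert(lhs != NULL && rhs != NULL && rhs_len > 0);
    for (size_t i = 0; i < lhs_len; i++) {
        if (lhs[i] != cp)
            continue;
        if (i < rhs_len)
            return rhs[i];
        return del ? NO_MAP : rhs[rhs_len - 1];
    }
    return cp;
}
```

With lhs "aabc" and rhs "d\x{d0000}", as in the new test, both b and c fall past the end of the rhs and so map to U+D0000. Each is a one-byte character replaced by a four-byte one, which is why the in-place decision has to take the final map into account.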

This fixes #17654 and #17643
khwilliamson committed Apr 1, 2020
1 parent f533d04 commit e35c7ef
Showing 2 changed files with 13 additions and 2 deletions.
7 changes: 6 additions & 1 deletion op.c
@@ -7475,7 +7475,12 @@ S_pmtrans(pTHX_ OP *o, OP *expr, OP *repl)
t_cp_end = MIN(IV_MAX, t_cp + span - 1);

if (r_cp == TR_SPECIAL_HANDLING) {
-r_cp_end = TR_SPECIAL_HANDLING;
+
+/* If unmatched lhs code points map to the final map, use that
+ * value. This being set to TR_SPECIAL_HANDLING indicates that
+ * we don't have a final map: unmatched lhs code points are
+ * simply deleted */
+r_cp_end = (del) ? TR_SPECIAL_HANDLING : final_map;
}
else {
r_cp_end = MIN(IV_MAX, r_cp + span - 1);
8 changes: 7 additions & 1 deletion t/op/tr.t
@@ -13,7 +13,7 @@ BEGIN {

use utf8;

-plan tests => 314;
+plan tests => 315;

# Test this first before we extend the stack with other operations.
# This caused an asan failure due to a bad write past the end of the stack.
@@ -1187,4 +1187,10 @@ for ("", nullrocow) {
is($d, "\x{105}", '104 -> 105');
}

+{
+my $c = "cb";
+eval '$c =~ tr{aabc}{d\x{d0000}}';
+is($c, "\x{d0000}\x{d0000}", "Shouldn't generate valgrind errors");
+}
+
1;
