Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do_join(): Take a copy of delimiter string (fixes #21458) #21484

Closed
wants to merge 3 commits into from
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion doop.c
Original file line number Diff line number Diff line change
Expand Up @@ -699,16 +699,18 @@ Perl_do_join(pTHX_ SV *sv, SV *delim, SV **mark, SV **sp)
}

if (delimlen) {
bool delim_copied = false;
const U32 delimflag = DO_UTF8(delim) ? SV_CATUTF8 : SV_CATBYTES;
for (; items > 0; items--,mark++) {
STRLEN len;
const char *s;
if(*mark == delim) {
if(*mark == delim && !delim_copied) {
/* Take a copy in case delim SV is a tied SV with a
* self-modifying FETCH [GH #21458]
*/
delims = savepvn(delims, delimlen);
SAVEFREEPV(delims);
delim_copied = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code below this still looks at SvUTF8(delim) to check whether delim is utf8 or not:

tony@venus:.../git/perl6$ git log -n1
commit 02d9e5762b1b8d5260c7279df232ff5c4a47912a (HEAD -> 21484-join-copy-delim)
Author: Paul Evans <[email protected]>
Date:   Fri Sep 15 11:40:58 2023 +0100

    Rename 'delims' to 'delimpv' for clarity
tony@venus:.../git/perl6$ cat ../21484.pl
use v5.38;
use Devel::Peek;
my $n = 1;
my $sep = "\x{100}" x $n;
package MyOver {
  use overload '""' => sub { $sep = "\xFF" x $n; "x" };
}

my $x = bless {}, "MyOver";
my $y = join $sep, "a", $x, "b";
Dump($y);
tony@venus:.../git/perl6$ ./perl -Ilib ../21484.pl 
SV = PV(0x55aeebdfb110) at 0x55aeebe2a090
  REFCNT = 1
  FLAGS = (POK,pPOK,UTF8)
  PV = 0x55aeebe952e0 "a\xC4\x80x\xFF\x00b"\0Malformed UTF-8 character: \xff\x00\x62 (too short; 3 bytes available, need 13) in Dump at ../21484.pl line 11.
Malformed UTF-8 character: \xff\x00 (unexpected non-continuation byte 0x00, immediately after start byte 0xff; need 13 bytes, got 1) in Dump at ../21484.pl line 11.
 [UTF8 "a\x{100}x\x{0}"]
  CUR = 7
  LEN = 16

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some though I wondered if magic or overloads on the first element in the list could invalidate delims before it is copied too - e.g. leaving delims as a dangling pointer.

It turned out to be possible:

tony@venus:.../git/perl6$ cat ../21484b.pl
use v5.38;
use Devel::Peek;
my $n = 1;
my $sep = "\x{100}" x $n;
package MyOver {
  use overload '""' => sub { $sep = "\xFF" x ($n+20); "x" };
}

my $x = bless {}, "MyOver";
my $y = join $sep, $x, "a";
Dump($y);
tony@venus:.../git/perl6$ PERL_DESTRUCT_LEVEL=2 ./perl -Ilib ../21484b.pl
=================================================================
==1859550==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020000132f0 at pc 0x7f5086e47cf9 bp 0x7ffd8dbf8fc0 sp 0x7ffd8dbf8770
READ of size 2 at 0x6020000132f0 thread T0
    #0 0x7f5086e47cf8 in __interceptor_memmove ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:810
    #1 0x5637bb5405de in Perl_sv_catpvn_flags /home/tony/dev/perl/git/perl6/sv.c:5574
    #2 0x5637bb661eeb in Perl_do_join /home/tony/dev/perl/git/perl6/doop.c:715
    #3 0x5637bb4f8453 in Perl_pp_join /home/tony/dev/perl/git/perl6/pp_hot.c:1936
    #4 0x5637bb4e7b67 in Perl_runops_standard /home/tony/dev/perl/git/perl6/run.c:41
    #5 0x5637bb1b4fde in S_run_body /home/tony/dev/perl/git/perl6/perl.c:2811
    #6 0x5637bb1b41b3 in perl_run /home/tony/dev/perl/git/perl6/perl.c:2726
    #7 0x5637bb13d478 in main /home/tony/dev/perl/git/perl6/perlmain.c:127
    #8 0x7f5086c461c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #9 0x7f5086c46284 in __libc_start_main_impl ../csu/libc-start.c:360
    #10 0x5637bb13d170 in _start (/home/tony/dev/perl/git/perl6/perl+0x112170)
...

This isn't a new issue from what I can see, but it means the fix is incomplete.

}
sv_catpvn_flags(sv,delims,delimlen,delimflag);
s = SvPV_const(*mark,len);
Expand Down