shifting of bitvecs considered broken #164

p5pRT · 1999-07-07T08:20:23Z

Migrated from rt.perl.org#969 (status was 'resolved')

Searchable as RT969$

p5pRT · 1999-07-07T08:20:23Z

From @jhi

Resubmitting a bug report of mine back from February;
this time through the proper channel (perlbug, not p5p).

[paste]

While editing my Damn Book I re-remembered that a couple of months back
I ran into an anomaly in the handling of bitvectors.

Fiction: you have a bitvector which you want to shift.

The current (5.005_03-MT5) fact:

perl -wle '$b = ""; vec($b, 0, 1) = 1;print unpack("b*", $b);$b<<=1;print unpack("b*", $b)'
10000000
00001100

Huh? Adding -w tells more, as usual:
Argument "^A" isn't numeric in left_shift at -e line 1.

So left_shift assumes that the argument to shift is a number, but "^A"
isn't, so it gets converted to string zero "0" (48, 0x30, 0b00110000).
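The coercion described above can be reproduced step by step. This is a minimal sketch, assuming a default feature set; the variable names are illustrative and not from the report, and the numeric warning is silenced only to keep the demonstration quiet.

```perl
use strict;
use warnings;

my $bits = "";
vec($bits, 0, 1) = 1;                  # $bits is now the one-byte string "\x01"
my $before = unpack("b*", $bits);      # bit 0 is printed first

my $shifted;
{
    no warnings 'numeric';             # suppress the "isn't numeric" warning
    $shifted = $bits << 1;             # "\x01" numifies to 0, so this is 0 << 1
}

# The result is the number 0; its string form "0" is the byte 0x30.
my $after = unpack("b*", "$shifted");

print "$before\n$after\n";
```

The second line printed is the bit pattern of the character "0", not a shifted copy of the original vector.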

I consider this behaviour to be rather broken. I think

$b <<= 1

should shift the whole bitvector left by one position and

vec($b, 2, 8) >>= 2

should shift the bits 16..23 right by two positions (of course not
leaking into bits 8..15).
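The whole-vector shift asked for here can be approximated in pure Perl by going through the "b*" bit string. This is only a sketch under stated assumptions: `shift_bitvec_up` is a hypothetical helper name, the vector keeps its original length, and bits pushed past the end are dropped.

```perl
use strict;
use warnings;

# Hypothetical helper: move every bit COUNT positions towards higher
# bit indices (a "left shift" in the numeric sense), fixed width.
sub shift_bitvec_up {
    my ($vec, $count) = @_;
    my $bits  = unpack("b*", $vec);                 # bit 0 first
    my $moved = substr(("0" x $count) . $bits, 0, length $bits);
    return pack("b*", $moved);
}

my $v = "";
vec($v, 0, 1) = 1;
my $shifted_up = shift_bitvec_up($v, 1);
print unpack("b*", $v), "\n";
print unpack("b*", $shifted_up), "\n";
```

Because unpack("b*") prints bit 0 first, the set bit appears to move to the right in the printed string even though it moves to a higher bit index.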

Back then I asked Larry whether the current behaviour is by design or
something else.

larry: I suppose << and >> ought to be made to work. There's a little more
larry: chance of breaking things than with & and | because you can only
larry: guess based on the stringiness of the first argument. But that's
larry: no worse than ~, I guess.
larry:
larry: Still, there may be some programs out there that expect to be
larry: able to coerce a string to integer and then shift it. Maybe we
larry: should pre-deprecate it.

P.S. My original problem can be recoded with "use integer", but the
original point was to demonstrate the similarity of bit vectors and
integer vectors by coding the same algorithm using the two
approaches. Now I found out that they are not similar enough.
Besides, the original formulation of the algorithm does not use
integers, it uses bit vectors. Sigh. I think I will drop the
bitvector approach from the book.

--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen

Perl Info



Site configuration information for perl 5.00557:

Configured by jhi at Tue Jul  6 23:51:13 EET DST 1999.

Summary of my perl5 (revision 5.0 version 5 subversion 57) configuration:
  Platform:
    osname=dec_osf, osvers=4.0, archname=alpha-dec_osf
    uname='osf1 alpha.hut.fi v4.0 878 alpha '
    config_args='-ders'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef useperlio=undef d_sfio=undef
    use64bits=undef usemultiplicity=undef
  Compiler:
    cc='cc', optimize='-O4', gccversion=
    cppflags='-std -ieee -D_INTRINSICS -DLANGUAGE_C'
    ccflags ='-std -fprm d -ieee -D_INTRINSICS -DLANGUAGE_C'
    stdchar='unsigned char', d_stdstdio=define, usevfork=false
    intsize=4, longsize=8, ptrsize=8, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=8
    alignbytes=8, usemymalloc=y, prototype=define
  Linker and Libraries:
    ld='ld', ldflags =''
    libpth=/usr/shlib /usr/ccs/lib /usr/lib/cmplrs/cc /usr/lib /var/shlib
    libs=-lgdbm -ldbm -ldb -lm -lrt
    libc=, so=so, useshrplib=true, libperl=libperl.so
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='  -Wl,-rpath,/usr/local/lib/perl5/5.00557/alpha-dec_osf/CORE'
    cccdlflags=' ', lddlflags='-shared -expect_unresolved "*" -msym -s'

Locally applied patches:
    


@INC for perl 5.00557:
    lib
    /u/vieraat/vieraat/jhi/Perl/lib
    /usr/local/lib/perl5/5.00557/alpha-dec_osf
    /usr/local/lib/perl5/5.00557
    /usr/local/lib/perl5/site_perl/5.00557/alpha-dec_osf
    /usr/local/lib/perl5/site_perl/5.00557
    .


Environment for perl 5.00557:
    HOME=/u/vieraat/vieraat/jhi
    LANG=C
    LANGUAGE (unset)
    LC_ALL=fi_FI.ISO8859-1
    LC_CTYPE=fi_FI.ISO8859-1
    LD_LIBRARY_PATH=/u/vieraat/vieraat/jhi/pp4/cfgperl
    LOGDIR (unset)
    PATH=/u/vieraat/vieraat/jhi/Perl/bin:/u/vieraat/vieraat/jhi/Perl/bin:/u/vieraat/vieraat/jhi/Perl/bin:/u/vieraat/vieraat/jhi/Perl/bin:/u/vieraat/vieraat/jhi/Perl/bin:/u/vieraat/vieraat/jhi/Perl/bin:/u/vieraat/vieraat/jhi/.s:/u/vieraat/vieraat/jhi/.b/OSF1:/c/bin:/p/bin:/p/adm/bin:/usr/bin:/usr/sbin:/sbin:/bin:/usr/ccs/bin:/usr/lib:/etc:/lib:/p/X6/bin:/p/X5/bin:/usr/bin/X11:/usr/lbin:/usr/sbin/acct:/usr/tcb/bin:/tcb/bin:/usr/field:/u/vieraat/vieraat/jhi
    PERLLIB=/u/vieraat/vieraat/jhi/Perl/lib
    PERL_BADLANG (unset)
    SHELL=/bin/zsh

p5pRT · 2003-07-06T22:56:13Z

From @floatingatoll

I can confirm this is present in @18374.

p5pRT · 2004-12-10T02:47:45Z

From @schwern

Still present in 5.8.5 and blead@22511

p5pRT · 2005-07-20T12:56:11Z

From @smpeters

Just so we're clear, and to add a proper TODO test case, what would you
consider the proper output to be?

p5pRT · 2005-07-20T17:37:48Z

From @iabyn

On Wed, Jul 20, 2005 at 05:56:12AM -0700, Steve Peters via RT wrote:

Just so we're clear, and to add a proper TODO test case, what would you
consider the proper output to be?

I am opposed to such a change in behaviour.

<<= operates on numeric values, while vec operates on strings.

vec($b,0,1) = 1

is just the same as

$b = "\x0\x0\x0\x01"

(ignoring endianness and int size complications)

If the << and >> operators were to take any string as a bit pattern, then
it would break code like:

$b="1"; $b <<= 2; print $b

which should print 4, not "\xc4".

--
The Enterprise is captured by a vastly superior alien intelligence which
does not put them on trial.
-- Things That Never Happen in "Star Trek" #10

p5pRT · 2005-07-21T07:38:17Z

From @jhi

Just so we're clear, and to add a proper TODO test case, what would you
consider the proper output to be?

I am opposed to such a change in behaviour.

Rest peacefully, so is Larry, I once cornered him on this :-)

<<= operates on numeric values, while vec operates on strings.
vec($b,0,1) = 1
is just the same as
$b = "\x0\x0\x0\x01"
(ignoring endianness and int size complications)

If the << and >> operators were to take any string as a bit pattern, then
it would break code like:
$b="1"; $b <<= 2; print $b
which should print 4, not "\xc4".

But on the other hand Larry could see the argumentation my way too,
that it should be possible to use <<= and >>= as bit shifters (looking
at it from the C heritage it is strange that & | ~ ^ operate on strings
as if they were bitvectors, but the shift ops don't). So an unfortunate
murky corner of the dual (strings and numbers), err, trefoil (strings
and bitvectors and numbers), err, quatrefoil (byte strings and Unicode
strings and bitvectors and numbers) nature of Perl strings.
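The asymmetry can be seen in a few lines. This is a small sketch, assuming the `bitwise` feature is not enabled (it changes how the string cases behave); the variable names are illustrative.

```perl
use strict;
use warnings;

my $or_strings = "12" | "3";   # both operands are strings: bytewise OR
my $or_numbers = 12 | 3;       # both operands are numbers: integer OR
my $shl_string = "1" << 2;     # shift ops always treat the operand as a number

print "$or_strings $or_numbers $shl_string\n";
```

The same `|` operator gives a string result for string operands and an integer result for numeric operands, while `<<` has only the numeric behaviour.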

How to solve this, if this is to be solved? *I* see it as an ugly
asymmetry blemish, but since enraged hordes of people have not
descended upon us over all these years, I think I am in a minority,
and I can live with it. *IF* someone wants to fix this, maybe a pragma.
Or maybe borgify Bit::Vector :-)

--
Jarkko Hietaniemi <jhi@iki.fi> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen

p5pRT · 2005-08-06T11:08:03Z

From @jhi

But on the other hand Larry could see the argumentation my way too,
that it should be possible to use <<= and >>= as bit shifters (looking
at it from the C heritage it is strange that & | ~ ^ operate on strings
as if they were bitvectors, but the shift ops don't). So an unfortunate
murky corner of the dual (strings and numbers), err, trefoil (strings
and bitvectors and numbers), err, quatrefoil (byte strings and Unicode
strings and bitvectors and numbers) nature of Perl strings.

How to solve this, if this is to be solved? *I* see it as an ugly
asymmetry blemish, but since enraged hordes of people have not
descended upon us over all these years, I think I am in a minority,
and I can live with it. *IF* someone wants to fix this, maybe a pragma.
Or maybe borgify Bit::Vector :-)

How is the fixing of lexical pragmas progressing? If there were
a working solution for that, I could implement a pragma for shifting
of scalars as bitvecs.

--
Jarkko Hietaniemi <jhi@iki.fi> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen

p5pRT · 2005-08-06T16:30:19Z

From @rgarcia

On 8/6/05, Jarkko Hietaniemi <jhietaniemi@gmail.com> wrote:

How is the fixing of lexical pragmas progressing? If there were
a working solution for that, I could implement a pragma for shifting
of scalars as bitvecs.

I've an unfinished patch, and I've planned to work on this after the
next 5.9.x is released. (Development plans are to be explained in my
OSCON talk slides, which I've to put on line soon. Must wait to be
home to do that, though, due to my server having crashed or
something.)

p5pRT · 2005-09-28T02:50:10Z

From @smpeters

[stmpeters - Wed Jul 20 05:56:11 2005]:
Just so we're clear, and to add a proper TODO test case, what would you
consider the proper output to be?

This thread sort of went off on a tangent. What should the expected
results be?

p5pRT · 2005-09-28T06:01:57Z

From @jhi

Steve Peters via RT wrote:

[stmpeters - Wed Jul 20 05:56:11 2005]:
Just so we're clear, and to add a proper TODO test case, what would you
consider the proper output to be?
This thread sort of went off on a tangent. What should the expected
results be?

Well, ASSUMING that there will be in future a way to make shifting of
bitvecs to work as, well, shifting of bitvecs, instead of the shift ops
assuming their arguments are numbers (which must be kept as the default
way of doing things because of hysterical raisins)... assuming this new
pragma is called "bitvec":

perl -Mbitvec -wle '$b = "";
vec($b, 0, 1) = 1;
print unpack("b*",$b);
$b<<=1;
print unpack("b*", $b)'
10000000
01000000

--
Jarkko Hietaniemi <jhi@iki.fi> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen

p5pRT · 2005-09-28T20:10:31Z

From @ysth

On Wed, Sep 28, 2005 at 09:01:12AM +0300, Jarkko Hietaniemi wrote:

Steve Peters via RT wrote:

This thread sort of went off on a tangent. What should the expected
results be?

Well, ASSUMING that there will be in future a way to make shifting of
bitvecs to work as, well, shifting of bitvecs, instead of the shift ops
assuming their arguments are numbers (which must be kept as the default
way of doing things because of hysterical raisins)... assuming this new
pragma is called "bitvec":

perl -Mbitvec -wle '$b = "";
vec($b, 0, 1) = 1;
print unpack("b*",$b);
$b<<=1;
print unpack("b*", $b)'
10000000
01000000

I'd rather see a pragma that allowed changing how &,|, etc. work also.
Maybe:
use bitvec auto => -shiftops;

to allow shifting as bitvecs if they've never been used in numeric context

use bitvec string => -shiftops;

to have the shift ops always shift as bitvecs, and

use bitvec numeric => -shiftops;

being the default, making the ops always assume numbers.

It would also allow affecting the bitwise ops (~, &, |, ^) which
would default to:

use bitvec auto => -bitwise;

And
use bitvec default => -all;

would restore the defaults.

I know I'd use

use bitvec string => -all;

often in limited lexical scopes.

p5pRT · 2008-11-14T01:30:15Z

From @chipdude

How about adding leftshift() and rightshift() as functions in a standard
bitvec.pm, rather than fiddling with the meaning of >> and << ?

p5pRT · 2008-11-17T08:36:25Z

From @rgs

2008/11/14 Chip Salzenberg via RT <perlbug-comment@perl.org>:

How about adding leftshift() and rightshift() as functions in a standard
bitvec.pm, rather than fiddling with the meaning of >> and << ?

Except the obligatory bikeshedding session on the new module name,
(which I like, by the way), I think that's a good idea.

p5pRT · 2008-11-17T09:08:59Z

From @chipdude

On Mon, Nov 17, 2008 at 09:36:07AM +0100, Rafael Garcia-Suarez wrote:

2008/11/14 Chip Salzenberg via RT <perlbug-comment@perl.org>:

How about adding leftshift() and rightshift() as functions in a standard
bitvec.pm, rather than fiddling with the meaning of >> and << ?

Except the obligatory bikeshedding session on the new module name,
(which I like, by the way), I think that's a good idea.

I'm having a hard time deciding which way is "left". The convention for
numbers holds the low bit at the right, but the convention for bit vectors
in Perl, as strings, holds the low bit at the left. Perhaps we should call
these functions "insert_low_bits" and "remove_low_bits". Awkward, tho.
--
Chip Salzenberg <chip@pobox.com>

p5pRT · 2008-11-17T11:22:22Z

From @chipdude

On Mon, Nov 17, 2008 at 09:36:07AM +0100, Rafael Garcia-Suarez wrote:

2008/11/14 Chip Salzenberg via RT <perlbug-comment@perl.org>:

How about adding leftshift() and rightshift() as functions in a standard
bitvec.pm, rather than fiddling with the meaning of >> and << ?

Except the obligatory bikeshedding session on the new module name,
(which I like, by the way), I think that's a good idea.

Well ... does it count as bikeshedding if it's your own module? Here's a
first cut at the 'vec' module. Please don't commit it just yet, it needs
review. So ... review, please? (including the module name, I suppose)

=item insert_low_bits STRING, COUNT

Accept a bitvector STRING, a la L<vec>, and an integral bit COUNT. Return a
new bitvector that is a copy of the original STRING but with COUNT zero bits
inserted at the low end of the vector; that is, at the front of the string.
COUNT must be nonnegative.

=item remove_low_bits STRING, COUNT

Accept a bitvector STRING, a la L<vec>, and an integral bit COUNT. Return a
new bitvector that is a copy of the original STRING but with COUNT bits
removed from the low end of the vector; that is, from the front of the
string.
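The documented behaviour can also be modelled in pure Perl, which may be handy for cross-checking the XS code. This is a sketch and not the patch's implementation: the `_model` function names are hypothetical, and the rule for when a trailing byte is kept or dropped is inferred from the expectations in t/vec.t.

```perl
use strict;
use warnings;

# Model of insert_low_bits: prepend COUNT zero bits at the low end.
sub insert_low_bits_model {
    my ($vec, $count) = @_;
    die "invalid shift count\n" if $count < 0;
    my $len  = length $vec;
    my $bits = $count % 8;
    # Keep a spill byte only if the input is empty or high bits overflow.
    my $extra = $bits && (!$len || (ord(substr($vec, -1)) >> (8 - $bits))) ? 1 : 0;
    my $out = pack("b*", ("0" x $count) . unpack("b*", $vec));
    return substr($out, 0, $len + int($count / 8) + $extra);
}

# Model of remove_low_bits: drop COUNT bits from the low end.
sub remove_low_bits_model {
    my ($vec, $count) = @_;
    die "invalid shift count\n" if $count < 0;
    my $len  = length $vec;
    my $bits = $count % 8;
    # Drop the last byte if nothing of it survives the shift.
    my $extra = $bits && $len && !(ord(substr($vec, -1)) >> $bits) ? 1 : 0;
    my $keep = $len - int($count / 8) - $extra;
    return "" if $keep <= 0;
    my $all = unpack("b*", $vec);
    return substr(pack("b*", substr($all, $count)), 0, $keep);
}

print unpack("H*", insert_low_bits_model("\x01", 7)), "\n";
print unpack("H*", remove_low_bits_model("\x00\x80", 15)), "\n";
```

Going through unpack("b*") and pack("b*") is slow for long vectors, so this is only meant as a readable reference, not as a replacement for the XS version.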

Inline Patch

diff --git a/ext/vec/Makefile.PL b/ext/vec/Makefile.PL
new file mode 100644
index 0000000..ff8910a
--- /dev/null
+++ b/ext/vec/Makefile.PL
@@ -0,0 +1,7 @@
+use ExtUtils::MakeMaker;
+
+WriteMakefile(
+    VERSION_FROM    => "vec.pm",
+    NAME            => "vec",
+    OPTIMIZE        => '-g',
+);
diff --git a/ext/vec/t/vec.t b/ext/vec/t/vec.t
new file mode 100644
index 0000000..57a21b2
--- /dev/null
+++ b/ext/vec/t/vec.t
@@ -0,0 +1,30 @@
+#!./perl
+
+BEGIN {
+    unless (-d 'blib') {
+	chdir 't' if -d 't';
+	@INC = '../lib';
+    }
+}
+
+use Test::More tests => 15;
+use vec;
+
+ok(vec::->VERSION);
+
+ok(insert_low_bits("",     1) eq "\x00"    );
+ok(insert_low_bits("",     8) eq "\x00"    );
+ok(insert_low_bits("",     9) eq "\x00\x00");
+ok(insert_low_bits("\x01", 0) eq "\x01"    );
+ok(insert_low_bits("\x01", 1) eq "\x02"    );
+ok(insert_low_bits("\x01", 7) eq "\x80"    );
+ok(insert_low_bits("\x01", 8) eq "\x00\x01");
+
+ok(remove_low_bits("",          1) eq ""    );
+ok(remove_low_bits("\x01",      0) eq "\x01");
+ok(remove_low_bits("\x01",      1) eq ""    );
+ok(remove_low_bits("\x00\x01",  8) eq "\x01");
+ok(remove_low_bits("\x00\x01",  9) eq ""    );
+ok(remove_low_bits("\x80",      7) eq "\x01");
+ok(remove_low_bits("\x00\x80", 15) eq "\x01");
+
diff --git a/ext/vec/vec.pm b/ext/vec/vec.pm
new file mode 100644
index 0000000..1f6e06c
--- /dev/null
+++ b/ext/vec/vec.pm
@@ -0,0 +1,69 @@
+# vec.pm
+#
+# Copyright (c) 2008 Chip Salzenberg <chip@pobox.com>.  All rights reserved.
+# This program is free software; you can redistribute it and/or modify it
+# under the same terms as Perl itself.
+
+package vec;
+
+use strict;
+
+require Exporter;
+
+our $VERSION    = 0.01;
+our $XS_VERSION = $VERSION;
+our @ISA        = qw(Exporter);
+our @EXPORT     = qw(insert_low_bits remove_low_bits);
+our @EXPORT_OK  = @EXPORT;
+
+require XSLoader;
+XSLoader::load('vec', $XS_VERSION);
+
+1;
+
+__END__
+
+=head1 vec
+
+vec - Bit-vector functions not provided by the base language
+
+=head1 SYNOPSIS
+
+    use vec qw(insert_low_bits remove_low_bits);
+    use vec;  # same as above
+
+=head1 DESCRIPTION
+
+The C<vec> module provides some bit vector functionality that perhaps could
+have been part of the base language, but isn't, and for reasons of backward
+compatibility now cannot be.
+
+=over 4
+
+=item insert_low_bits STRING, COUNT
+
+Accept a bitvector STRING, a la L<vec>, and an integral bit COUNT.  Return a
+new bitvector that is a copy of the original STRING but with COUNT zero bits
+inserted at the low end of the vector; that is, at the front of the string.
+COUNT must be nonnegative.
+
+=item remove_low_bits STRING, COUNT
+
+Accept a bitvector STRING, a la L<vec>, and an integral bit COUNT.  Return a
+new bitvector that is a copy of the original STRING but with COUNT bits
+removed from the low end of the vector; that is, from the front of the
+string.
+
+=back
+
+=head1 SEE ALSO
+
+L<vec>
+
+=head1 COPYRIGHT
+
+Copyright (c) 2008 Chip Salzenberg <chip@pobox.com>. All rights reserved.
+This program is free software; you can redistribute it and/or modify it
+under the same terms as Perl itself.
+
+=cut
diff --git a/ext/vec/vec.xs b/ext/vec/vec.xs
new file mode 100644
index 0000000..41d45e8
--- /dev/null
+++ b/ext/vec/vec.xs
@@ -0,0 +1,83 @@
+/* Copyright (c) 2008 Graham Barr <chip@pobox.com>. All rights reserved.
+ * This program is free software; you can redistribute it and/or
+ * modify it under the same terms as Perl itself.
+ */
+
+#include <EXTERN.h>
+#include <perl.h>
+#include <XSUB.h>
+
+MODULE=vec	PACKAGE=vec
+
+PROTOTYPES: DISABLE
+
+SV *
+insert_low_bits(ssv, shift)
+	SV *		ssv
+	IV		shift
+    PREINIT:
+	size_t len;
+	const char * const s = SvPV_const(ssv, len);
+	UV ibytes, ibits, iextra;
+	char *d;
+    CODE:
+	if (shift < 0)
+	    croak("invalid left shift");
+	ibytes = shift >> 3;
+	ibits  = shift & 7;
+	iextra  = ibits && (!len || ((unsigned char)s[len - 1] >> (8 - ibits)));
+	RETVAL = newSV(len + ibytes + iextra + 1);
+	d = SvPVX(RETVAL);
+	Zero(d, ibytes, char);
+	d += ibytes;
+	if (!ibits) {
+	    Copy(s, d, len, char);
+	    d += len;
+	}
+	else {
+	    size_t i;
+	    *d++ = (unsigned char)s[0] << ibits;
+	    for (i = 1; i < len + iextra; ++i)
+		*d++ = ((unsigned char)s[i  ] <<      ibits ) |
+		       ((unsigned char)s[i-1] >> (8 - ibits));
+	}
+	*d = '\0';
+	SvCUR_set(RETVAL, d - SvPVX_const(RETVAL));
+	SvPOK_on(RETVAL);
+    OUTPUT:
+	RETVAL
+
+SV *
+remove_low_bits(ssv, shift)
+	SV *		ssv
+	IV		shift
+    PREINIT:
+	size_t len;
+	const char * const s = SvPV_const(ssv, len);
+	UV rbytes, rbits, rextra;
+	char *d;
+    CODE:
+	if (shift < 0)
+	    croak("invalid right shift");
+	rbytes = shift >> 3;
+	rbits  = shift & 7;
+	rextra = rbits && len && !((unsigned char)s[len - 1] >> rbits);
+	if (len <= rbytes + rextra)
+	    XSRETURN_PVN("", 0);
+	RETVAL = newSV(len - (rbytes + rextra) + 1);
+	d = SvPVX(RETVAL);
+	if (!rbits) {
+	    Copy(s + rbytes, d, len - rbytes, char);
+	    d += len - rbytes;
+	}
+	else {
+	    size_t i;
+	    for (i = rbytes; i < len - rextra; ++i)
+		*d++ = ((unsigned char)s[i  ] >>      rbits ) |
+		       ((unsigned char)s[i+1] << (8 - rbits));
+	}
+	*d = '\0';
+	SvCUR_set(RETVAL, d - SvPVX_const(RETVAL));
+	SvPOK_on(RETVAL);
+    OUTPUT:
+	RETVAL


--

Chip Salzenberg <chip@pobox.com>

p5pRT · 2008-11-17T11:44:03Z

From @rgs

2008/11/17 Chip Salzenberg <chip@pobox.com>:

On Mon, Nov 17, 2008 at 09:36:07AM +0100, Rafael Garcia-Suarez wrote:

2008/11/14 Chip Salzenberg via RT <perlbug-comment@perl.org>:

How about adding leftshift() and rightshift() as functions in a standard
bitvec.pm, rather than fiddling with the meaning of >> and << ?

Except the obligatory bikeshedding session on the new module name,
(which I like, by the way), I think that's a good idea.

Well ... does it count as bikeshedding if it's your own module? Here's a
first cut at the 'vec' module. Please don't commit it just yet, it needs
review. So ... review, please? (including the module name, I suppose)

I expect this one will need to be dual-lived. At which point occurs the
question, is it really needed in the core...

Some minor nits:

=item insert_low_bits STRING, COUNT

Accept a bitvector STRING, a la L<vec>, and an integral bit COUNT. Return a

à la L<perlfunc/vec>
(same link to be fixed in vec.pm)

new bitvector that is a copy of the original STRING but with COUNT zero bits
inserted at the low end of the vector; that is, at the front of the string.
COUNT must be nonnegative.

=item remove_low_bits STRING, COUNT

Accept a bitvector STRING, a la L<vec>, and an integral bit COUNT. Return a
new bitvector that is a copy of the original STRING but with COUNT bits
removed from the low end of the vector; that is, from the front of the
string.

What about "shift" and "unshift" instead of insert and remove ?

diff --git a/ext/vec/Makefile.PL b/ext/vec/Makefile.PL
new file mode 100644
index 0000000..ff8910a
--- /dev/null
+++ b/ext/vec/Makefile.PL
@@ -0,0 +1,7 @@
+use ExtUtils::MakeMaker;
+
+WriteMakefile(
+ VERSION_FROM => "vec.pm",
+ NAME => "vec",
+ OPTIMIZE => '-g',

Add to that C<MAN3PODS => {}> (to avoid converting the manpage needlessly)
if $ENV{PERL_CORE} is true.

diff --git a/ext/vec/t/vec.t b/ext/vec/t/vec.t
new file mode 100644
index 0000000..57a21b2
--- /dev/null
+++ b/ext/vec/t/vec.t

Maybe add some tests with strings flagged as utf8 ? Just to be sure it
won't break ?

diff --git a/ext/vec/vec.xs b/ext/vec/vec.xs
new file mode 100644
index 0000000..41d45e8
--- /dev/null
+++ b/ext/vec/vec.xs
@@ -0,0 +1,83 @@
+/* Copyright (c) 2008 Graham Barr <chip@pobox.com>. All rights reserved.

You're Graham Barr in disguise ? :)

p5pRT · 2008-11-17T12:02:42Z

From [email protected]

Quoth chip@pobox.com (Chip Salzenberg):

On Mon, Nov 17, 2008 at 09:36:07AM +0100, Rafael Garcia-Suarez wrote:

2008/11/14 Chip Salzenberg via RT <perlbug-comment@perl.org>:

How about adding leftshift() and rightshift() as functions in a standard
bitvec.pm, rather than fiddling with the meaning of >> and << ?

Except the obligatory bikeshedding session on the new module name,
(which I like, by the way), I think that's a good idea.

I'm having a hard time deciding which way is "left". The convention for
numbers holds the low bit at the right, but the convention for bit vectors
in Perl, as strings, holds the low bit at the left. Perhaps we should call
these functions "insert_low_bits" and "remove_low_bits". Awkward, tho.

"shiftdown" and "shiftup"? Although, given that it only works on
strings, I think 'left' and 'right' are pretty clear. Indeed, for
strings I think 'left' is clearer than 'low' to mean the (textually)
first bit in the string.

Ben

p5pRT · 2008-11-17T13:03:48Z

From @timbunce

On Mon, Nov 17, 2008 at 03:21:29AM -0800, Chip Salzenberg wrote:

On Mon, Nov 17, 2008 at 09:36:07AM +0100, Rafael Garcia-Suarez wrote:

2008/11/14 Chip Salzenberg via RT <perlbug-comment@perl.org>:

How about adding leftshift() and rightshift() as functions in a standard
bitvec.pm, rather than fiddling with the meaning of >> and << ?

Except the obligatory bikeshedding session on the new module name,
(which I like, by the way), I think that's a good idea.

Well ... does it count as bikeshedding if it's your own module? Here's a
first cut at the 'vec' module. Please don't commit it just yet, it needs
review. So ... review, please? (including the module name, I suppose)
=item insert_low_bits STRING, COUNT

The function name makes me think I can supply the bits that will be
inserted.

Instead of focussing on just these two shift operations I wonder if an
analogy with arrays of bits, or strings of bits, could be developed into a
more general interface. Like a splicebits() analogous with splice() or
subbits() analogous with substr().

splicebits VEC,OFFSET,LENGTH,LIST
splicebits VEC,OFFSET,LENGTH
splicebits VEC,OFFSET
splicebits VEC

subbits VEC,OFFSET,LENGTH,REPLACEMENT
subbits VEC,OFFSET,LENGTH
subbits VEC,OFFSET

With more specialized functions implemented, or at least specified, in
terms of the general one.
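One possible reading of the read-only form `subbits VEC,OFFSET,LENGTH` is sketched below in pure Perl. The semantics are an assumption, since the message only names the interface: offsets count bits in vec() order, negative offsets are not handled, and the replacement form is left out.

```perl
use strict;
use warnings;

# Hypothetical subbits(): extract LENGTH bits starting at bit OFFSET,
# by analogy with substr(). Bit 0 is the low bit of the first byte.
sub subbits {
    my ($vec, $offset, $length) = @_;
    my $bits = unpack("b*", $vec);
    return "" if $offset >= length $bits;
    my $slice = defined $length
        ? substr($bits, $offset, $length)
        : substr($bits, $offset);
    return pack("b*", $slice);       # a partial last byte is zero-padded
}

my $picked = subbits("\x0f\x01", 4, 8);
print unpack("b*", $picked), "\n";
```

With a building block of this shape, removing low bits is just `subbits($vec, $count)`, which is in the spirit of specifying the specialized functions in terms of the general one.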

Just a thought.

Tim.

p5pRT · 2008-11-17T13:24:17Z

From [email protected]

Instead of focussing on just these two shift operations I wonder if an
analogy with arrays of bits, or strings of bits, could be developed into a
more general interface. Like a splicebits() analogous with splice() or
subbits() analogous with substr().

With more specialized functions implemented, or at least specified, in
terms of the general one.

Just a thought.

Tim.

I concur. When I was writing Scalar::Vec::Util, I felt like the atomic
operation for vec strings really was the copy of bits, from an arbitrary
position to another. Shifting, unshifting, splicing and such then seem to
be more or less compositions of those.

Vincent.

p5pRT · 2008-11-17T13:31:33Z

From @nwc10

On Mon, Nov 17, 2008 at 12:43:39PM +0100, Rafael Garcia-Suarez wrote:

I expect this one will need to be dual-lived. At which point occurs the
question, is it really needed in the core...

Yes, this was my thought too. If it exists, and it's not the default, and you
know that you need to use it, what is wrong with CPAN?

Nicholas Clark

p5pRT · 2008-11-17T17:04:50Z

From @chipdude

On Mon, Nov 17, 2008 at 02:27:17PM +0100, Vincent Pit wrote:

When I was writing Scalar::Vec::Util, I felt like the atomic operation for
vec strings really was the copy of bits, from an arbitrary position to
another. Shifting, unshifting, splicing and such then seem to be more or
less compositions of those.

Your Scalar::Vec::Util::vcopy method ... does it work with overlapping
ranges? I ass_u_med not ... but if so, then I'll just mark the bug fixed
with a pointer to your module.
--
Chip Salzenberg <chip@pobox.com>

p5pRT · 2008-11-17T17:07:57Z

From @chipdude

On Mon, Nov 17, 2008 at 01:31:11PM +0000, Nicholas Clark wrote:

Yes, this was my thought too. If it exists, and it's not the default, and you
know that you need to use it, what is wrong with CPAN?

I agree. Since the original reporter was Jarkko, and he asked for a language
feature ... well, I figured I'd at least _start_ with a core module. But it
seems to me now that this is better handled via CPAN.
--
Chip Salzenberg <chip@pobox.com>

p5pRT · 2008-11-17T17:17:47Z

From @chipdude

On Mon, Nov 17, 2008 at 09:04:12AM -0800, Chip Salzenberg wrote:

On Mon, Nov 17, 2008 at 02:27:17PM +0100, Vincent Pit wrote:

When I was writing Scalar::Vec::Util, I felt like the atomic operation for
vec strings really was the copy of bits, from an arbitrary position to
another. Shifting, unshifting, splicing and such then seem to be more or
less compositions of those.

Your Scalar::Vec::Util::vcopy method ... does it work with overlapping
ranges? I ass_u_med not ... but if so, then I'll just mark the bug fixed
with a pointer to your module.

Silly me, RTFM:

vcopy $t, 10, $t, 20, 30; # Overlapping areas DWIM.

OK, I'm calling the bug closed. Thanks, Vincent.
--
Chip Salzenberg <chip@pobox.com>

p5pRT · 2008-11-17T17:45:45Z

@chipdude - Status changed from 'open' to 'resolved'

p5pRT closed this as completed Nov 17, 2008

p5pRT added Wishlist affects-5.6 labels Oct 18, 2019

p5pRT added distro-unknown type-core labels Oct 18, 2019
