Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sign-extending signed integer types? #236

Closed
korrat opened this issue Sep 30, 2021 · 3 comments · Fixed by #238
Closed

Sign-extending signed integer types? #236

korrat opened this issue Sep 30, 2021 · 3 comments · Fixed by #238
Labels
enhancement New feature or request

Comments

@korrat
Copy link
Contributor

korrat commented Sep 30, 2021

Currently, deku deserializes signed integers with limited bit width like unsigned integers, as evidenced by the following (passing) test:

use deku::{bitvec::BitSlice, ctx::Size, DekuRead as _};

#[test]
fn sign_extending() {
    let bits = &BitSlice::from_slice(&[0x1Fu8]).unwrap()[3..];
    let (_rest, num) = i16::read(bits, Size::Bits(5)).unwrap();

    assert_eq!(num, 31);
}

I would expect num to be -1 instead, similar to how bit fields in C behave (Compiler Explorer example).

Is this something that could be supported? Perhaps using another context argument?

@sharksforarms sharksforarms added the enhancement New feature or request label Oct 5, 2021
@sharksforarms
Copy link
Owner

Essentially, we're reading from a byte. Do you have an example on how to get the result you're looking for in rust/bitvec? It may be possible!

https://github.com/sharksforarms/deku/blob/master/src/impls/primitive.rs#L88

@korrat
Copy link
Contributor Author

korrat commented Oct 7, 2021

You can use Rust's arithmetic right shifts for sign-extension:

fn sign_extend(num: i16, bits: u32) -> i16 {
    let shift = i16::BITS - bits;
    (num << shift) >> shift
}

Playground

This can be combined with the map attribute to achieve the desired behavior, but I think it would be more ergonomic, if deku supported the use case directly.

I can also open a PR for this. Just let me know if this is something you want to support and whether this should be the default for signed integers.

@myrrlyn
Copy link
Contributor

myrrlyn commented Oct 11, 2021

This is now the second time I have been asked for signed-integer support in BitField, and I think I have a way to make it work. If I do, I'll provide a patch release to the minor(s) that deku uses.

The behavior as written is valid as long as bits is in the range 1 ..= T::BITS; since bitvec already enforces this requirement by crashing on violation, it can be assumed to be true in your code.


Implementation notes: my experimental solution was to compute a sign mask, then conditionally apply it. Almost all calls into BitField use bit-slices with compile-time-constant lengths, and so that length information is available for const-propagation into values that do not have data dependencies. My suspicion was that the shl; sar sequence in korrat's example caused a data dependency whose opcodes would be slower to evaluate than a test and conditional or.

This comparison code:

pub fn shifty_sext(val: i32, bits: u32) -> i32 {
    check(bits);
    let shamt = 32 - bits;
    (val << shamt) >> shamt
}

pub fn masky_sext(mut val: i32, bits: u32) -> i32 {
    check(bits);
    let mask = !0 << bits;
    if val & (1 << (bits - 1)) != 0 {
        val |= mask;
    }
    val
}

pub fn shifty_sext_12(val: i32) -> i32 {
    shifty_sext(val, 12)
}

pub fn masky_sext_12(val: i32) -> i32 {
    masky_sext(val, 12)
}

fn check(bits: u32) {
    if bits == 0 || bits > 32 {
        panic!("invalid shamt");
    }
}

when compiled under -O produces this assembler:

example::shifty_sext:
        mov     ecx, esi
        lea     eax, [rcx - 1]
        cmp     eax, 32
        jae     .LBB7_1
        neg     cl
        shl     edi, cl
        sar     edi, cl
        mov     eax, edi
        ret
.LBB7_1:
        push    rax
        call    std::panicking::begin_panic
        ud2

example::masky_sext:
        lea     eax, [rsi - 1]
        cmp     eax, 32
        jae     .LBB8_1
        lea     eax, [rsi - 1]
        movzx   r8d, al
        mov     edx, -1
        mov     ecx, esi
        shl     edx, cl
        xor     eax, eax
        bt      edi, r8d
        cmovb   eax, edx
        or      eax, edi
        ret
.LBB8_1:
        push    rax
        call    std::panicking::begin_panic
        ud2

example::shifty_sext_12:
        mov     eax, edi
        shl     eax, 20
        sar     eax, 20
        ret

example::masky_sext_12:
        mov     eax, edi
        shl     eax, 20
        sar     eax, 31
        and     eax, -4096
        or      eax, edi
        ret

While the unpropagated mask signer does not have a data dependency, the version with const-propagated length information does; this metric is now a tie and the shift signer has a shorter instruction stream (and no large immediates), so it wins.


It is a bug in bitvec for unsigned loads to be incorrect, and I have a great deal of confidence that they are not. I am also fairly sure by inspection that your shl; sar implementation is also correct under the invariants that bitvec imposes, so if you want to whip up a test case for signing two integers, one where the top loaded bit is 0 and one where it is 1, and demonstrate that they correctly sign-extend, that should suffice.

I believe that changing the BitField API to increase generality from M: IsUnsigned to M: IsInteger does not constitute an API breaking change, but I am not 100% confident in this. Since I am already working on a full crate rewrite for 1.0, which I hope to complete this calendar year, I would tentatively recommend placing this behavior in deku as soon as you are ready, rather than waiting for me to produce a bitvec release of any version that contains it.

Thanks korrat for the ping, and for tipping me over the edge into providing this support directly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants