RFC: bit fields and bit matching #29
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
- Start Date: 2014-04-04 | ||
- RFC PR #: (leave this empty) | ||
- Rust Issue #: (leave this empty) | ||
|
||
# Summary | ||
|
||
This is an RFC to provide better support for bit fields, including simpler notation and matching. | ||
|
||
# Motivation | ||
|
||
Bit fields are commonly used in embedded and systems software, where Rust should be a better option than C/C++. Working with bit fields is an error-prone process full of magic numbers. | ||
|
||
# Detailed design | ||
|
||
The first part of this RFC is the definition of bit access for integer types. For the sake of simplicity, only unsigned integer types (uint, u8, u16, u32, u64) are supported. | ||
|
||
The bit access operation is defined as: | ||
|
||
```rust | ||
let mut val: u32 = ...; | ||
let bits1 = val[4..5]; // equivalent to bits1 = (val >> 4) & 3 | ||
Review comment: Could you explain this syntax please? It's not clear to me from the examples.
Review comment: Consider
Review comment: I think it should be limited to strictly sized types, e.g. |
||
let bits2 = val[0,4..5]; // equivalent to bits2 = ((val >> 4) & 3) | (val & 1) | ||
Review comment: The previous line is fine, but this might be a bit too magical. It’s not obvious that
Review comment: Still, it would be useful to extract non-continuous bits. Maybe using |
||
|
||
val[2..7] = 10; // equivalent to val = (val & (0xffffffff ^ 0xfc)) | (10 << 2) | ||
val[0] = 3; // doesn't compile, as you can't fit 0b11 into one bit place | ||
Review comment: What if instead of a literal you have a variable or other expression whose value is not known at compile-time? Do too-big values trigger
Review comment: I don't think it's reasonable to have run-time support for this feature. It could be solved by something like this: |
||
``` | ||
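For comparison, the shift-and-mask arithmetic that the proposed syntax would hide can be sketched as two ordinary functions in today's Rust. This is only an illustration: `get_bits` and `set_bits` are hypothetical names that appear neither in the RFC nor in the standard library, ranges are assumed to be inclusive (so `4..5` in the RFC covers two bits), and the "value does not fit" case is reported at run time through `Option` rather than rejected at compile time as the RFC proposes.

```rust
/// Extracts bits `lo..=hi` of `val` (bit 0 is the least significant bit).
/// Hypothetical helper: `get_bits(val, 4, 5)` plays the role of `val[4..5]`.
fn get_bits(val: u32, lo: u32, hi: u32) -> u32 {
    assert!(lo <= hi && hi < 32, "bit range out of bounds");
    let width = hi - lo + 1;
    // Build the mask in u64 so that a full 32-bit range does not overflow.
    let mask = ((1u64 << width) - 1) as u32;
    (val >> lo) & mask
}

/// Returns `val` with bits `lo..=hi` replaced by `field`.
/// Returns `None` when `field` does not fit into the range, which is the
/// run-time analogue of the RFC rejecting `val[0] = 3`.
fn set_bits(val: u32, lo: u32, hi: u32, field: u32) -> Option<u32> {
    assert!(lo <= hi && hi < 32, "bit range out of bounds");
    let width = hi - lo + 1;
    let mask = ((1u64 << width) - 1) as u32;
    if field & !mask != 0 {
        return None; // `field` has bits set outside the target range
    }
    // Clear the target bits, then OR in the shifted field.
    Some((val & !(mask << lo)) | (field << lo))
}
```

Under these assumptions, `set_bits(val, 2, 7, 10)` computes the same value as the `(val & (0xffffffff ^ 0xfc)) | (10 << 2)` expression quoted in the RFC.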
|
||
The second part of this RFC is matching on bits. It is often necessary to perform different actions based on a few bits of an integer. Currently Rust requires a `_ => ...` arm at the end of such a match, because the arms cannot cover every possible integer value (even though they can cover every possible combination of the selected bits). The proposed solution, building on the bit access syntax above, is: | ||
|
||
```rust | ||
match val[4..5] { | ||
0b00 => ..., | ||
0b01 => ..., | ||
0b02 => ..., | ||
0b03 => ... | ||
Review comment: 2 and 3 are not valid binary digits. This would have to be |
||
} // no catch-all arm is provided; all variants must be covered | ||
``` | ||
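The limitation described above can be seen in a small sketch written in current Rust. The function name `describe_mode` and the returned strings are made up for illustration; the point is only that the masked value can never exceed 3, yet the match is checked against the full `u32` range and so still needs a catch-all arm.

```rust
/// Classifies bits 4 and 5 of `val`.
/// Hypothetical example; the names and strings are illustrative only.
fn describe_mode(val: u32) -> &'static str {
    match (val >> 4) & 0b11 {
        0b00 => "mode 0",
        0b01 => "mode 1",
        0b10 => "mode 2",
        0b11 => "mode 3",
        // The masked value is always in 0..=3, but exhaustiveness is checked
        // against every u32 value, so this arm cannot be omitted.
        _ => unreachable!(),
    }
}
```

With the matching proposed in this RFC, the last arm would be unnecessary because the scrutinee would be known to be two bits wide.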
|
||
# Alternatives | ||
|
||
Provide bit extraction macros that would implement the first part of this RFC. This doesn't solve the problem of the second part. | ||
Review comment: Is this possible to do with macros? If we can do it with macros, we don't need to add it to the language. I believe it could be done, there's nothing that really precludes it I don't think. I definitely think the match, and possibly the updating, could be done, though extraction might need to use methods, and it might not look as nice.
Review comment: I think it's possible to do things like
Review comment: I don't think it'd actually require that. It can just extract it to the smallest uint size that fits and compare against constants, with a fall-through match arm. When I say "macros", though, I mostly just mean "any syntax extension", which includes procedural ones (written in pure Rust).
Review comment: I would personally prefer a macro approach as well. I initially proposed arbitrary-width integers as a workaround due to some confusion about what constituted valid operations in a macro, and haven't been able to come up with a reasonable way to unify the required types with the existing ones in a pleasing way. While the implementation of this feature would be easier with such arbitrary-width integers, I think it comes with effects on the semantics of too many other language items. In the simplest case, we clutter the 'standard' namespace with a lot of new types that few users will need ( |
||
|
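As a rough idea of what the macro alternative could look like for extraction only, here is a minimal `macro_rules!` sketch. The macro name `bits!` and its argument form are assumptions made for this example, it handles only `u32`-sized values and literal bounds, and, as noted in the alternative itself, it does nothing for exhaustive matching.

```rust
/// Hypothetical `bits!` macro, not part of the RFC or the standard library.
/// `bits!(val, 4..5)` extracts bits 4 through 5 inclusive as a `u32`;
/// `bits!(val, 0)` extracts a single bit.
macro_rules! bits {
    ($val:expr, $lo:literal .. $hi:literal) => {{
        // Number of bits in the inclusive range; bounds must be literals.
        let width: u32 = $hi - $lo + 1;
        // Build the mask in u64 so that a 32-bit wide range does not overflow.
        let mask = ((1u64 << width) - 1) as u32;
        (($val as u32) >> $lo) & mask
    }};
    ($val:expr, $bit:literal) => {
        (($val as u32) >> $bit) & 1
    };
}
```

A call such as `bits!(val, 4..5)` then expands to the same `(val >> 4) & 3` computation that the RFC gives for `val[4..5]`.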
||
Erlang has even better bit matching: | ||
|
||
Review comment: Could you show what this would look like in Rust or explain it in words please? I don't understand the Erlang syntax.
Review comment: Well, this one is quite a complex example, actually, I got it from here. I think the Rust version would be something along the lines of:
let IP_VERSION = 4;
let IP_MIN_HDR_LEN = 5;
let DgramSize = byte_size(Dgram);
match Dgram {
    (
        ref IPVers @ [0..3],
        ref HLen @ [4..7],
        ref SrvcType @ [8..15],
        ref TotLen @ [16..31],
        ref ID @ [32..47],
        ref Flgs @ [48..50],
        ref FragOff @ [51..63],
        ref TTL @ [64..71],
        ref Proto @ [72..79],
        ref HdrChkSum @ [80..95],
        ref SrcIP @ [96..127],
        ref DestIP @ [128..159],
        ref RestDgram @ [160..]
    ) if IPVers == IP_VERSION && HLen >= 5 && HLen*4 <= DgramSize {
        // ...
    },
    _ => (),
}
Review comment: More formally, a bitstring in Erlang is of the form:
And a segment is:
Please let us Rustaceans have it :)
Review comment: @farcaller: I don't understand how to reference bit-aligned data. Also, how would you create and transmute bitstrings? I'm convinced that bit matching should use structs.
struct Dgram {
    ip_vers: Uint<4>,
    hlen: Uint<4>,
    srvc_type: u8,
    total_len: u16,
    id: u16,
    flgs: Uint<3>,
    frag_off: Uint<13>,
    ttl: u8,
    proto: u8,
    hdr_chksum: u16,
    src_ip: u32,
    dest_ip: u32,
}
static IP_VERSION = 4;
static IP_MIN_HDR_LEN = 5;
// in fn(dgram: Dgram, rest: Vec<u8>)
let size = size_of::<Dgram>();
match dgram {
    Dgram {
        ip_vers: IP_VERSION as Uint<4>,
        hlen: hlen,
        srvc_type: srvc_type, total_len: total_len,
        id: id, flgs: flgs, frag_off: frag_off,
        ttl: ttl, proto: proto, hdr_chksum: hdr_chksum,
        src_ip: src_ip, dest_ip: dest_ip,
    } if hlen >= 5 && hlen*4 <= size => {
        let opts_len = 4 * (hlen - IP_MIN_HDR_LEN);
        let (opts, data) = rest.split_at(opts_len);
        // ...
    },
    _ => (),
}
Review comment: I like how this struct looks, but it's getting close to bitfields of C/C++ that are often frowned upon. I guess the main reason is that byte order is not defined in those, so if we can have structs with explicit alignment and byte order, that would work.
Review comment: Struct fields with attributes are certainly possible. I propose the following syntax:
struct MyData {
    a: u8,
    #[align(4)] b: u8,
    align(16) little { // little endian
        c: int,
        d: uint,
    }
}
Review comment: That would still require support for arbitrary-sized ints, right? In cases of
Review comment: Yes, and they require support for static generic parameters in turn. Another problem is, would all fields have bit alignment by default? What would happen when an
Review comment: There's |
||
```erlang | ||
-define(IP_VERSION, 4). | ||
-define(IP_MIN_HDR_LEN, 5). | ||
|
||
DgramSize = byte_size(Dgram), | ||
case Dgram of | ||
<<?IP_VERSION:4, HLen:4, SrvcType:8, TotLen:16, | ||
ID:16, Flgs:3, FragOff:13, | ||
TTL:8, Proto:8, HdrChkSum:16, | ||
SrcIP:32, | ||
DestIP:32, RestDgram/binary>> when HLen>=5, 4*HLen=<DgramSize -> | ||
OptsLen = 4*(HLen - ?IP_MIN_HDR_LEN), | ||
<<Opts:OptsLen/binary,Data/binary>> = RestDgram, | ||
... | ||
end. | ||
``` | ||
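To make the comparison concrete, the following is a hedged sketch of how the same IPv4 header split might be written by hand in current Rust, using only byte indexing, shifts, and masks. The struct `Ipv4Header` and the function `parse_ipv4` are hypothetical names introduced here; the field names follow the Erlang example, and big-endian (network) byte order is assumed, which is also Erlang's default for integer segments.

```rust
/// Fixed 20-byte part of an IPv4 header (field names follow the Erlang example).
#[derive(Debug, PartialEq)]
struct Ipv4Header {
    version: u8,
    hlen: u8, // header length in 32-bit words
    srvc_type: u8,
    tot_len: u16,
    id: u16,
    flgs: u8,      // 3 bits
    frag_off: u16, // 13 bits
    ttl: u8,
    proto: u8,
    hdr_chksum: u16,
    src_ip: u32,
    dest_ip: u32,
}

const IP_VERSION: u8 = 4;
const IP_MIN_HDR_LEN: u8 = 5;

/// Splits a datagram into (header, options, data).
/// Returns `None` when the guard from the Erlang example fails.
fn parse_ipv4(dgram: &[u8]) -> Option<(Ipv4Header, &[u8], &[u8])> {
    if dgram.len() < 20 {
        return None;
    }
    let version = dgram[0] >> 4; // upper 4 bits of the first byte
    let hlen = dgram[0] & 0x0f; // lower 4 bits of the first byte
    let hdr_bytes = 4 * hlen as usize;
    // Same conditions as `when HLen>=5, 4*HLen=<DgramSize`, plus the version check.
    if version != IP_VERSION || hlen < IP_MIN_HDR_LEN || hdr_bytes > dgram.len() {
        return None;
    }
    // Flags (3 bits) and fragment offset (13 bits) share one 16-bit word.
    let flags_frag = u16::from_be_bytes([dgram[6], dgram[7]]);
    let header = Ipv4Header {
        version,
        hlen,
        srvc_type: dgram[1],
        tot_len: u16::from_be_bytes([dgram[2], dgram[3]]),
        id: u16::from_be_bytes([dgram[4], dgram[5]]),
        flgs: (flags_frag >> 13) as u8,
        frag_off: flags_frag & 0x1fff,
        ttl: dgram[8],
        proto: dgram[9],
        hdr_chksum: u16::from_be_bytes([dgram[10], dgram[11]]),
        src_ip: u32::from_be_bytes([dgram[12], dgram[13], dgram[14], dgram[15]]),
        dest_ip: u32::from_be_bytes([dgram[16], dgram[17], dgram[18], dgram[19]]),
    };
    // Options occupy 4 * (hlen - IP_MIN_HDR_LEN) bytes after the fixed header.
    let (opts, data) = dgram[20..].split_at(hdr_bytes - 20);
    Some((header, opts, data))
}
```

Every offset, shift, and mask in this version is a magic number that has to be kept in sync with the header layout by hand, which is the kind of code the Erlang pattern expresses declaratively.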
|
||
# Unresolved questions | ||
|
||
TBD |
Review comment: only fixed width unsigned integer types (u8, u16, u32, u64)