-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature Request] Add bitfields to C structs #3898
Comments
The first notation could work, I don't like the second one. I want to propose a third option, which may look more like a generic type: lib C
struct Foo
foo : Int32(2)
bar : Int32(2)
end
struct Bar
foo, bar : Int32(2)
end
end Many data structures in networking and on-disk files use this to save some bytes here and there. And just like in C, this is not an every-day feature, but one which makes code having to do this so much more readable and safer. Even if none of our notation proposals are accepted, I'd like to see this feature. |
@Papierkorb I thought about doing it like that too, but I think there's too much ambiguity between that and actual generics. Especially since numbers are allowed in real generics (e.g. |
Which was exactly my point:
|
@Papierkorb Ah yeah, I see. Well I wouldn't mind either way, I just think it's kinda confusing. |
Using Int32(2) doesn't make much sense except from c compatibility. It should be BitInt(2). |
@konovod How would Neither of |
How about making this an attribute like |
@RX14 sounds good, but wouldn't the attribute be superfluous? lib C
@[PackedBits]
struct Foo
2 : foo, bar, baz : UInt32
end
end Wouldn't just using |
@SplittyDev no, I meant like this: struct Foo
@[BitSize(2)]
foo, bar, baz : UInt32
end |
@RX14 Oh, yeah I think that's a good idea. |
@SplittyDev well, BitInt isn't ideal, but why UInt32? What will change if it is UInt64 or UInt16? And what will change if it will be signed Int32 as in first post? I think the name should show that it is a special field, something like UInt(N), but Int is already used and tells nothing about N is bits, not bytes. |
@konovod There's a reason for that. Consider the following code: lib C
@[Packed]
struct Foo
@[BitSize(8)]
high, low : UInt16
end
end Now, both Without bit-fields you'd do something like that to get the same effect: struct Foo
def initialize(@value : UInt16)
end
def high
(@value >> 8) & 0xFF
end
def low
@value & 0xFF
end
end |
I know what bitfields are for, i used them. But in this example what is a point of naming |
I'm afraid I don't quite understand what you mean by that. That was only a very basic example though. Here's a more complex one: lib C
@[Packed]
struct Page
@[BitSize(1)] present, rw, user, accessed, dirty : UInt32
@[BitSize(7)] unused : UInt32
@[BitSize(20)] frame : UInt32
end
end Which equals the following C code: struct page {
uint32_t present : 1;
uint32_t rw : 1;
uint32_t user : 1;
uint32_t accessed : 1;
uint32_t dirty : 1;
uint32_t unused : 7;
uint32_t frame : 20;
} __attribute__((packed)); The size of the Page structure equals |
compare with
the size is still 32 bits = 4 bytes. What have changed for a compiler? Nothing. What |
@konovod By adding a generic type to be used with bit-fields you're basically creating an integer that could hold infinitely many bits. Code like that would break on a wide range of CPU models and architectures. Usually, the max size of a bit-field is limited by its type. |
@SplittyDev Well, that depends on implementation. It can e.g. be always 32 (64?) bits, or automatically widens to a minimal size that can hold given number of bits (and fails to compile if |
@konovod In that case you'd still have the problem that you couldn't work with the generic integer easily. What if a function expects an |
@SplittyDev such problem already exists - if something expects Int16 and you pass |
I guess we could fully embrace LLVM's arbitrary-sized integers with stuff like |
@konovod Crystal doesn't support implicit type casts. Why would you add a type that you'd have to convert to another type every time you wanna use it, instead of just using the actual type? I just don't see how that would be helpful, at all. |
@RX14 yeah that would be possible. Although I still don't get how one would benefit from doing it that way. |
@SplittyDev well i'm guessing that the underlying struct will be represented as such in the IR. Even on clang. |
@RX14 it probably will. But I think the compiler should handle this, and not the user. C does it that way for a reason, and I think that it would be better to be consistent with C there. |
@SplittyDev looks like not actually: https://godbolt.org/g/KVQTxJ |
@SplittyDev Behavior will be same as Int16\Int8\Float32 - you have to convert it if you want to pass it somewhere where another type is expected. Completely consistent with the rest of language. And even if you declare the field as UInt32 you will have to use explicit conversion to e.g. write there a 0 or 1 or do anything useful. |
@konovod I've updated the 'implementation details' section of the RFC. Only allowing unsigned integers seems sensible. @RX14, that's very interesting. I'm not very good at reading LLVM code, but it looks like this just allocates a chunk of memory big enough to hold the struct, and uses bitwise arithmetic to do the trick. |
@konovod actually, thinking about it, I don't think that allowing signed values would cause any issues. |
@SplittyDev after some thoughts i agree, they aren't such evil. |
I'm pretty sure it's possible to write a macro that allows bitfield access to a property. Something like: @[Extern]
struct Foo
bitfields @raw : Int32, [
hi: {Int32, 16},
low: {Int32, 16},
]
end The |
I was thinking the same as @asterite. We can move the struct Foo
@foo : Int32
def hi
@foo >> 4
end
def lo
@foo & 0xffff
end
end |
@asterite, @ysbaddaden Let's say I have a struct with a size of 128 bits. I'd need two UInt64 fields to represent that and the macro to support that would quickly become extremely complicated. Also, what about packing? Would the struct really look like a C struct in memory? Imho, a macro is only a temporary solution. Having support for that in the compiler would be a permanent solution and I'm sure that it's gonna be useful for a lot of low-level developers. |
@SplittyDev You can mark a struct with @[Extern]
@[Packed]
struct Foo
bitfields [
hi: {Int32, 16},
low: {Int32, 16},
]
end Here we don't specify the backing field, it can probably be generate by the macro (or we can pass a name for that). Then for the total size we can use I personally think that a general solution that doesn't need to modify the core language, if readable/usable enough, is always superior to modifying the language, specially for functionality that is very rarely needed. |
@asterite That would only work for getting and setting aligned bits though. struct {
uint32_t a : 3
uint32_t b : 3
uint32_t c : 2
} foo_t;
|
@SplittyDev I don't understand why |
Oh, I'm sorry @asterite I wrote that from my mobile and didn't pay enough attention. What I said is wrong. |
Is there something to do here? Do we define a macro in the stdlib or do leave this to a shard? |
That might be solved by using newly introduced (#6063) |
Here's something I wrote inspired by @asterite's suggestion above. It's not actually storing the data in compressed bytes like in C but allows for reading in bitfields and outputting them again with changed values. This is what I needed for my use case of reading in 416 bits with random sized values. I'd appreciate any useful suggestions. |
Anybody here opposed to closing this? It seems like the usecases can be solved through a shard, which we can consider to include into the standard library if it becomes really popular or otherwise necessary. |
Yes, let's close this. |
Summary
This is an RFC to add bit-fields to structures. This only applies to C structs (i.e. structs defined within a
lib
).Motivation
I'm writing an Operating System kernel in Crystal and had to resort to ugly bit twiddling to simulate bit-fields, which significantly reduces readability and maintainability.
I'm aware that this is a very specific use case, but I'm positive that the vast majority of Crystal users, especially those who are used to working with C code, could benefit from this.
Detailed design
Currently, C structs look somewhat like this:
Bit-fields could be added in a backward-compatible way by adding another colon after the type, followed by the size in bits. This syntax is currently illegal, and shouldn't be used in any existing codebase, so doing it that way would probably be safe. This approach would look somewhat like the following:
Another way would be to prepend the size in bits to the field, followed by a colon and the usual field definition. Since identifiers can't start with a number, this would also currently be considered illegal by the compiler, and also shouldn't break any existing codebase. That approach would look somewhat like the following:
I personally prefer the second syntax, because I think that it's more readable, especially when multiple fields are defined on the same line.
Update
@RX14 proposed the following syntax:
All of the aforementioned approaches would roughly translate to the following C code:
Implementation details
Bit-fields should only work with integer primitives for reliability.
The size of the struct should equal the sum of all bits, in bytes (as usual).
That's a little more complicated when using bit-flags and can be confusing if you're not familiar with that concept. The following is an example of how sizes should be calculated:
The text was updated successfully, but these errors were encountered: