Rust support #4266
Thumbs up for adding support for Rust. |
100% in favour, but it's not going to come from me, given my limited time and experience with Rust. We're always open to contributions :) |
Memory leakage issues are of big concern and could crash the app/library. The memory-safe feature of Rust makes the language a good fit for this project. I suggest we fork the project and rewrite it in Rust. If it is a success, we can move to the new library. I can only volunteer a few hours a week. So we will need human resources because AI can't do it by itself yet. |
Maybe a better starting point would be to talk to the people at https://github.com/stepcode/stepcode. They have a narrower scope because they don't need to interface with the C++ geometry libraries. If that's a success, we can swap out the C++ parser module in IfcOpenShell with your Rust port. That would also make the most of your effort, because it can be applied more generally.

For me, the highest priority would be reducing indirections and memory overhead. Currently in IfcOpenShell we have, at the parser level, object storage using inheritance and virtual methods. On top of that, a schema layer is applied that coerces the tokens according to the schema. This is inefficient, because we have generic storage but can only access the data through the typed layer, which needs to do the coercion. It would be interesting if the parser could already be informed about the structure of the entity, so that at parse time it can store the data in an efficiently packed tuple without indirections or virtual calls. The schema language (EXPRESS) doesn't support aggregates with non-uniform elements, so the type information does not need to be stored for every element, which is what we currently do. If we can get:
I'm sure people in the community will be eager to adopt it and contribute. I think Rust also has some pretty good parser libraries. The P21 grammar that is used for the .ifc files is rather elegant. |
Hey, I came across https://github.com/ricosjp/ruststep and think it might be relevant for this discussion. They're building a Rust CAD kernel, "truck" (see https://github.com/ricosjp/truck). I haven't tried it myself, but if any of you are keen on learning Rust, this could be a way in? |
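To make the point about uniform aggregates concrete, here is a rough Rust sketch, not taken from IfcOpenShell or any existing crate. The names `Value`, `Aggregate` and `pack` are hypothetical. It contrasts generic storage, where every element carries its own type tag, with packed storage, where one tag covers the whole aggregate.

```rust
/// Generic storage: each element carries its own tag (the enum discriminant).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Real(f64),
    Str(String),
    List(Vec<Value>),
}

/// Packed storage: a single tag for the whole aggregate, elements stored raw.
#[derive(Debug, Clone, PartialEq)]
enum Aggregate {
    Ints(Vec<i64>),
    Reals(Vec<f64>),
    Strs(Vec<String>),
}

/// Packs a generic list when all elements share one scalar type.
/// Returns None for empty, mixed, or nested lists (not handled in this sketch).
fn pack(values: &[Value]) -> Option<Aggregate> {
    match values.first()? {
        Value::Int(_) => values
            .iter()
            .map(|v| if let Value::Int(i) = v { Some(*i) } else { None })
            .collect::<Option<Vec<i64>>>()
            .map(Aggregate::Ints),
        Value::Real(_) => values
            .iter()
            .map(|v| if let Value::Real(r) = v { Some(*r) } else { None })
            .collect::<Option<Vec<f64>>>()
            .map(Aggregate::Reals),
        Value::Str(_) => values
            .iter()
            .map(|v| if let Value::Str(s) = v { Some(s.clone()) } else { None })
            .collect::<Option<Vec<String>>>()
            .map(Aggregate::Strs),
        Value::List(_) => None,
    }
}
```

In this sketch, a packed `Aggregate::Reals` stores 8 bytes per element, whereas each `Value` is at least as large as its biggest variant. A schema-aware parser could presumably build the packed form directly instead of converting afterwards.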
Hmmm interesting. Right now I am trying to create the IfcOpenShell library in But most importantly, failing to find a suitable method to return the IFC element via Looking at how we can better do the latter. Any feedback or suggestion will be appreciated.

Update: Yes, it seems like |
Chiming in to register my interest as well.
such as deserializing face sets directly to USD prims? I see two primary components:
logos or nom are potential lower-level options for the first part. Ruststep appears to use Stepcode seems to be the logical tool for the second part. @TanJunKiat do you have a repo started already? Is ruststep complete enough to handle the parsing step? |
Sorry, but I haven't had the time to look in depth into And yes, I have started a repository to test some concepts. I will open the repository once I get it to the point where I think it's ready, hopefully some time this month. The current methodology that I am using is summarized as follows:

Trait on IFC entities
All entities will be given a

Manual parsing
Right now the parsing of the data is pretty manual, with slicing, getting each element, and storing them in the respective structs. It would definitely be helpful to have a dedicated serialiser/deserialiser for our purpose.

The biggest challenge for me right now is handling the large volume of IFC entities that need to be programmed (e.g. to parse), and I am still figuring out the best way to approach the crate in general (e.g. proper use of traits and generics). I will definitely be happy for the community to come together on this. With my limited knowledge of Rust and limited time to work on this, any help and advice is much appreciated. |
No problem and no need to apologize! I was just wanting to document current status for the benefit of the community to minimize duplication of effort.
Indeed, generating the code for rust structs for all entities across multiple schema versions is near-impossible to do by hand. Some combination of stepcode and potentially LLM seems to be a promising option. In the meantime you might consider writing a generic
I think this goes for most of us 😃. |
Yea, while tedious, I do find some incentive in writing the structs manually, such as embedding test cases for the different entities, as well as documentation from the web, directly into the codebase. We can definitely discuss and see which direction the community wants to go in. |
Another note is that since we are writing the library in Rust, it might be worthwhile to gather feedback on what are the common issues in the existing libraries and see if we can solve them with the |
Yes, this is the main design decision to make. In IfcOpenShell we have the schema types and data storage as separate things. I recently rewrote quite a lot of this. Data storage is a custom type called VariantArray. https://github.com/IfcOpenShell/IfcOpenShell/blob/v0.8.0/src/ifcparse/variantarray.h On top of that, the schema classes index into this array through functions with a typed return value. I imagined this would be the best of both worlds: not too many code paths in the parser module, and still type safety by means of the schema layer (which could potentially be inlined). This also allows storing invalid data types, which we actually need for our validator. |
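As a rough illustration of that split between generic storage and a typed schema layer, here is a minimal Rust sketch. It is not a port of VariantArray. `Value`, `Instance` and `DirectionView` are hypothetical names, and the attribute layout (one list of reals at index 0) is an assumption made for the example.

```rust
/// Generic attribute storage; it can hold values the schema would reject.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int(i64),
    Real(f64),
    Str(String),
    List(Vec<Value>),
}

/// Untyped instance: entity name plus positional attributes.
struct Instance {
    id: usize,
    entity: String,
    attrs: Vec<Value>,
}

/// Hypothetical schema-layer view: typed accessors that index into the storage.
struct DirectionView<'a>(&'a Instance);

impl<'a> DirectionView<'a> {
    /// Attribute 0 is assumed to be a list of reals; anything else is an error.
    fn direction_ratios(&self) -> Result<Vec<f64>, String> {
        match self.0.attrs.first() {
            Some(Value::List(items)) => items
                .iter()
                .map(|v| match v {
                    Value::Real(r) => Ok(*r),
                    other => Err(format!("expected REAL, found {:?}", other)),
                })
                .collect(),
            other => Err(format!("expected a list, found {:?}", other)),
        }
    }
}
```

Because type checks happen only in the view, an instance holding a string where a list is expected can still be stored and reported on later, which is the property a validator would need.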
Right now this is the approach that I am adopting for storing the entities, using just Instead of a generic struct, I'm defining the data as a

use log::debug;
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq, Clone)]
pub struct ParseError;

pub trait IfcDataOps {
    fn parse(s: &str) -> Result<Self, ParseError>
    where
        Self: Sized;
}

#[derive(Default)]
pub struct Model {
    data: HashMap<usize, Box<dyn IfcDataOps>>,
}

impl Model {
    /// Reads the file and stores every IFCDIRECTION line that parses.
    pub fn open(file_path: &str) -> Model {
        let mut model = Model::default();
        let contents = std::fs::read_to_string(file_path)
            .unwrap_or_else(|_| panic!("File path, {}, not available.", file_path));
        // `lines()` also strips a trailing '\r' from Windows line endings.
        for line in contents.lines() {
            if line.contains("=IFCDIRECTION(") {
                if let Ok(direction) = IfcDirection::parse(line) {
                    debug!("Parsed {:?}", direction);
                    model.data.insert(direction.id, Box::new(direction));
                }
            }
        }
        model
    }
}

#[derive(Debug)]
pub struct IfcDirection {
    id: usize,
    x: f32,
    y: f32,
    z: Option<f32>,
}

impl IfcDataOps for IfcDirection {
    fn parse(s: &str) -> Result<Self, ParseError> {
        // The id is the integer between the leading '#' and the '='.
        let id = s
            .get(1..s.find('=').ok_or(ParseError)?)
            .ok_or(ParseError)?
            .parse::<usize>()
            .map_err(|_| ParseError)?;
        // The coordinates sit between "((" and "));".
        let s = &s[s.find("((").ok_or(ParseError)? + 2..];
        let s = &s[..s.find("));").unwrap_or(s.len())];
        // Return an error instead of panicking on a malformed number.
        let coordinates = s
            .split(',')
            .map(|value| value.trim().parse::<f32>().map_err(|_| ParseError))
            .collect::<Result<Vec<f32>, ParseError>>()?;
        match coordinates.as_slice() {
            [x, y] => Ok(IfcDirection { id, x: *x, y: *y, z: None }),
            [x, y, z] => Ok(IfcDirection { id, x: *x, y: *y, z: Some(*z) }),
            _ => Err(ParseError),
        }
    }
}

While programming, I want to make good use of some of the functions in Rust, such as using |
Have you investigated some of the parsing libraries out there? The STEP grammar is not that challenging, and a little string manipulation goes a long way, but the problem is that newlines can exist anywhere in the file. And if you think about splitting on the semicolon, semicolons can also exist within strings, for example. Sorry to be that guy (I don't know Rust well enough to do it myself), but I had generative AI generate the following for me, just to contrast with another option:
|
No worries, appreciate the honesty. Yea, I agree that a more rigorous and robust parser is needed. So, hopefully, someone can help and look into that once we get started. Meanwhile I'll check if there are alternatives to improve the existing. Yea, I did run into similar issues with parsing entities with
It screws up the parser logic. |
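For reference, the newline and semicolon pitfalls mentioned above can be sidestepped even without a full grammar. This is a minimal sketch and not a replacement for a real P21 parser. `split_statements` is a hypothetical helper that assumes strings are delimited by apostrophes (with `''` as an embedded apostrophe), and it does not handle `/* ... */` comments.

```rust
/// Splits STEP text into statements on ';', ignoring semicolons inside
/// apostrophe-delimited strings and dropping line breaks outside strings.
/// A doubled apostrophe ('') toggles the flag twice, so it stays "in string".
/// Trailing text without a terminating ';' is discarded.
fn split_statements(input: &str) -> Vec<String> {
    let mut statements = Vec::new();
    let mut current = String::new();
    let mut in_string = false;

    for ch in input.chars() {
        match ch {
            '\'' => {
                in_string = !in_string;
                current.push(ch);
            }
            ';' if !in_string => {
                let stmt = current.trim();
                if !stmt.is_empty() {
                    statements.push(stmt.to_string());
                }
                current.clear();
            }
            // Line breaks outside strings carry no meaning here; skip them.
            '\n' | '\r' if !in_string => {}
            _ => current.push(ch),
        }
    }
    statements
}
```

Each returned statement is then a single logical line, so line-based logic such as the `IfcDirection` parsing can run on it unchanged. The escape rules in the standard would still need a proper tokenizer.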
Some findings for discussion. Using
|
| | regex | manual |
|---|---|---|
| Speed | ~400ms | ~300ms |
| Entities parsed (counted 87278) | 87152 | 87278 |
For now, I will stick with the manual method as it is the fastest method to get something working in hopes that we will attract more contributors and devise a better solution for parsing.
I would advise moving forward with a formal grammar as soon as possible. See https://www.steptools.com/stds/step/IS_final_p21e3.html#clause-5 for the definition. String splitting gets more and more unwieldy once you start to incorporate some of the string escaping rules. But I agree it's a good idea to assess performance, and maybe include some of the other libraries such as Secondly, a schema is actually only a nice-to-have. As a first starting point I would start with an untyped parser, with maybe the simplest API possible, something like:
That would try to coerce the value found in the AST into the type requested. That way you can already outline a minimal example application without any schema knowledge embedded into the program. |
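One way such an untyped, coercing accessor could look in Rust is sketched below. This is an assumption about the shape of the API, not the snippet from the comment above. `Value`, `FromValue`, `Instance` and `get` are hypothetical names, and letting an integer coerce to `f64` is a design choice made for the example.

```rust
/// Untyped AST value as produced by a schema-agnostic parser.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int(i64),
    Real(f64),
    Str(String),
    Ref(usize),
    List(Vec<Value>),
}

/// Coercion from an AST value into a requested Rust type.
trait FromValue: Sized {
    fn from_value(v: &Value) -> Option<Self>;
}

impl FromValue for i64 {
    fn from_value(v: &Value) -> Option<Self> {
        if let Value::Int(i) = v { Some(*i) } else { None }
    }
}

impl FromValue for f64 {
    fn from_value(v: &Value) -> Option<Self> {
        match v {
            Value::Real(r) => Some(*r),
            Value::Int(i) => Some(*i as f64), // assumed lenient coercion
            _ => None,
        }
    }
}

impl FromValue for String {
    fn from_value(v: &Value) -> Option<Self> {
        if let Value::Str(s) = v { Some(s.clone()) } else { None }
    }
}

impl<T: FromValue> FromValue for Vec<T> {
    fn from_value(v: &Value) -> Option<Self> {
        match v {
            // Fails as a whole if any element cannot be coerced.
            Value::List(items) => items.iter().map(T::from_value).collect(),
            _ => None,
        }
    }
}

/// Parsed instance with positional attributes and no schema knowledge.
struct Instance {
    entity: String,
    attrs: Vec<Value>,
}

impl Instance {
    /// Returns attribute `index` coerced to `T`, or None if absent or mismatched.
    fn get<T: FromValue>(&self, index: usize) -> Option<T> {
        self.attrs.get(index).and_then(T::from_value)
    }
}
```

With this shape, a caller would write something like `inst.get::<Vec<f64>>(0)` and handle `None` for a missing or mismatched attribute. A schema layer could later be generated on top of the same accessor.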
Hi everyone. Thanks for all the active support and discussions. I feel like we can really start pushing this and see if we can get it to a level where the community can come in to contribute. I have taken the liberty to start a new GitHub organisation for now, ifcopenshell-rust, with the main repository being here. I am not sure how to navigate this, but maybe eventually we can merge it to this organisation when the project matures. But I believe having a proper channel for discussion and documentation will help push the project forward. I have also created two discussions, which are the two most important tasks to iron out:
Also, I would like to extend an invitation to keen members to take on shared responsibility for managing and overseeing the project. We could definitely get some help in terms of GitHub and project management. Please reach out to me if you are interested. |
Thanks for taking the initiative to set up a working location. On license you might consider LGPL or GPL for future potential compatibility with the overall IfcOpenShell project - but IANAL. Also it will be a few days minimum before I can review or provide meaningful input. |
Yes, we definitely need to set up a committee or at least invite a few of the current IfcOpenShell developers to comment and advise on the licensing. Definitely not a decision to be made lightly or by one single individual. But sure, we can stick to LGPL for now. |
Just one thing to put out there. IFC5 appears to be rather fundamentally different. https://github.com/buildingSMART/IFC5-development We could also prioritize support for IFC5. It requires a simpler stack because it's not a bespoke grammar, just JSON. The timing may also be good: it's new, and its integration into IfcOpenShell might be more complementary because we don't have anything for IFC5 yet. |
With the rise of the Rust Programming Language, is it possible for a native Rust IfcOpenShell API?