-
-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C Implementation #160
Comments
I'd absolutely love to take this on, the kind of implementation I would want to do by default would be something very similar to my json library. However, it's single-header, loads entire files at once, and uses a memory model which might be unfamiliar to some devs. Besides conforming to the specification, what kinds of features would you be looking for in a C implementation? Is a streaming parser a better idea? How would you want serialization to look? How much minimalism, performance, and flexibility should be sacrificed for abstraction? Are there json, xml, or yaml libraries for C that you really like the design of? Obviously I haven't written anything yet and the project would change as it went on of course, but I'd love to hear your thoughts! |
I'm not a C developer so I don't really know what would be best. My thoughts on C are that there are two main use cases:
The first seems like it would work really well with what you're proposing here. The latter might be better server by a very high performance streaming parser, so other languages can use C as a perf boost. I think there's room for both so you can do whichever you find more fun! What do you think? |
I'm more comfortable with the first use case, but writing something focused on performance and language interop sounds absolutely dope! I'll give it a shot and get back to you |
@garrisonhh one thing that might be of interest is https://github.com/lucretiel/kaydle, which has a Rust implementation optimized for streaming parsing. It might be too Rust-flavored to work for C, tho? |
The basic interface for language wrappers is a stack machine. SWIG bindings and FastFFI bindings are a little more complicated, but I took a quick look at the API for the library you linked and that style would be great Your JSON library's access and modification APIs could reasonably be copied here, but it would be even better to have an API informed by the query specification |
Cool, I don't really know rust but the structure gives me some ideas. I'm creating a low-level token stream and then going to build higher-level stuff on top of that. I would love to recreate a similar API to the json lib, but it's only really as convenient as it is due to the DOM structure - I'll see what I can do though. Using the query specification also makes a lot of sense. Thanks for the input! |
Hey guys, a couple questions about the spec:
I also think making type annotations mandatory could make some sense, if they can't be relied upon consistently across languages why use them outside of expressing intent? Outside of that, I'm starting to get somewhere with my implementation, I've arrived at a design that I like, and as far as performance goes it's gonna be pretty goddamn fast. If you are aware of any large-scale KDL files (like 50MB+) that I can use for testing I would really appreciate it! I can always do some python magic if none exist that you know of. |
|
Hello all. It seem garrisonhh and I took the up the challenge to write a C parser for KDL at around the same time. However I didn't see or comment in this thread, so didn't see I was duplicating efforts with garrisonhh. That said, I don't think there's any harm in having multiple implementations available. It sounds like we've got slightly different priorities anyway; I've aimed for maximum portability without worrying much about performance, while garrisonhh's solution sounds like it'd perform better. Mine's more or less "done" and passing a fairly thorough test suite, but is not yet battle hardened. Reviews would be appreciated: https://git.sr.ht/~exec64/freekdl One big difference I noticed with my implementation to the official test suite is that the suite allows for type annotations to be quoted strings, but the spec as I understand it (and my implementation) only permit bare identifiers to be type annotations. |
Very cool! The API looks clean at a glance and from what I can see in the test file. I think it's fascinating that we independently used similar terminology like "consume" and "expect". I think it would be awesome for critique if you also provided more examples of usage in the README or as source files! I'm definitely very zoned in on performance and memory usage so I don't think we're really stepping on each others toes. I'm pretty close to a state I'd feel comfortable sharing, so we can compare notes then! |
Thanks. The main thing I'd like to do is enhance test coverage a bit more (specifically on input that should fail to parse), and improve error message output. Right now error messages at least say where in the input it failed, and roughly what went wrong, but it could definitely be more informative. There's also a few code cleanups that could happen, as the design was developed fairly organically beyond knowing that I'd have a distinct tokeniser stage. The other main thing my implementation is lacking is convenience functions for numbers. I currently only expose them as strings, but should provide a few helper functions to get/set them as Lastly, I'd like to aim a fuzzer, valgrind, and other tooling at it to find any possible crashes or vulnerabilities. |
I thought this sounded like a fun week-end project... ... rather more than a weekend later, I have a library that appears to work (and passes the test suite here), but probably still handles some edge cases incorrectly. In a burst of creative naming, I called it ckdl - yeah, I know. https://github.com/tjol/ckdl I went a very different route to @eXeC64 in terms of API design: instead of reading the document into a C structure, ckdl provides a stream of high-level parse events much like SAX does for XML. The idea here is that while this may be inconvenient for C applications, it's great for wrapper libraries in other languages, because you get to load the data into your own structures pretty directly. Since the idea was to enable wrapper libraries in other languages, I went ahead and coded up minimal wrappers for Python and C++. |
@tjol this is really awesome! Please go ahead and PR this to this repo's README as a C implementation (and do the same for https://github.com/kdl-org/kdl-org.github.io/blob/main/src/_includes/partials/implementations.md)! I'm gonna close this issue for now, since we now actually have a C implementation, even if it'll be nice to have the struct-based one too :) |
oh and you probably wanna add it to Python and C++ too :) |
Some people seem to really want specifically a C implementation that can be fast and widely-used. So if you're interested, this is the issue to take on!
Me, I can't do C for shit so don't look at me, I can't do it :P
The text was updated successfully, but these errors were encountered: