CALP

C Abusive Abstract Language Parser

A wanna-be-LL/LR parser in C, featuring:

  • runtime language construction and parser compilation
  • macros for compile-time, grammar-like declarations
  • support for both left- and right-recursive rules, thanks to
  • abusive error resolution that tries to parse at whatever cost

The latter basically means this isn't a fast parser, but an incredibly permissive one. Lemme show what that means in practice, with this grammar:

Z: expr $
expr: ifelse
    | cmdinc

word: /\w+/

cmdinc: word /\s+/ cmdinc
      | word

ifelse: "if" word "then" cmdinc "else" cmdinc "fi"

With the grammar above, the input `if potato then say hi` parses, but not into what you might expect: the result is `Z(expr(cmdinc(if potato then say hi)))`.

What happened? Well, `word` can accept both `if` and `then`. The parser first tried the `Z->expr->ifelse` rule, which failed because the input has no `else` or `fi`, then backtracked and retried with `Z->expr->cmdinc`, which matched the whole input as a sequence of words.

In fact, with the ruleset as it stands, `ifelse` will never be produced: eager `cmdinc` parsing consumes the `then` and/or `else`, because both are valid matches for `word`.
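The behaviour described above can be reproduced with a tiny hand-written matcher. This is only a sketch and not CALP's API or algorithm: it assumes ordered choice with eager matching, assumes whitespace is skipped between the tokens of `ifelse`, and every name in it (`match_word`, `match_cmdinc`, `match_ifelse`, `parse`, `parse_kind`) is made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { PARSE_FAIL, PARSE_IFELSE, PARSE_CMDINC } parse_kind;

/* word: /\w+/ -- returns the number of bytes matched, 0 on failure. */
static size_t match_word(const char *s) {
    size_t n = 0;
    while (isalnum((unsigned char)s[n]) || s[n] == '_') n++;
    return n;
}

/* /\s+/ -- returns the number of whitespace bytes at the start of s. */
static size_t match_space(const char *s) {
    size_t n = 0;
    while (isspace((unsigned char)s[n])) n++;
    return n;
}

/* Literal terminal such as "if" -- returns its length, 0 on failure. */
static size_t match_lit(const char *s, const char *lit) {
    size_t n = strlen(lit);
    return strncmp(s, lit, n) == 0 ? n : 0;
}

/* cmdinc: word /\s+/ cmdinc | word -- eager: keeps eating words. */
static size_t match_cmdinc(const char *s) {
    size_t w = match_word(s);
    if (w == 0) return 0;
    size_t sp = match_space(s + w);
    if (sp > 0) {
        size_t rest = match_cmdinc(s + w + sp);
        if (rest > 0) return w + sp + rest;   /* first alternative */
    }
    return w;                                 /* fall back to a single word */
}

/* Advances *pos past n matched bytes and any whitespace after them. */
static int advance(const char *s, size_t *pos, size_t n) {
    if (n == 0) return 0;
    *pos += n;
    *pos += match_space(s + *pos);
    return 1;
}

/* ifelse: "if" word "then" cmdinc "else" cmdinc "fi" */
static size_t match_ifelse(const char *s) {
    size_t pos = 0;
    if (!advance(s, &pos, match_lit(s + pos, "if")))   return 0;
    if (!advance(s, &pos, match_word(s + pos)))        return 0;
    if (!advance(s, &pos, match_lit(s + pos, "then"))) return 0;
    if (!advance(s, &pos, match_cmdinc(s + pos)))      return 0;
    if (!advance(s, &pos, match_lit(s + pos, "else"))) return 0;
    if (!advance(s, &pos, match_cmdinc(s + pos)))      return 0;
    if (!advance(s, &pos, match_lit(s + pos, "fi")))   return 0;
    return pos;
}

/* Z: expr $   with   expr: ifelse | cmdinc */
static parse_kind parse(const char *input) {
    assert(input != NULL);
    size_t n = match_ifelse(input);
    if (n > 0 && input[n] == '\0') return PARSE_IFELSE;
    n = match_cmdinc(input);                  /* backtrack to the second alternative */
    if (n > 0 && input[n] == '\0') return PARSE_CMDINC;
    return PARSE_FAIL;
}
```

In this sketch, `parse("if potato then say hi")` returns `PARSE_CMDINC`, and so does `parse("if potato then say hi else say bye fi")`: the first `cmdinc` inside `ifelse` swallows `else say bye fi` as plain words, so the `"else"` literal never gets a chance to match.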

To mitigate issues like the one above, CALP provides two special features:

  • priority specification at symbol, rule, and group levels (higher priority branches are likelier to be tried first)
  • programmatic symbol-specific lexing/tokenization (each terminal symbol specifies its own lexer, which, given an input, decides how much of it to consume and which portion of the consumed input corresponds to the symbol)
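To make the second feature more concrete, here is a rough sketch of what a symbol-specific lexer for `word` could look like. The `lex_result` struct, the `lex_word` signature, and the keyword list are all assumptions made for illustration; CALP's actual lexer interface may look quite different. The idea is that the lexer reports both how much input it consumed (here, leading whitespace plus the word) and which slice of that is the symbol itself, and that it can refuse reserved words so `cmdinc` stops swallowing them.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lexer result; not CALP's real type. */
typedef struct {
    bool   ok;        /* did the symbol match at all? */
    size_t consumed;  /* bytes eaten from the input, whitespace included */
    size_t start;     /* offset of the symbol's text within the consumed bytes */
    size_t length;    /* length of the symbol's text */
} lex_result;

/* Returns true when the n bytes at s spell one of the grammar's keywords. */
static bool is_keyword(const char *s, size_t n) {
    static const char *const keywords[] = { "if", "then", "else", "fi" };
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strlen(keywords[i]) == n && strncmp(s, keywords[i], n) == 0)
            return true;
    }
    return false;
}

/* Lexer for `word`: skips leading whitespace, matches /\w+/, rejects keywords. */
static lex_result lex_word(const char *input) {
    lex_result r = { false, 0, 0, 0 };
    assert(input != NULL);
    size_t i = 0;
    while (isspace((unsigned char)input[i])) i++;            /* consumed, not part of the symbol */
    size_t j = i;
    while (isalnum((unsigned char)input[j]) || input[j] == '_') j++;
    if (j == i || is_keyword(input + i, j - i)) return r;   /* no word, or a reserved one */
    r.ok = true;
    r.consumed = j;
    r.start = i;
    r.length = j - i;
    return r;
}
```

With a lexer like this, `lex_word("then say")` fails, so a `cmdinc` built on it would stop before `then` or `else` and leave them for the `ifelse` rule.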

CALP still certainly has some 🐛 around!