Metaprogramming
Papers:
- A Survey of Metaprogramming Languages (2019)
- Taxonomy of the fundamental concepts of metaprogramming (2008)
Idea: Use "Lisp-like AST metaprogramming, but with Syntax".
- Scheme: `,foo` is `(unquote foo)`, `,@foo` is unquote-splicing, `` `(a b) `` is quasiquote
- Clojure: `~foo` is `(unquote foo)`
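
To make the three Lisp operators above concrete, here is a minimal Python sketch that models them on nested lists. The `Unquote` and `UnquoteSplicing` marker classes and the `quasiquote` function are hypothetical names invented for illustration; they are not from Scheme, Clojure, or any library.

```python
# Hypothetical markers standing in for ,foo and ,@foo inside a quoted template.
class Unquote:
    def __init__(self, name):
        self.name = name

class UnquoteSplicing:
    def __init__(self, name):
        self.name = name

def quasiquote(template, env):
    """Return a copy of template with unquoted holes filled in from env."""
    if isinstance(template, Unquote):
        return env[template.name]          # ,foo inserts one value
    if isinstance(template, list):
        out = []
        for item in template:
            if isinstance(item, UnquoteSplicing):
                out.extend(env[item.name])  # ,@foo splices a list in place
            else:
                out.append(quasiquote(item, env))
        return out
    return template                         # everything else stays quoted

# Roughly `(a ,x ,@ys b) with x = 1 and ys = (2 3)
result = quasiquote(['a', Unquote('x'), UnquoteSplicing('ys'), 'b'],
                    {'x': 1, 'ys': [2, 3]})
print(result)
```

The template is data until `quasiquote` runs, which is the property the "with syntax" languages below try to keep while offering a richer surface syntax.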
- Python-like languages with metaprogramming (TODO: transcribe examples)
  - Mython -- and the associated Basil framework. Some of this ended up in PyPy!
- Converge Guide to Compile-Time Metaprogramming -- uses `$<<>>` and `[| x |]`
- Julia --
  - `:(expr)` or `quote/end` for quotation
  - `$var` for interpolation
  - `macro sayhello() ... end` for macro definition
  - `@sayhello()` for macro invocation
  - http://docs.julialang.org/en/stable/manual/metaprogramming/
  - https://en.wikibooks.org/wiki/Introducing_Julia/Metaprogramming
  - LMS-like technique in Julia: A practical relational query compiler in 500 lines of code
- Elixir metaprogramming uses `quote/end` for quotation, and `unquote` for interpolation. (In fact the entire Elixir language appears to be done with AST metaprogramming, since it's on top of Erlang.)
  - `defmacro` for macros, arguments unevaluated
  - Monkey language macro system is based on Elixir's system:
    - https://interpreterbook.com/lost/
    - `quote`, `unquote`, `macro() { }`, and `mymacro()`
    - macro invocation isn't distinguished from function call -- walk the AST and use a Go dynamic type check for the `object.Macro` type
    - requires an AST walker because you have to do multiple walks:
      - search for `unquote()` within unevaluated AST subtrees
      - define macros
      - expand macros
    - difficulties
      - error handling (skipped over, exceptions would be useful here)
      - debugging
      - modifying token positions -- Lossless Syntax Tree Pattern
    - limitations:
      - what about lexical modification? like `c=n; echo -e "\$$c"`. Need `eval(string, ctx)` too?
      - statements vs. expressions: Currently, we only allow passing expressions to quote and unquote. One consequence of that is that we can't use a `return` statement or a `let` statement as an argument in a `quote()` call, for example. The parser won't let us, simply because arguments in call expression can only be of type `ast.Expression`.
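
The multiple walks listed above can be sketched in Python. This is not Monkey's Go implementation: the tuple-based AST shape, the node kinds (`'let'`, `'macro'`, `'call'`, `'quote'`, `'unquote'`, `'ident'`), and the function names are all assumptions made for illustration, and a dictionary lookup stands in for the Go type check on `object.Macro`.

```python
def modify(node, fn):
    """Generic AST walker: rebuild children first, then apply fn to the node."""
    if isinstance(node, tuple):
        node = tuple(modify(child, fn) for child in node)
    elif isinstance(node, list):
        node = [modify(child, fn) for child in node]
    return fn(node)

def is_node(node, kind):
    return isinstance(node, tuple) and len(node) > 0 and node[0] == kind

def define_macros(program):
    """Walk 1: pull macro definitions out of the program."""
    macros, rest = {}, []
    for stmt in program:
        if is_node(stmt, 'let') and is_node(stmt[2], 'macro'):
            _, name, (_, params, body) = stmt
            macros[name] = (params, body)
        else:
            rest.append(stmt)
    return macros, rest

def expand_macros(program, macros):
    """Walk 2: replace macro calls with their expanded bodies."""
    def expand(node):
        if is_node(node, 'call') and node[1] in macros:
            params, body = macros[node[1]]
            bindings = dict(zip(params, node[2]))  # args stay unevaluated ASTs
            assert is_node(body, 'quote'), 'sketch only handles quote(...) bodies'
            def fill(n):
                # Walk 3: search for unquote() inside the quoted subtree.
                if is_node(n, 'unquote') and is_node(n[1], 'ident'):
                    return bindings[n[1][1]]
                return n
            return modify(body[1], fill)
        return node
    return modify(program, expand)

# let unless = macro(cond, then) { quote(if (!unquote(cond)) { unquote(then) }) }
program = [
    ('let', 'unless', ('macro', ['cond', 'then'],
        ('quote', ('if', ('not', ('unquote', ('ident', 'cond'))),
                         ('unquote', ('ident', 'then')))))),
    ('call', 'unless', [('gt', ('num', 10), ('num', 5)),
                        ('call', 'puts', [('str', 'no')])]),
]
macros, rest = define_macros(program)
expanded = expand_macros(rest, macros)
print(expanded)
```

Note that a macro call and an ordinary call such as `puts` have the same node kind here, so the expander has to look the name up to tell them apart, which mirrors the point about invocation not being distinguished syntactically.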
- scalameta.org -- `q""` for quotation, `$var` for interpolation.
- R metaprogramming -- everything is quoted implicitly because it's lazily evaluated
  - `substitute`, `deparse`, `eval`, etc. Need to look at examples.
  - R is unique in that it has lazy evaluation? So everything can be metaprogrammed before using it?
  - macros are just functions? what about scope?
  - Non-standard evaluation by Hadley Wickham
  - Oil and the R Language
  - Programming with dplyr
    - quo() returns a quosure, which is a special type of formula
    - enquo() uses some dark magic to look at the argument, see what the user typed, and return that value as a quosure.
    - If you’re familiar with quote() and substitute() in base R, quo() is equivalent to quote() and enquo() is equivalent to substitute().
    - we quote the variable with quo(), then unquote it in the dplyr call with !!. Notice that we can unquote anywhere inside a complicated expression.
    - Use quos() to capture all the `...` as a list of formulas.
    - Use `!!!` instead of `!!` to splice the arguments into group_by().
    - Automatic quoting makes dplyr very convenient for interactive use. But if you want to program with dplyr, you need some way to refer to variables indirectly. The solution to this problem is quasiquotation, which allows you to evaluate directly inside an expression that is otherwise quoted.
    - The first important operation is the basic unquote, which comes in a functional form, `UQ()`, and as syntactic-sugar, `!!`
    - Unquote-splicing: its functional form is `UQS()` and the syntactic shortcut is `!!!`
    - The final unquote operation is setting argument names. You’ve seen one way to do that above, but you can also use the definition operator := instead of =. := supports unquoting on both the LHS and the RHS.
  - Tidy evaluation, most common actions
  - Non-standard evaluation, how tidy eval builds on base R
- C++ Proposal led by Herb Sutter: Metaprogramming in C++ https://www.youtube.com/watch?v=4AfRAVcThyA&t=1649s
  - syntax (this is a proposal, so syntax may change):
    - `constexpr { }` blocks for things that must be evaluated at compile time.
    - `-> { }` blocks for runtime code
    - $ syntax for compile time variables. For types only, not expressions or statements?
  - Example: getting string names from an enum. Very relevant to lexing/parsing! Many languages have a tiny code generator for tokens and AST nodes.
  - He is selling it pretty hard, saying "this is already what we do", "we're not turning C++ into Lisp", etc.
  - "constexpr all the things" -- e.g. STL algorithms and data types
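
For comparison, the "tiny code generator" approach to the enum-names problem might look like the following Python sketch. It emits C text rather than using the proposed C++ syntax; the token list, the `Tok` prefix, and the `gen_enum` function are hypothetical.

```python
TOKENS = ['Id', 'Number', 'LParen', 'RParen', 'Eof']  # hypothetical token names

def gen_enum(name, members):
    """Emit a C enum plus a function mapping each value to its string name."""
    lines = ['typedef enum {']
    lines += ['  %s_%s,' % (name, m) for m in members]
    lines += ['} %s;' % name, '']
    lines += ['const char* %s_str(%s v) {' % (name, name), '  switch (v) {']
    lines += ['  case %s_%s: return "%s";' % (name, m, m) for m in members]
    lines += ['  default: return "<invalid>";', '  }', '}']
    return '\n'.join(lines) + '\n'

code = gen_enum('Tok', TOKENS)
print(code)
```

The member list is written once and both the enum and its name table are derived from it, which is the duplication the compile-time proposal aims to remove inside the language itself.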
- Clang AST
  I saw a video where people asked why Clang source tools generate textual changes rather than AST changes... and this is a good example. People for some reason think that ASTs are "cleaner" or more usable, but they can be a pain.
  https://news.ycombinator.com/item?id=13630134
- C++ has `constexpr`
- Zig has `comptime` (I think the whole language can be evaluated at compile time?)
  - printf implementation is an example: https://ziglang.org/documentation/0.1.1/#case-study-printf
- Rust: ~2 types of macros and const contexts, which are like `expr`.
- Lua/Terra research language -- great paper
  - criticism of both design and implementation: https://erikmcclure.com/blog/a-rant-on-terra/
- Jai also has the ability to evaluate anything at compile time? It's not released but I think there was a YouTube demo of it.
- Spiral Language -- functional language for machine learning with "first-class staging"
Comments on "Outperforming everything with anything Python? Sure, why not?"
- before lexer -- code generation
- before parser -- not sure this exists? Generate tokens? Yes this is how the C preprocessor works! It has roughly the same lexer as C, but a different parser!
- before compiler -- AST metaprogramming.
- at runtime, after compiler -- reflection.
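
Two of the stages above can be shown with Python's standard `ast` module: rewriting the tree before it reaches the compiler, then reflecting on the result at runtime. This is a minimal sketch; the `FoldConstants` class and the `area` function are made up for the example, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast

class FoldConstants(ast.NodeTransformer):
    """Before the compiler: rewrite the AST, e.g. fold 2 * 3 into 6."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested expressions first
        operands = (node.left, node.right)
        if isinstance(node.op, ast.Mult) and all(
                isinstance(n, ast.Constant) and isinstance(n.value, (int, float))
                for n in operands):
            folded = ast.Constant(node.left.value * node.right.value)
            return ast.copy_location(folded, node)
        return node

# Before the lexer: plain text, which a code generator could have produced.
source = 'def area():\n    return 2 * 3\n'

tree = FoldConstants().visit(ast.parse(source))
tree = ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, '<generated>', 'exec'), namespace)

# At runtime, after the compiler: reflection on the live function object.
area = namespace['area']
print(area.__name__, area())
print(ast.unparse(tree))
```

The token-level stage (before the parser) has no counterpart in this sketch; Python's `tokenize` module would be the place to look for that.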
- Do template-like metaprogramming with auto-escaping? That means you need to lex languages rather than parse them? You can do this with HTML, but I'm not sure about other languages.
- Philosophy: Oil is about metaprogramming other languages (primarily), not metaprogramming itself!
- But we do need a syntax for lazy evaluation, for R-like expressions. (I don't think we need statements).
- I think this can just be quotation and interpolation. AST nodes can be opaque/immutable.
- syntax: `filter(df, \uri == uri)`, or `filter(df, \(uri == $$uri)`. `$$` or `%` could be interpolation.
- And we do have `eval()` -- for feature detection, at the very least.
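
The quotation-plus-interpolation idea above can be approximated in Python, with the caveat that Python has no quoting syntax, so the expression is passed as a string. Everything here is an assumption for illustration: the `quote` and `filter_rows` names, the use of `$$name` as the interpolation marker, and a list of dicts standing in for a data frame. It is not Oil's design.

```python
import ast
import re

def quote(expr_src, **interp):
    """Parse an expression without evaluating it; $$name is filled from interp."""
    # $$name becomes a literal, so it refers to the caller's value, not a column.
    src = re.sub(r'\$\$(\w+)', lambda m: repr(interp[m.group(1)]), expr_src)
    # The compiled code object plays the role of an opaque, immutable AST node.
    return compile(ast.parse(src, mode='eval'), '<quoted>', 'eval')

def filter_rows(rows, quoted):
    """Evaluate the quoted expression once per row, with columns as variables."""
    return [row for row in rows if eval(quoted, {'__builtins__': {}}, row)]

rows = [{'uri': '/a', 'hits': 3}, {'uri': '/b', 'hits': 7}]
uri = '/b'
matches = filter_rows(rows, quote('uri == $$uri', uri=uri))
busy = filter_rows(rows, quote('hits > 5'))
print(matches, busy)
```

In `'uri == $$uri'` the bare `uri` is looked up in each row, while `$$uri` was fixed when the expression was quoted, which is the distinction the two spellings in the syntax note are meant to express.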
- Lua/Terra -- one is dynamic and one is static.
- C preprocessor and C (same "lexical context")
- "C with classes" and C++ template metaprogramming -- two different languages
- new "Meta" proposal: C++ and C++
- Oil implementation: Python + C (and C++), via textual code generation.
- Python and TensorFlow -- both are dynamic?
- C++ and C++ template metaprogramming -- Eigen
- Scala?
- OCaml and MetaOCaml -- both are static? I think it's not possible to generate a program that doesn't pass the "normal" OCaml type checker?
- Python and RPython -- the hard-to-understand part is that Python is a metaprogramming language for RPython
- http://wordsandbuttons.online/outperforming_everything_with_anything.html
  - great link that compares C preprocessor and C, C++ template metaprogramming and C++, then Python + LLVM!
- based on Template Haskell
- lexical vs. syntactic (AST) macros
- lexical macros: you need an entirely new language!
- Lisp macro systems require the compiler to recognize macros as different than functions
- languages such as TemplateHaskell distinguish only the macro call itself. macros can be any function in the host language.
- Lisp's syntactic minimalism lends itself to metaprogramming
- Converge language
- similar to OPy -- statically analyzable namespaces! I think this is to distinguish compile-time vs. runtime variables.
- has offline compilation and linking step too!
- uses `func` keyword
- no global scope, just local
- section on scoping rules, for compile time and runtime variables
- nice section on error reporting! One of the most significant unresolved problems.
- use cases:
- conditional compilation
- runtime compilation of `printf` (similar to what Python's f-strings now do)
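
As a rough illustration of the `printf` use case, the Python sketch below parses a format string once and generates a specialized function from it, so the per-call work no longer involves scanning the format. This is not Converge code; `compile_printf` is a hypothetical name, and only `%d`, `%s`, and `%%` are handled.

```python
import re

def compile_printf(fmt):
    """Turn a format string into a specialized Python function (and its source)."""
    pieces, nargs = [], 0
    for part in re.split(r'(%[ds%])', fmt):
        if part == '%d':
            pieces.append('str(int(a%d))' % nargs)
            nargs += 1
        elif part == '%s':
            pieces.append('str(a%d)' % nargs)
            nargs += 1
        elif part == '%%':
            pieces.append(repr('%'))
        elif part:
            pieces.append(repr(part))  # literal text between directives
    params = ', '.join('a%d' % i for i in range(nargs))
    body = ' + '.join(pieces) or "''"
    src = 'def f(%s):\n    return %s\n' % (params, body)
    namespace = {}
    exec(src, namespace)  # the generated function takes exactly nargs arguments
    return namespace['f'], src

fmt_fn, src = compile_printf('%s is %d%% done\n')
print(src)
print(fmt_fn('job', 42), end='')
```

Because the argument count is baked into the generated signature, calling `fmt_fn` with the wrong number of arguments fails immediately, a small example of the checking that moves earlier when the format is processed ahead of time.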
- user experience: compile-time metaprogramming in its rawest form is not likely to be grasped by every potential developer
- language design implications
- must be able to determine names statically
- Compiler architecture is no longer linear? Didn't quite understand this part. There is quasi-quote mode and splicing mode.
- AST design
- heterogeneous vs. homogeneous -- he chooses heterogeneous, somewhat dismisses homogeneous
- ASTs should be immutable! Because of aliasing I guess. Python's are mutable.
- new extension: arbitrary DSLs compiling to Converge code! yes.
- DSL embedding in Converge, which describes DSL blocks.
- Converge Parsing Kit uses Earley Parsing, inspired by SPARK parser (used in original Python ASDL implementation)
- quite slow at 1000 lines/second, or 1 line per millisecond!
- src_info concept -- attributing errors to multiple locations
- alpha renaming for hygiene
- rewriting the tokenizer? You can use Converge's tokenizer, with a list of optional keywords, or you can provide your own tokenizer
- example: ORM! Translating SQL schemas to Converge type definitions (or the opposite?).
- didn't like his terminology of "heterogeneous" and "homogeneous", to mean 2-language vs. 1-language
- this was even different than a paper cited
- vtparse state machine
- Tree-sitter grammars -- e.g. C++ is defined in terms of C subset, and TypeScript is defined in terms of JavaScript subset