VIP: Custom Parser #563
Comments
Please don't. Let's not turn this into another Solidity. :( |
We're already running into limitations of the underlying Python syntax, and future growth will require us to violate that syntax in subtle and not-so-subtle ways. Viper is definitely a different language from Python. We will try to stick to Python's syntax as closely as possible, but there are situations that we need a custom parser to handle. |
I don't think having a custom parser will negatively affect clarity. Given how different writing smart contract code is from writing Python code (particularly in terms of security), it would be helpful to customize certain parts of the parser. Here are a couple of examples where a custom parser could come in handy:
|
The danger of a custom parser is bugs. (There were 8 critical bugs in the Serpent compiler when Augur ordered an audit...) External contracts could be nicely specified if this suggestion was implemented. You could say that you can only inherit from Contract and ExternalContract and there can only be one Contract per file. The inheritance would basically do nothing. More descriptive keywords could be realized by the preprocessor that is mentioned in this proposal. For the mapping syntax, something like this should be possible: |
One potential solution is to copy the existing Python grammar, modify it to match the Viper language, then generate a parser from that grammar using a tool like ANTLR. This confers a number of benefits:
If there's interest, I'd be happy to build a small prototype. |
@mslipper 👍 I think this is the approach we were getting at. Thanks for suggesting a tool! Is there an easy way to integrate this with our Python flow (e.g. ANTLR wrapper module) so that the build process could be managed 100% in Python? ANTLR is a Java program, but I see some evidence that this is possible here |
Meeting Minutes:
|
@fubuloubu ANTLR is written in Java, but it'll generate a parser in any language it supports. The |
We discussed this, along with a few other things, on the call we had today. I think we're still a little reluctant to move fully to a custom solution. We were trying to figure out whether there is a way to modify or extend the AST module to get what we're looking for; to do that, we need a summary of the changes we are looking to make. From the original post above, these are (with examples):
my_contract: contract(
foo(),
bar() -> num,
)
my_map: map(basetype1 -> basetype2)
wei := num("wei")
fee: wei
We also chatted a bit today about #584, and I believe my proposed solution may be able to sidestep all of this by changing how types are handled a bit. I think most of our reasons for wanting a custom parser have more to do with being able to specify and easily work with different kinds of globals. Check out the bottom of that issue and feel free to add to the discussion. |
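To check which of the example forms above already fall outside Python's own grammar, one can feed them to the standard `ast` module. This is only a minimal sketch: the helper name `parses_as_python` is made up for illustration, and the snippets are the ones quoted in this comment.

```python
import ast


def parses_as_python(source: str) -> bool:
    """Return True if Python's own parser accepts the source text."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Snippets copied from the examples in this thread.
candidates = {
    "typed global": "fee: wei",
    "map type": "my_map: map(basetype1 -> basetype2)",
    "unit definition": 'wei := num("wei")',
    "contract signature": "my_contract: contract(\n    foo(),\n    bar() -> num,\n)",
}

results = {label: parses_as_python(src) for label, src in candidates.items()}
for label, ok in results.items():
    verdict = "accepted" if ok else "rejected"
    print(f"{label}: {verdict} by the Python parser")
```

Only the plain annotation form parses as Python; the `->` inside a call and the statement-level `:=` are rejected, so those forms would need either a pre-processing step or a custom grammar.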
I've started working on defining the grammar in ANTLR. I'm following a similar approach used to create a js-solidity parser. This will enable us to generate a parser to use in our Surya tool. So far I've just been extracting the grammar from documentation and examples, but if the vyper project itself might make use of it in the future, it would be better to ensure the names and structure of nodes are similarly defined. Would someone from the core team be willing to spend 30 minutes walking me through the parser code, or even collaborate on defining the grammar? |
@maurelian sure, glad to help - we can arrange a call time on gitter. This will be a good start to define a grammar: https://github.com/python/cpython/blob/master/Parser/Python.asdl |
One approach I've been wanting to take is a conversion step from the Python AST to a Vyper-specific AST. This can be defined in a friendly way for ANTLR or the K framework.
from vyper import ast
# Parses with the Python ast, then converts to the Vyper ast
print('Vyper AST:', ast.parse(code))
# Prints out the grammar, perhaps in an ANTLR/K-friendly format
print('Vyper AST grammar rules:', ast._grammer) |
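A rough sketch of what such a conversion step could look like, using only the standard library. The node classes `VyperGlobal` and `VyperFunction` and the function `to_vyper_ast` are hypothetical names chosen for illustration, not part of the vyper package, and the rules shown (only typed globals and functions allowed at the top level) are an assumption.

```python
import ast
from dataclasses import dataclass
from typing import List


@dataclass
class VyperGlobal:
    """Hypothetical Vyper-level node: a typed storage variable."""
    name: str
    type_name: str


@dataclass
class VyperFunction:
    """Hypothetical Vyper-level node: a contract function."""
    name: str
    args: List[str]


def to_vyper_ast(source: str) -> list:
    """Parse with Python's ast, then map each top-level node to a Vyper-style node."""
    nodes = []
    for py_node in ast.parse(source).body:
        if (
            isinstance(py_node, ast.AnnAssign)
            and isinstance(py_node.target, ast.Name)
            and isinstance(py_node.annotation, ast.Name)
        ):
            nodes.append(VyperGlobal(py_node.target.id, py_node.annotation.id))
        elif isinstance(py_node, ast.FunctionDef):
            nodes.append(VyperFunction(py_node.name, [a.arg for a in py_node.args.args]))
        else:
            # Anything else is valid Python but not valid in this restricted language.
            raise SyntaxError(
                f"line {py_node.lineno}: {type(py_node).__name__} is not allowed at contract level"
            )
    return nodes


code = "fee: wei\n\ndef set_fee(new_fee: wei):\n    self.fee = new_fee\n"
vyper_nodes = to_vyper_ast(code)
print("Vyper AST:", vyper_nodes)
```

Because the second AST is a closed set of node types, constructs that Python accepts but the contract language does not (an `import` statement, for example) are rejected during conversion rather than later in code generation.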
Arrived here from what I've been following in #300. Curious on what the status is for the pathway to implementing this. From previous calls/conversations was the consensus on moving forward with a solution built with Sly @fubuloubu? That's what it seemed from the convo with @charles-cooper on #300, which makes sense to me but was also curious if/why the parser generator route had been ruled out. |
@jakerockland we can discuss it for sure at the next meeting. If people want to take on this challenge, it may be the time to do it. A few things to note:
In regards to refactoring our current codebase, that is something @davesque was exploring. The current codebase mixes too many things from parsing into code generation. It would be nicer to see all compiler stages as separate modules with distinct interfaces between the stages, more akin to how you build compilers with functional languages like OCaml (which has an excellent set of libraries for that). I have a really, really, really old example of how that might be done here: https://github.com/fubuloubu/blocktract/blob/master/blocktract/ast.py The idea is that each stage would be formalized in separate modules e.g. |
I don't think a custom parser is a good idea at this point; we can plan this for the 0.2 release. At this stage there are plenty of other "not as flashy" issues to work on. Using the tokeniser for class isn't the most elegant solution, but it can be done without too much trouble. There is a reason we want to keep vyper parsable by the Python ast, and that is that it will always stay firmly rooted in Python. To me the codebase isn't ready for a custom parser (yet) and needs refactoring, after which one probably does not need the custom parser :P Happy to discuss further on the next call. |
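The tokeniser route mentioned above can be sketched with the standard library alone. This is an illustration under assumptions, not the project's actual pre-parser: the `contract` keyword, the `pre_parse` function, and the sample source are all hypothetical.

```python
import ast
import io
import tokenize

# Token types after which the next token starts a new statement.
# None stands for "no previous token", i.e. the start of the file.
LINE_START = {None, tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT}


def pre_parse(source: str) -> str:
    """Rewrite a leading `contract` keyword to `class` so Python's ast can parse it."""
    out = []
    prev_type = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        is_keyword = (
            tok.type == tokenize.NAME
            and tok.string == "contract"
            and prev_type in LINE_START
        )
        if is_keyword:
            # Keep the original positions so line numbers in later errors still match.
            tok = tok._replace(string="class")
        out.append(tok)
        prev_type = tok.type
    return tokenize.untokenize(out)


source = "contract Foo:\n    def bar() -> num: pass\n"
tree = ast.parse(pre_parse(source))
print(type(tree.body[0]).__name__, tree.body[0].name)
```

The rewrite only fires when `contract` is the first token of a statement, so an ordinary variable that happens to be named `contract` is left alone. The trade-off is that every new keyword needs another rewrite rule, which is the inelegance the comment above refers to.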
@fubuloubu @jacqueswww Thank you both for all the input here! Would be great to loop back on this on the next call but definitely doesn't have to be a deep dive as there are a lot of hotter button issues that need to be resolved. Was mostly just curious what the state of this issue was 😄 👍 |
@jakerockland I also had some thoughts about using a custom parser when I originally started looking at Vyper. However, I think I agree with @jacqueswww that there are higher priorities. Of course, I'm still learning a lot about the entire codebase so my opinion is tastier with salt. 😄 |
A resource for researching different parser generators and tools https://wiki.python.org/moin/LanguageParsing |
Of the options in the above link, the following seem reasonably modern/maintained, and also use grammars defined as some variant of EBNF (rather than python code): lark, pyparsing and tatsu have pre-written python grammar examples: |
We have been using https://github.com/erikrose/parsimonious for |
I'm kind of liking tatsu, it lets us create our VyperAST module by just annotating the grammar (https://tatsu.readthedocs.io/en/stable/mini-tutorial.html#object-models) and it has abstractions for ast traversal (https://tatsu.readthedocs.io/en/stable/mini-tutorial.html#one-rule-per-expression-type) and code generation (https://tatsu.readthedocs.io/en/stable/mini-tutorial.html#code-generation). Not sure how powerful the latter is but I can see its potential. |
Just noticed TatSu is a refactor of Grako, so it has a LOT more pedigree than the GitHub would lead you to believe! |
Also, to summarize a discussion we had, this PR will be split into a few "stages". The stages of this VIP that concern actually replacing the use of the AST module are beyond the scope of the v0.1 release, but the early stages will prepare for this to make it as seamless as possible. |
Closing in favour of #1363. |
Can we make #1363 a VIP then? Capture the important bits of this one? |
@fubuloubu We can, but we haven't really ever had to do a VIP for internal before? (or have we?) |
That's true! If it doesn't change syntax, then it's just a refactor. If there are any syntax changes, we should make sure to capture those separately as VIPs so people can stay informed. |
Preamble
VIP: 563
Title: Custom Parser
Author: @fubuloubu @DavidKnott @jacqueswww
Type: Standard
Status: Draft
Created: 2017-12-08
Simple Summary
Implement a custom parser for Viper that doesn't directly tie us to Python-only syntax, enabling a more focused grammar for our language.
Abstract
We've been discussing this for a while. A custom parser would allow us to define our syntax more precisely, rather than leveraging Python syntax and being limited to what it can provide. We will continue to use Python as a template due to its clarity and ease of reading, but we need to make decisions that diverge from Python, and a custom parser will enable that.
Motivation
There are specific things that have been discussed where this is necessary:
Specification
We may be able to leverage a Python-compatible lex/yacc library like ply. We should also leverage some of the work the k-framework guys are doing in order to infer a grammar that is consistent and free of formalized conflicts.
Backwards Compatibility
Try to maintain backwards compatibility initially; however, some of the VIPs this one will enable will be breaking changes in the syntax.
Copyright
Copyright and related rights waived via CC0