AST-based program generation #2
Comments
I have a WIP based on
To take this further, I think a better approach would be to use
Development strategy: instead of starting with everything (as above) and then fixing each node, I think a more productive approach would be to start with a minimal subset (e.g. integer arithmetic) and then gradually expand. Keeping everything well-behaved while adding a single node type should be tractable, and that means incremental development works again 😄
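As a rough illustration of the "minimal subset first" idea, here is a sketch that generates only integer literals and `+`, `-`, `*`, then unparses the tree. It is not the WIP mentioned above: it assumes the standard-library `ast` module (Python 3.9+ for `ast.unparse`) and `random` as stand-ins for the real tree library and for Hypothesis, and `BINARY_OPS`, `gen_expr` and `gen_source` are hypothetical names.

```python
import ast
import random

# Hypothetical registry of supported node types. Growing the subset
# means adding one entry here and checking that everything still holds.
BINARY_OPS = [ast.Add, ast.Sub, ast.Mult]


def gen_expr(rng, depth=0, max_depth=4):
    """Build a random expression from integer literals and + - * only."""
    if depth >= max_depth or rng.random() < 0.3:
        return ast.Constant(value=rng.randint(0, 100))
    return ast.BinOp(
        left=gen_expr(rng, depth + 1, max_depth),
        op=rng.choice(BINARY_OPS)(),
        right=gen_expr(rng, depth + 1, max_depth),
    )


def gen_source(seed):
    """Generate a tree for the given seed and unparse it to source code."""
    rng = random.Random(seed)
    tree = ast.Expression(body=gen_expr(rng))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


for seed in range(3):
    src = gen_source(seed)
    compile(src, "<generated>", "eval")  # raises SyntaxError if invalid
    print(src)
```

Division is deliberately left out of this subset: it would be the first node type that can fail at runtime (`ZeroDivisionError`), which is exactly the kind of issue that is easier to handle one node at a time.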
The simplest possible way to generate CST nodes is to start with
However, this approach uses default values for all node parameters that have defaults, which makes us much less likely to find any issues related to e.g. whitespace and parentheses. Supporting those makes us responsible for balancing parentheses though, which is non-trivial.
I'd also like to support swarm testing over the generated productions. This isn't too hard conceptually, but Hypothesis's FeatureFlags is not (yet) public API, and an ideal implementation would have some way of avoiding 'dead ends' where we generate something that the feature config doesn't allow us to complete.
Finally, there are many nodes dependent on other nodes, which makes registering strategies for them a little tricky. We therefore make very liberal use of the
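The swarm-testing idea could look something like the sketch below. Each test case first draws a random subset of enabled productions and then generates only from that subset. This does not use Hypothesis's FeatureFlags (which, as noted, is not public API); it assumes the standard-library `ast` and `random` modules instead, and `OPTIONAL`, `draw_flags`, `gen_expr` and `gen_case` are hypothetical names. Dead ends are avoided in the simplest way available: the terminal production is never disabled.

```python
import ast
import random

# Hypothetical names for the productions that a test case may switch off.
OPTIONAL = ("add", "sub", "mult", "neg")
BIN_OPS = {"add": ast.Add, "sub": ast.Sub, "mult": ast.Mult}


def draw_flags(rng):
    """Enable each optional production independently for this test case."""
    return {name for name in OPTIONAL if rng.random() < 0.5}


def gen_expr(rng, flags, depth=0, max_depth=4):
    # "int" is always available, so the generator can always finish,
    # whatever the feature configuration is: no dead ends.
    choices = ["int"]
    if depth < max_depth:
        choices += sorted(flags)
    kind = rng.choice(choices)
    if kind == "int":
        return ast.Constant(value=rng.randint(0, 9))
    if kind == "neg":
        operand = gen_expr(rng, flags, depth + 1, max_depth)
        return ast.UnaryOp(op=ast.USub(), operand=operand)
    return ast.BinOp(
        left=gen_expr(rng, flags, depth + 1, max_depth),
        op=BIN_OPS[kind](),
        right=gen_expr(rng, flags, depth + 1, max_depth),
    )


def gen_case(seed):
    """Return the enabled productions and the generated source code."""
    rng = random.Random(seed)
    flags = draw_flags(rng)
    return flags, ast.unparse(gen_expr(rng, flags))


for seed in range(3):
    print(gen_case(seed))
```

A grammar where productions depend on each other would need more than an always-on terminal, for example a check that every enabled production can reach some enabled terminal.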
This is basically working now - registering more specific strategies for individual node types would be a substantial performance boost, but I'm happy with the current design for now.
Grammar-based generation works, and gives us syntactically valid source code.
The next step is to get semantically valid source code! The clear best approach for this is to generate a syntax tree, and "unparse" it into source code. Based on experiments at the PyCon Australia sprints, the best AST to use is probably from lib2to3 - and that will give us the unparsing for free via black.
After that, I'd like to go to a concrete syntax tree where we draw formatting information at the same time as the node. This would massively improve our usefulness for black, but it's a lot of extra work.
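To make "draw formatting information at the same time as the node" concrete, here is a toy sketch. It is not lib2to3 or any real CST library: `Int`, `BinOp`, `gen_node` and the formatting fields are hypothetical, and `random` stands in for Hypothesis. Each node carries its own whitespace and redundant parentheses, and the renderer always emits one required pair of parentheses per operation so that balancing stays trivial.

```python
import ast
import random
from dataclasses import dataclass
from typing import Union


@dataclass
class Int:
    value: int

    def render(self, formatted=True):
        return str(self.value)


@dataclass
class BinOp:
    left: "Node"
    op: str
    right: "Node"
    # Formatting details stored on the node itself, as in a CST.
    ws_before: str = " "
    ws_after: str = " "
    extra_parens: int = 0

    def render(self, formatted=True):
        ws_b, ws_a, extra = " ", " ", 0
        if formatted:
            ws_b, ws_a, extra = self.ws_before, self.ws_after, self.extra_parens
        # One pair is always emitted, so nesting never relies on precedence.
        n = extra + 1
        left = self.left.render(formatted)
        right = self.right.render(formatted)
        return "(" * n + left + ws_b + self.op + ws_a + right + ")" * n


Node = Union[Int, BinOp]


def gen_node(rng, depth=0, max_depth=3):
    """Draw a node and its formatting in the same step."""
    if depth >= max_depth or rng.random() < 0.3:
        return Int(rng.randint(0, 9))
    return BinOp(
        left=gen_node(rng, depth + 1, max_depth),
        op=rng.choice("+-*"),
        right=gen_node(rng, depth + 1, max_depth),
        ws_before=rng.choice(["", " ", "  "]),
        ws_after=rng.choice(["", " ", "  "]),
        extra_parens=rng.randint(0, 2),
    )


node = gen_node(random.Random(0))
print(node.render())                 # with the drawn formatting
print(node.render(formatted=False))  # canonical formatting
```

The property a formatter test would then care about is that both renderings parse to the same AST, i.e. only the formatting differs between them.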