bpo-40222: Mark exception table function in the dis module as private #95960

Closed
wants to merge 72 commits into from

Conversation

pablogsal
Member

  • Parsing VM initial structure
  • Hook things up so we can test it
  • Add debugging printf()s in anger; fix bugs
  • Implement actions
  • Support optional tokens
  • Support optional rules
  • Add vmreadme.md; OP_SUCCESS has an argument
  • Do optional items differently (with a postfix op)
  • Compute start/end line/col numbers; add some ideas to vmreadme.md
  • Tighten the code; add some speculation to vmreadme
  • Add OP_NOOP; add enums for rules & actions
  • Implement loops
  • Add a few more rules to the grammar
  • Drop debug printf()s, more flexibility in parse_string()
  • Add memoization, some debug niceties
  • Inline helper functions
  • Explain OP_OPTIONAL better
  • Skeleton of code generator
  • Simplify structure of OP_SUCCESS
  • Move opcodes around
  • Add a 'grammar' for operations
  • Move generated part of vm.h into vmparse.h
  • Clean skeleton of vm_generator
  • Better formatting of generated file; remove unneeded indentation
  • Add OP_LOOP_COLLECT_NONEMPTY -- used for a+
  • Expand description of root rules
  • Initial support for repeat_0
  • Fix name rules for repeat0 nodes
  • Eliminate OP_LOOP_START
  • Do fewer reallocs (at the cost of an extra int per frame)
  • Speculate how to implement a.b+
  • Make memo rule types distinct from token types
  • Fix small issues in vmreadme.md
  • Add generation of root rules (very coarsely)
  • Add enum for rule types (R_)
  • Generate actions (primitively)
  • Implement code generation for keywords
  • Refactor add_opcode to optionally accept a second argument oparg
  • Translate item names in actions; use the generated vmparse.h!
  • Fix mypy (in vm_generator)
  • Avoid name conflict for 'f'
  • Generate code for repeat1 loops
  • Implement delimited loops (b.a+)
  • Generate code for delimited loop
  • Implement soft keywords (hand-written and code generation) (#129)
  • Update generated vmparse.h
  • Fix code generation for if_stmt
  • Implement lookahead ops
  • Generate code for lookaheads (only one token supported!)
  • Implement left-recursion (with hand-coded vmparse.h)
  • Code generation for left-recursive rules
  • Allow specifying different grammars
  • Generate code for 'cut'
  • Support groups and optional in code generator
  • There's no need to special-case -> in actions
  • Treat TYPE_COMMENT as a token (since it is)
  • Generate code for Grammar/parser.gram
  • Group every opcode with its argument (#131)
  • Add vm target to pegen script to generate the vm parser (#130)
  • Selective memoization
  • Don't call is_memoized in OP_RETURN_LEFT_REC
  • Different way of doing left-recursion
  • Remove leftover conflict markers
  • Fix deps for vm.o
  • Fix includes for vm.c
  • Regenerated vmparse.h
  • bpo-40222: Mark exception table function in the dis module as private

gvanrossum and others added 30 commits May 26, 2020 10:35
Execute as "python -m pegen.vm_generator" in the Tools folder
lysnikolaou and others added 25 commits June 1, 2020 17:09
This is not complete, but I want to get to left-recursion before I fix
this, and I don't actually understand the code generator well enough
to know how to make it work for `&('foo'|'bar')`.
It definitely makes parsing xxl.py 5-10% slower. :-(
This was super easy.

Benchmark time for xxl.py is now around 2.030 seconds.
(But this is with a super simple grammar. We'll have to see what it will be with the real grammar.)
Also silence compiler warning about default case in call_action().
Also cleaned up the type of the second arg to add_opcode() -- it now
must always be a string, the one call that didn't pass a string now
calls str().
(The bad news: it's currently twice as slow as the 'new' parser.)
This reduces the number of is_memoized calls dramatically, to what it
is for the recursive-descent parser (+1 for the root).

However we still have 50% more calls to insert_memo.  This has to be
investigated later.
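
For readers unfamiliar with the counters mentioned above, here is a minimal sketch of selective memoization in a toy packrat-style parser. It is not the code from this branch: only the names `is_memoized` and `insert_memo` come from the commit message, while `ToyParser`, `atom`, `expr`, `run`, and the two-rule grammar are hypothetical and exist purely to show how restricting memoization to some rules changes the number of lookups and inserts.

```python
class ToyParser:
    """Illustrative parser; memoizes only the rules named in memoized_rules."""

    def __init__(self, tokens, memoized_rules):
        self.tokens = tokens
        self.memoized_rules = memoized_rules
        self.memo = {}      # (rule name, start position) -> end position or None
        self.lookups = 0    # number of is_memoized() calls
        self.inserts = 0    # number of insert_memo() calls

    def is_memoized(self, key):
        self.lookups += 1
        return key in self.memo

    def insert_memo(self, key, end):
        self.inserts += 1
        self.memo[key] = end

    def call(self, rule, pos):
        name = rule.__name__
        if name not in self.memoized_rules:
            return rule(self, pos)          # no memo traffic for this rule
        key = (name, pos)
        if self.is_memoized(key):
            return self.memo[key]
        end = rule(self, pos)
        self.insert_memo(key, end)          # failures (None) are cached too
        return end


def atom(p, pos):
    # atom: NAME (hypothetical rule)
    if pos < len(p.tokens) and p.tokens[pos].isalpha():
        return pos + 1
    return None


def expr(p, pos):
    # expr: atom '+' atom | atom (hypothetical rule)
    end = p.call(atom, pos)
    if end is not None and end < len(p.tokens) and p.tokens[end] == "+":
        rhs_end = p.call(atom, end + 1)
        if rhs_end is not None:
            return rhs_end
    return p.call(atom, pos)                # second alternative re-tries atom


def run(memoized_rules):
    p = ToyParser(["a"], memoized_rules)
    end = p.call(expr, 0)
    return end, p.lookups, p.inserts
```

Calling `run({"expr", "atom"})` and `run({"atom"})` parses the same input both ways; comparing the returned lookup and insert counts shows the kind of saving the commit message describes, though the real grammar and VM will behave differently.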
It's not any faster than before though.
@pablogsal pablogsal requested a review from lysnikolaou as a code owner August 13, 2022 18:12
@pablogsal pablogsal added the 3.11 only security fixes label Aug 13, 2022
@pablogsal pablogsal closed this Aug 13, 2022
@pablogsal
Member Author

Apologies for the noise :(

Labels
3.11 only security fixes awaiting core review
4 participants