Parsing Code with Syntax Errors #310

Open
RyannDaGreat opened this issue Jun 9, 2020 · 2 comments
Labels
enhancement (New feature or request), parsing (Converting source code into CST nodes)

Comments

@RyannDaGreat

Is it possible to parse code that has syntax errors in part of the code? I'd like to refactor code that might still be incomplete (and thus might have syntax errors).

For example:
print(x)
def f(x,y,z):
is not valid code by itself (because the function needs a body). When I call libcst.parse_expression('print(x)\ndef :'), it throws an error. While this is an understandable response, I was wondering whether it would be possible not to throw the baby out with the bathwater; that is, is it possible to recover the print(x) in the output instead of just throwing an error?

Why I want this:
When editing a Python file, I usually have invalid syntax in between edits (as in the example above, before the function body is added). But I'd like to be able to run refactorings anyway, like PyCharm does. Is this possible with this library?

@thatch
Contributor

thatch commented Jun 10, 2020

print(x) is an expression, but def is a statement; you might want to try parse_module instead. I don't think this idea works very well in the general case, and I suspect PyCharm is just doing regex if it works for cases like these. But here's some untested code that splits the source into a prefix that parses into a valid CST tree and everything after it, which you can merge back together later.

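import libcst as cst

def lenient_parse(data):
  # Returns (module, unparsed_tail); the tail is "" when everything parses.
  try:
    return cst.parse_module(data), ""
  except cst.ParserSyntaxError as e:
    lines = data.splitlines(True)
    # Back off one line at a time from the reported error line
    # until the remaining prefix parses.
    for n in range(min(e.raw_line, len(lines)), -1, -1):
      try:
        mod = cst.parse_module("".join(lines[:n]))
      except cst.ParserSyntaxError:
        continue
      return mod, "".join(lines[n:])
    raise

mod, rest = lenient_parse(...)
mod = do_refactor(mod)  # CST nodes are immutable, so keep the returned module
new_source = mod.code + rest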
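
For anyone who wants to try the back-off idea without installing anything, here is a minimal sketch of the same strategy using the standard library's ast.parse as a stand-in for cst.parse_module. The function name lenient_parse_ast is hypothetical, and ast does not preserve formatting the way LibCST does, so this only illustrates the splitting step, not the refactoring.

```python
import ast


def lenient_parse_ast(source):
    """Return (tree, unparsed_tail) for a line prefix of `source` that parses."""
    try:
        return ast.parse(source), ""
    except SyntaxError as err:
        lines = source.splitlines(True)
        # Start at the reported error line (clamped to the number of lines),
        # then back off one line at a time until the prefix parses.
        start = min(err.lineno or len(lines), len(lines))
        for n in range(start, -1, -1):
            try:
                tree = ast.parse("".join(lines[:n]))
            except SyntaxError:
                continue
            return tree, "".join(lines[n:])
        raise  # not expected to be reached: the empty prefix always parses


# The example from the original question: a call followed by a body-less def.
tree, rest = lenient_parse_ast("print(x)\ndef f(x,y,z):\n")
print(len(tree.body))  # 1
print(repr(rest))      # 'def f(x,y,z):\n'
```

The split happens on whole lines only, so the unparsed tail can include valid code that follows the broken statement; that is one reason this approach does not generalize well.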

@carljm
Contributor

carljm commented Jun 10, 2020

I think forgiving parsing would be a reasonable feature for LibCST. It's a common feature for parsers used in IDE contexts, and Parso supports it, which might make it easier to implement in LibCST.

It's really just a question of someone having sufficient motivation to contribute it, IMO.

@zsol added the parsing and enhancement labels on Jun 16, 2022

4 participants