-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Two don't cares ('-') adjacent in a literal is parsed as a comment #529
Comments
I am fairly certain that you can remove comments if they are the only thing on the line. There are no such thing as multiline string literals in VHDL, so there is no way for |
Also, I'm uncertain as to why you are replacing the comment with spaces. No code can follow a comment, replacing it with whitespace won't affect character index in error reporting because no error could exist after a comment starts. |
The purpose of our parser is to solve the small subset of the full parsing task we need to run VUnit. For example finding design unit and use statements so that we can figure out dependencies. Do you have a use case where this breaks due to don't care assignments or are you thinking about parsing more generally? I recognize that there may be such use cases but the question is whether or not they motivate extra parsing efforts or should be solved with a workaround. A common use case we address by removing comments is when people comment out part of their code (not using block comments). Parsing in general is can be tricky when using a simplistic regexp approach -- dc := "----"; -- Assign "----" to dc. |
The parser fails because of syntax errors introduced by removing the rest of the line. For example, if a closing or opening paren is removed it will cause a failure to parse exception: dc := SomeFunc("----"); After comment removal... dc := SomeFunc(" I can't post any of the failing code. And I know I could modify the source to get it to not fail, but that's simply a hack. I want to use vunit for determining compile order, but these failures make it impossible. |
It is because of the simple comment regex. It doesn't take into account string literals. Any kind of string literals, not only bit string literals. I did some test and found a solution here: |
It may be more complicated and take more time, but it's probably the cheapest fix. |
I have a slightly more optimal and straightforward regex that works on the example given. Although maybe you know more about potential pitfalls than me. patt = re.compile('^([^\"\-]*(?:\"[^\"]*\"[^\"\-]*)*)--.*')
def remove_comments(match):
return match.group(1)
patt.sub(remove_comments, code) |
@ktbarrett Have you done some measurements on the performance hit? |
My test with regex tester show ruffly 6x more steps then the simple version. But I do not think it matters in absolute terms. |
The "real" way of solving the parsing problem would be to do lexing/tokenization of the input. But as Lars mentioned, making a correct parser is much more work than making a heuristic parser that is only supposed to be a private implementation detail of a dependency scanner. I have actually written a full VHDL parser in Rust (https://github.com/kraigher/rust_hdl/) FYI |
As a possible future enhancement, does it make sense to pluggable/selectable VHDL parsers? Mimicking what we do with simulator interfaces, it would allow to have |
First of all I do not think VUnit should have a parser as a publicly visible component. What VUnit needs is a dependency scanner which might have a parser as a private implementation detail. In my opinion the vhdl parser of VUnit already does to much by parsing generic and port lists. Ironically if it did not parse port or generic list it might never have been sensitive to the problem of this issue since the don't care literals probably only appear as initialization values in such lists. One of the reasons I wrote the VHDL parser of rust_hdl is that I do not believe it is a good idea to write a parser in Python. This is due to:
Thus if VUnit ever needed a full parser it would use the parser from rust_hdl. However the question is if it would be worth it. The current method works well enough and introducing a platform dependent binary to VUnit would complicate the build & delivery process. Maybe it can be tolerated since most normal users are found on only two platforms and advanced users on other platforms can be expected to build the binary themselves. I am not so keen on supporting pluggable parsers as you can only assume the least common denominator of functionality and it is twice the code to maintain. I would rather throw away the old and in with the new. |
This was my main motivation to suggest a pluggable system with, at least, the current (old) implementation and the new (rust_hdl). Users that want to keep installation of VUnit simple and multi-platform could use the old implementation (which can be a default fallback if rust_hdl is not available).
I use VUnit on four platforms: GNU/Linux (amd64, aarch64 and armv7) and Win10 (amd64). This is not a great deal (2 vs 4), and I do have docker images to automate the builds. Nevertheless, it is kind of painful if most users don't have any problem with the regex solution.
What about the following middle ground?
|
It is just that I cannot think of any important aspect that would be better for the users of VUnit by using the rust hdl parser for vunit dependency scanning at the moment. The current method even if it is primitive and not formally correct works well. If anything I would like to strip away parsing of elements that do not contribute to dependency scanning. Thus there is really not much which would motivate the work effort to integrate rust hdl. There would need to be a lot of code written to integrate the rust parser and actually use it for dependency scanning. |
What are those elements used for? |
The
remove_comments(code)
function is overly simplistic. It matches any two adjacent-
characters until the end of the line.'-'
is also used as a "don't care" value forstd_logic
. If there is anstd_logic_vector
literal where there are two don't care values in a row (e.g.'----'
), it is parsed as a comment and the rest of the line is removed, causing the parse to fail.vunit/vunit/vhdl_parser.py
Lines 991 to 1007 in 12be973
The text was updated successfully, but these errors were encountered: