-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added syntest example to run ST syntax tests #44
Conversation
I vote for something fishy in
|
The following change seems to help: --- syntect-master/src/easy.rs
+++ syntect-syntest/src/easy.rs
@@ -148,10 +148,9 @@
impl<'a> Iterator for ScopeRegionIterator<'a> {
type Item = (&'a str, &'a ScopeStackOp);
fn next(&mut self) -> Option<Self::Item> {
- let next_str_i = if self.index >= self.ops.len() {
- if self.last_str_index >= self.line.len() {
- return None;
- }
+ let next_str_i = if self.index > self.ops.len() {
+ return None;
+ } else if self.index == self.ops.len() {
self.line.len()
} else {
self.ops[self.index].0 with this change, the "Batch File" and "Haskell" syntax tests start working, and the number of failures in "PHP" are reduced significantly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a preliminary review. I have to go right now but I can do a complete review soon.
In general this looks awesome. Thanks so much for writing this. It seems to be finding a bunch of bugs, but that's a good thing since after we get these to pass I'll feel a lot more confident that things work.
examples/syntest.rs
Outdated
} | ||
|
||
fn main() { | ||
let ss = SyntaxSet::load_defaults_newlines(); // note we load the version with newlines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this case it might be good to use SyntaxSet::load_from_folder(argv[0])
so that you can use it for syntax development without recompiling every time. Once we get syntect
to a point where you can actually rely on it to do things correctly that is...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually since this is using the version with newlines included and load_from_folder
defaults to without newlines. You can just do what load_from_folder
does yourself:
let mut ss = SyntaxSet::new();
ss.load_syntaxes(args[0], true).unwrap();
ss.link_syntaxes();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great idea! however, in my testing using cargo run --example syntest testdata/Packages/HTML/
, args[0]
refers to the example in the target/debug
folder, and it would have to traverse up a few folder levels to find the testdata/Packages
folder where the syntaxes are, so we will need to come up with a suitable workaround I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I meant args[1]
and then you would pass testdata/Packages
as the first argument and it would run tests for all languages, although it's kinda nice to be able to run tests for only one language I guess. Maybe have the first command line argument be the same as now, and then if there's a second command line argument load from that folder, otherwise just load the defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking something along those lines too :) thanks, I'll give it a go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done :)
examples/syntest.rs
Outdated
println!("Testing file {}", entry.path().display()); | ||
let result = test_file(&ss, entry.path(), true); | ||
println!("{:?}", result); | ||
if let Err(_) = result { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if there were a flag/argument to make it not die on the first failing file. It would be easier to get a sense of how much was broken then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it currently doesn't die on the first failing file (unless there is a panic), it just ensures that when it has run through all the files, it will exit with an appropriate exit code :) maybe we could add a flag though so that it optionally can fail fast, to save time (not that these syntax tests take long at all!) when all the caller wants to know is whether there were any failures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh woops. I just skimmed over the code briefly and missed that this didn't actually stop anything.
I think the behaviour you describe is perfect for now. I don't see any pressing need to add a fail-fast flag.
Cargo.toml
Outdated
@@ -24,6 +24,7 @@ rustc-serialize = "^0.3" | |||
bincode = "0.6" | |||
flate2 = "^0.2" | |||
fnv = "^1.0" | |||
regex = "0.2.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now the use of regex
is fine. I may port it to onig before or after it is merged.
However, it should be under dev-dependencies
since it is only used in an example. See rust-lang/cargo#523
It seems like the test failures are due to scopes from
|
@keith-hall interesting. Thanks for the diagnosing. I'm going to see if I can review this PR later today and maybe fix some of the bugs it found. If not today I'll try and do it tomorrow. |
looking at the Java example a bit more closely, it might be related to how |
aafe342
to
b03fe13
Compare
@keith-hall I'm working on reviewing this. Just a heads up I pushed a commit to the branch refactoring the looking up of syntaxes by path (thanks for enabling maintainer commits). You may want to |
@trishume nice one, that's a much better solution than what I had, thanks! :) |
@keith-hall reviewed the fix to Now all that's left for me to review is |
@keith-hall I just reviewed the I've now reviewed everything in this PR and am pretty happy with it. There's still the bugs that keep the tests from actually passing, but I'm thinking I/we can fix those in separate PRs. I'm ready to merge this now if you are. If you're ready to merge I think I'll do an interactive rebase to remove the dump changes to the first commit that changes them, just to avoid increasing the repo size unnecessarily. After merging we can create issues for all the bugs we find. The one that makes the C# test suite crash at least already has an issue: #37 I'm currently looking at why the ASP test has so many failures. I think walkdir might actually be giving us the tests in a different order (possibly because of different OSs, I'm on OSX). Mine runs ASP (with many failures), then Batch (success), then C# (success), then another C# test (crash #37). The ASP failures may or may not be the same underlying bug as you are encountering with Java. |
Also I notice that Travis is failing, but only the nightly one is failing, the stable one passes. And nightly fails on |
@trishume yep looks good, I'm ready to merge - I agree that it makes sense to have separate PRs for the remaining issues :) thanks for the cleanup and associated commentary - makes it super easy to follow along :) it's a pleasure working with you! do you want me to create the issue for the Java failures? I was thinking that it might make sense to tackle that first, then everything related to the same underlying issue will go away and make it easier to investigate what (if anything) else is wrong... I was curious about those Travis failures - what's the benefit of building against nightly, if I may ask, when it's so unstable? :) |
Wow, nice change! After all the bugs are fixed, are you planning to run this as part of regular tests? |
@robinst yah I'm thinking maybe move it to a module and then use it from both the example and |
@trishume FYI, I've fixed the Java failures over at https://github.com/forkeith/syntect/tree/set, and it reduces the number of ASP test failures quite drastically too. I'm not sure if I will get it to a state where it is Pull Request-able or not, but wanted you to see my progress so we can collaborate together rather than work separately to achieve the same thing. |
@keith-hall awesome. I'll take a look at it. Don't worry about duplicating work in the next few days. I'll be busy with school work until this weekend so I won't be doing any work on syntect (other than Githubbing). |
commit ammended later to remove changes to packdump files, to reduce repo size.
This change makes it so that it can't crash on non-UTF8 paths, makes lookup by file path more efficient, and avoids allocating a hash map data structure that is never used as a map. It also adds a test of the functionality Signed-off-by: Tristan Hume <[email protected]>
i.e. the ASP syntax tests make use of this behavior, to test whether line continuation punctuation works properly
I did this to make it clearer what was going on, so that I could understand it again and review that it was doing the right thing, since I clearly got it wrong the first time.
The error handling in test_file was rearranged to use the try! macro and ok_or combinator. This decreases indentation and makes the code easier to read. It is recommended to view this diff without whitespace to make it clearer what actually changed.
i.e. allow text after the assertion to not interfere with the assertion. This is how ST works when it executes the syntax tests.
This PR adds a new example called "syntest", which will parse and execute ST's "syntax_test_" files.
Currently, there is no way to reference a syntax definition in a SyntaxSet from it's original file path, which is how tests specify which syntax definition to use, so it just lets syntect choose based on the extension of the test file at the moment.
I used the regular "regex" crate to achieve the test line parsing - feel free to replace this with "onig" if you don't want the extra reference - I don't quite have enough experience yet to know how to use it effectively.
Also, if the syntax test files use Windows line endings, it has to replace "\r" with nothing, because otherwise the regular expressions don't match expressions like "$\n?" properly, but perhaps it would be better to update the main parsing module to fix this?
I've checked and it shows success on the XML, HTML and JSON syntax tests. I also tried modifying them to cause a test to fail, and syntest correctly picked that up.
However, it seems to detect a failure in the Haskell syntax test file, and it looks like some meta scopes are being applied multiple times - so I'd appreciate if you could take a look and see if it's a flaw in my syntest logic somewhere or a bug in the parser. Thanks!