-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sys/shell: refactor tokenizer code #13197
Conversation
d956584
to
3d08898
Compare
3d08898
to
bf6485c
Compare
5fec2d5
to
a61d66a
Compare
Rebased and ready for review :) |
Check that single and double quotes work, along with backslash escaping and that malformed strings are rejected. Right now the test is failing. The next commit will replace the tokenizer with one that works correctly. Co-authored-by: Juan Carrano <[email protected]>
The tokenizer (the code that breaks up the line given to the shell into strings to create argv) was quite a messy piece of code. This commit refactors it into a more traditional state-machine based parser. This fixes the issues with quote handling exposed by the recently introduced test. Co-authored-by: Juan Carrano <[email protected]>
a61d66a
to
ee1e1ad
Compare
Rebased to master and Murdock builds were successful. |
Could you please also take a look at this one, @fjmolinas ? :) |
Tabs are processed as before now. I replaced the state diagram by a text description which I hope is clear enough. I also did some other improvements in the code. |
9efd64e
to
ad7f6c4
Compare
I'll take a closer look at the latest changes on Monday! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two small nitpicks, otherwise this PR is OK IMO. I'll run the test on many boards and see if everything is ok.
sys/shell/shell.c
Outdated
/** | ||
* Break input line into words, create argv and call the command handler. | ||
* | ||
* Words are broken up at spaces. A backslash escaped the character that comes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Words are broken up at spaces. A backslash escaped the character that comes | |
* Words are broken up at spaces. A backslash escapes the character that comes |
tests/shell/tests/01-run.py
Outdated
('echo abcdef"ghijklmn', 'shell: incorrect quoting'), | ||
('echo abcdef\'ghijklmn', 'shell: incorrect quoting'), | ||
|
||
# test more complex output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe:
# test more complex output | |
# test default commands |
All good from my local tests https://ci.inria.fr/ci-riot-tribe/job/build-pipeline.jk/lastSuccessfulBuild/artifact/failuresummary.md/*view*/ (the failing boards had connection issues, I re-launched the test and it worked). |
@HendrikVE please squash once my last remaining comments are addressed! |
Factor out common code for quoted and unquoted tokens. This makes the code slighly less clear, but it also eliminates repetition (which may improve clarity). Co-authored-by: Juan Carrano <[email protected]>
The code for traversing arrays of shell commands (used to print help messages and to search for commmand handlers) was needlessly complex. Co-authored-by: Juan Carrano <[email protected]>
Code now correctly handles quotes within PARSE_UNQUOTED and tabs are now considered a BLANK just like a space.
Divide test cases in to groups and add test cases for: - multiple spaces between arguments - tabs between arguments - leading/trailing spaces - more simple variations for escaping - multiple tests for correct quoting A second occurence of the test case "('help', EXPECTED_HELP)," was removed.
6f12972
to
cbf5cc5
Compare
I changed the comments as suggested and squashed my commits. But this time, I'm not sure about how much they should be squashed. One possibility would be to have 2 commits |
I prefer keeping the division as you have now, I'd rather split up the different changes. So LGTM! |
Lets see if murdock is happy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK, the code changes improve:
- readability
- escaped characters handling
- quoted strings handling
There is an increase in the shell code, but the code itself is pretty complex, so IMO that change is expected. Maybe some people obsessed by code size want to take a second look (@kaspar030 @bergzand ), but IMO this is OK. ACK!
Ajj seems like there is a problem with a worker again... |
Go! |
Awesome, thank you @fjmolinas ! |
Contribution description
With this PR I simply copied PR #12085 authored by @jcarrano. Since he left the team, I hereby offer to take care of rebasing and implement required changes.
The tokenizer (the code that breaks up the line given to the shell into strings to create argv) was quite a messy piece of code. This commit refactors it into a more traditional state-machine based parser.
This fixes the issues with quote handling exposed by the recently introduced test.
Testing procedure
Test vectors were added to the automated test. The old code could not handle this string:
The new code handles it correctly.
Issues/PRs references
Copy of #12085
Dependencies: