cucumber-expressions: Use a parser to parse Cucumber Expressions #771

Merged
merged 196 commits into from
Dec 10, 2020
Changes from 193 commits
058cf84
cucumber-expressions: Use pseudo token rewrite parsing algorithm
mpkorstanje Oct 24, 2019
1f3f801
Revert pattern change
mpkorstanje Oct 24, 2019
4ff1f46
Edge cases
mpkorstanje Oct 24, 2019
44ebed2
Edge cases
mpkorstanje Oct 24, 2019
52fde50
Edge cases
mpkorstanje Oct 24, 2019
11270f5
Merge split and processing.
mpkorstanje Oct 24, 2019
a66e5b1
Use some functional stuff
mpkorstanje Oct 24, 2019
b6d6d30
Add grammar rules
mpkorstanje Oct 24, 2019
a450a46
Fix tests
mpkorstanje Oct 25, 2019
4d788fc
WIP Golang impl
mpkorstanje Oct 25, 2019
dbe4b9c
Fix alternation
mpkorstanje Oct 26, 2019
e8ce052
Tests pass
mpkorstanje Oct 26, 2019
43256a1
Additional tests pass
mpkorstanje Oct 26, 2019
a4ef4bc
Remove BiFunction
mpkorstanje Oct 26, 2019
ec58734
Naming
mpkorstanje Oct 26, 2019
291eb0f
Handle doubly escaped slashes
mpkorstanje Oct 27, 2019
aaa4ac0
Naming
mpkorstanje Oct 27, 2019
c084b17
Use FindAllStringSubmatch to remove some magic numbers
mpkorstanje Oct 27, 2019
417c564
Use fixed size slices
mpkorstanje Oct 27, 2019
9bb8fb7
Faster guesstimate
mpkorstanje Oct 27, 2019
eac0e51
Fix
mpkorstanje Oct 27, 2019
2a9006b
Remove redundant empty checks
mpkorstanje Oct 27, 2019
b7b1333
Replace sprintf with concat
mpkorstanje Oct 27, 2019
f4d96db
Fix
mpkorstanje Oct 27, 2019
0e7d713
Implement LR(1) parser
mpkorstanje Oct 31, 2019
a09de33
Rewrite AST to Regex
mpkorstanje Oct 31, 2019
69ef468
Fix
mpkorstanje Oct 31, 2019
006cfbe
Fix
mpkorstanje Oct 31, 2019
b01171a
Fix
mpkorstanje Oct 31, 2019
ec8c89e
Fix
mpkorstanje Oct 31, 2019
7d9463a
Extract token rewrite rules
mpkorstanje Nov 1, 2019
3dfe92f
Add lookahead for optional
mpkorstanje Nov 1, 2019
8e4f3d2
Update grammar
mpkorstanje Nov 1, 2019
22181bc
Move alternation to top level
mpkorstanje Nov 1, 2019
b15d190
Clean up
mpkorstanje Nov 1, 2019
305fc1f
Fiddling
mpkorstanje Nov 1, 2019
9d554b0
Allow parameter under optional
mpkorstanje Nov 1, 2019
4c8d376
Clean up
mpkorstanje Nov 1, 2019
84e92b5
Document tokens
mpkorstanje Nov 1, 2019
fbb2828
Document tokens
mpkorstanje Nov 1, 2019
41cd092
Document tokens
mpkorstanje Nov 1, 2019
a903c5f
Document tokens
mpkorstanje Nov 1, 2019
f6538aa
Document tokens
mpkorstanje Nov 1, 2019
1a701d0
Clean up
mpkorstanje Nov 2, 2019
04367eb
Use proper SOL/EOL tokens
mpkorstanje Nov 2, 2019
0931afd
Naming
mpkorstanje Nov 2, 2019
8d5f12a
Clean up
mpkorstanje Nov 7, 2019
5f92932
Make AST explicit
mpkorstanje Nov 7, 2019
543d030
Implement tokenizer in go
mpkorstanje Nov 7, 2019
a1484ef
parse optional and parameters
mpkorstanje Nov 7, 2019
bebe51a
parse alternation
mpkorstanje Nov 8, 2019
17e9afa
check entire ast in java
mpkorstanje Nov 8, 2019
9700f16
rewrite ast to pattern
mpkorstanje Nov 8, 2019
6aa1d48
clean up
mpkorstanje Nov 8, 2019
97e7910
Add missing tests
mpkorstanje Nov 8, 2019
d909c52
Clean up
mpkorstanje Nov 8, 2019
d5ce42d
rewrite --> compile
mpkorstanje Nov 8, 2019
d83aa48
Revert "rewrite --> compile"
mpkorstanje Nov 8, 2019
5b7c150
Clean up dead code
mpkorstanje Nov 8, 2019
5206678
Clean up dead code
mpkorstanje Nov 8, 2019
45c8094
Add regex production rules
mpkorstanje Nov 8, 2019
dc180b2
Add note about empty invalids
mpkorstanje Nov 8, 2019
40e5385
gofmt
aslakhellesoy Jan 14, 2020
7fb4d6b
Reduce number of tokens used
mpkorstanje Jun 6, 2020
8685896
Reject incomplete expressions
mpkorstanje Jun 6, 2020
f79c662
Put parser definitions at the top of the file
mpkorstanje Jun 6, 2020
638702f
Clean up
mpkorstanje Jun 6, 2020
df3f769
Remove tokens from AST
mpkorstanje Jun 6, 2020
93d3f68
Remove out variable
mpkorstanje Jun 6, 2020
eb3bdc7
Improve exceptions
mpkorstanje Jul 10, 2020
bd82829
Clean up
mpkorstanje Jul 10, 2020
8bcfad0
Clean up
mpkorstanje Jul 10, 2020
a0872e7
Clean up
mpkorstanje Jul 10, 2020
0e132c1
WIP
mpkorstanje Jul 12, 2020
b344af8
Nail down error messages
mpkorstanje Jul 15, 2020
ca188be
WIP
mpkorstanje Jul 16, 2020
98583f5
WIP
mpkorstanje Jul 16, 2020
acd344e
Fix off by one
mpkorstanje Jul 17, 2020
d77a278
Reduce unused exception to validation
mpkorstanje Jul 17, 2020
a52d07d
Structure and naming
mpkorstanje Jul 17, 2020
2258c5b
Structure and naming
mpkorstanje Jul 17, 2020
baf8616
Structure and naming
mpkorstanje Jul 17, 2020
070ffa7
Structure and naming
mpkorstanje Jul 17, 2020
a3a1d8d
Structure and naming
mpkorstanje Jul 17, 2020
083a37f
Merge branch 'master' into tokenize-cucumber-expression
mpkorstanje Jul 24, 2020
49271a1
Structure and naming
mpkorstanje Jul 17, 2020
98b84b0
WIP
mpkorstanje Jul 24, 2020
25dceb1
WIP
mpkorstanje Jul 24, 2020
aeb55fc
Merge branch 'master' into tokenize-cucumber-expression
mpkorstanje Jul 24, 2020
239176c
Fix go compile
mpkorstanje Jul 31, 2020
712b7da
Use parameter type as boundary for alternation
mpkorstanje Sep 8, 2020
9704fb0
Merge remote-tracking branch 'origin/master' into tokenize-cucumber-e…
mpkorstanje Sep 8, 2020
c210fd6
Move unit tests to the right unit
mpkorstanje Sep 10, 2020
54beaf4
Self generate test data
mpkorstanje Sep 10, 2020
ae15b9c
Self generate test data
mpkorstanje Sep 10, 2020
4b2cb5a
Clean up test
mpkorstanje Sep 10, 2020
8d1dc0c
Split composite tests
mpkorstanje Sep 10, 2020
1080935
Clean up
mpkorstanje Sep 10, 2020
4c06f74
Rename tokens to elements
mpkorstanje Sep 10, 2020
9ca3c74
More self generation
mpkorstanje Sep 10, 2020
139d9b0
Render element as json with single quotes (because yaml)
mpkorstanje Sep 10, 2020
a8dc570
Render ast elements as json with single quotes (because yaml)
mpkorstanje Sep 10, 2020
62c4f76
Remove tests
mpkorstanje Sep 10, 2020
a237f89
Clean up test cases
mpkorstanje Sep 10, 2020
ab86517
Rename test cases
mpkorstanje Sep 10, 2020
cadf3f9
Clean up test cases
mpkorstanje Sep 10, 2020
035e0a1
Fix json for ast
mpkorstanje Sep 10, 2020
06179f1
Fix json for tokens
mpkorstanje Sep 10, 2020
d1f4fd1
Fix json for ast
mpkorstanje Sep 10, 2020
1f6d3cf
Clean up
mpkorstanje Sep 10, 2020
f70310c
Clean up
mpkorstanje Sep 10, 2020
1382db4
Clean up
mpkorstanje Sep 10, 2020
9f2ab19
Golang json and yaml experiment (doesn't work)
mpkorstanje Sep 11, 2020
d49cde6
Json is hard
mpkorstanje Sep 11, 2020
f89858b
Read yaml and json in golang
mpkorstanje Sep 11, 2020
fb6acf0
Make test work
mpkorstanje Sep 16, 2020
998fa0d
empty string passes
mpkorstanje Sep 16, 2020
d508aac
some tests pass
mpkorstanje Sep 16, 2020
57c216e
more tests pass
mpkorstanje Sep 16, 2020
4f65ab1
More tests pass
mpkorstanje Sep 17, 2020
ca41221
Fix more
mpkorstanje Sep 17, 2020
27949d6
Throw errors when expected
mpkorstanje Sep 17, 2020
881e8ce
Naming
mpkorstanje Sep 17, 2020
1897cef
Fix examples
mpkorstanje Sep 17, 2020
4da29d8
Single quotes for messages
mpkorstanje Sep 17, 2020
9ca6499
All acceptance tests pass
mpkorstanje Sep 17, 2020
5d3197f
Extract acceptance tests for CucumberExpressions
mpkorstanje Sep 17, 2020
a9d5d2c
Improve illegal character in parameter name message
mpkorstanje Sep 17, 2020
0105780
Complain
mpkorstanje Sep 17, 2020
a0722ca
Improve UndefinedParameterTypeException message
mpkorstanje Sep 17, 2020
41586cb
Attribution
mpkorstanje Sep 17, 2020
9650cab
Make acceptance tests work
mpkorstanje Sep 18, 2020
059c373
Nearly all tests pass
mpkorstanje Sep 18, 2020
fa0037d
Fix escaping of string parameter types in go
mpkorstanje Sep 18, 2020
2ca716a
Merge branch 'master' into tokenize-cucumber-expression
mpkorstanje Sep 18, 2020
48cc016
Make go pass
mpkorstanje Sep 19, 2020
1a02b26
Drop examples from js and ruby
mpkorstanje Sep 19, 2020
07865d7
configure sync for test data
mpkorstanje Sep 19, 2020
e24e84c
Revert a bit more on ruby/js examples
mpkorstanje Sep 19, 2020
3b434ad
Fix sync
mpkorstanje Sep 19, 2020
c2b50ad
Sync test data for go
mpkorstanje Sep 19, 2020
00495e0
Don't make test data
mpkorstanje Sep 19, 2020
0f0ab77
go test data again!
mpkorstanje Sep 19, 2020
ba080f6
Implement failing tokenizer tests in js
mpkorstanje Sep 20, 2020
0857cd1
Clean up
mpkorstanje Sep 20, 2020
8e446dd
Empty string passes
mpkorstanje Sep 20, 2020
3d2bea2
All but the exceptional cases pass
mpkorstanje Sep 20, 2020
3bd6ab3
All tokenizer cases pass
mpkorstanje Sep 20, 2020
95c5f44
Make the code look the same
mpkorstanje Sep 20, 2020
20d4031
Simplify new token creation condition
mpkorstanje Sep 20, 2020
ddb8b05
Add parser test. Empty string passes
mpkorstanje Sep 21, 2020
d2f638d
Parse all the other expressions
mpkorstanje Sep 24, 2020
83623ff
Throw the right exceptions
mpkorstanje Sep 24, 2020
c16467b
Run expression acceptance tests
mpkorstanje Sep 24, 2020
d580f5f
More tests pass
mpkorstanje Sep 24, 2020
d82b0e6
All tests pass!
mpkorstanje Sep 24, 2020
e8e9cb4
Fix lint
mpkorstanje Sep 24, 2020
d1f0380
Keep api the same
mpkorstanje Sep 24, 2020
646e2e9
Merge remote-tracking branch 'origin/master' into tokenize-cucumber-e…
mpkorstanje Sep 24, 2020
0a7f2a2
Merge branch 'master' into tokenize-cucumber-expression
aslakhellesoy Oct 1, 2020
ce04c32
Formatting
Oct 1, 2020
2f7bd36
First passing ruby test
Oct 1, 2020
e48f751
Update plan
Oct 1, 2020
ff96d71
All tokenizer tests green
Oct 1, 2020
d938358
Merge branch 'master' into tokenize-cucumber-expression
aslakhellesoy Oct 3, 2020
fa3d6c8
Merge branch 'master' into tokenize-cucumber-expression
mpkorstanje Oct 29, 2020
8db2905
Merge branch 'master' into tokenize-cucumber-expression
mpkorstanje Nov 10, 2020
7d58abc
Implement failing test for Cucumber expression parser
mpkorstanje Nov 10, 2020
88140f7
Clean up
mpkorstanje Nov 10, 2020
f792bb9
Return expression node
mpkorstanje Nov 14, 2020
d1361e0
Transpiled a few methods
mpkorstanje Nov 15, 2020
86d6451
More tests pass
mpkorstanje Nov 15, 2020
270cab5
More pass
mpkorstanje Nov 15, 2020
c8b535c
Clean up
mpkorstanje Nov 15, 2020
64670cd
All parser tests pass
mpkorstanje Nov 17, 2020
d8e9a07
Merge master
Nov 17, 2020
0221ba6
Failing tests for cucumber expression
mpkorstanje Nov 18, 2020
4ce15dc
Some tests pass
mpkorstanje Nov 21, 2020
69f0088
More tests pass
mpkorstanje Nov 21, 2020
20c0c5b
All test pass
mpkorstanje Nov 21, 2020
812255a
Fix typescript again?
mpkorstanje Nov 21, 2020
8c3c62a
Merge branch 'master' into tokenize-cucumber-expression
mpkorstanje Nov 21, 2020
5a5d3b7
Fix naming conventions
mpkorstanje Nov 21, 2020
dd49b5f
Extract regex tests
mpkorstanje Nov 22, 2020
63462bc
Ruby conventions
mpkorstanje Nov 22, 2020
b977efb
Update docs
mpkorstanje Nov 22, 2020
912c1ad
java: Explicitly forbid nested optionals and limit forbidden paramete…
mpkorstanje Nov 23, 2020
2545091
go: Explicitly forbid nested optionals and limit forbidden parameter …
mpkorstanje Nov 27, 2020
f0ba7b2
js: Explicitly forbid nested optionals and limit forbidden parameter …
mpkorstanje Nov 28, 2020
df464a3
ruby: Explicitly forbid nested optionals and limit forbidden paramete…
mpkorstanje Nov 28, 2020
51d8051
Merge branch 'master' into tokenize-cucumber-expression
aslakhellesoy Dec 10, 2020
e253534
Raise errors from blocks
Dec 10, 2020
3e228e8
More ruby idioms
Dec 10, 2020
81b41ae
More ruby idioms
Dec 10, 2020
f103768
More ruby idioms
Dec 10, 2020
b0bc335
Attribution. Closes #601. Closes #726. Closes #767. Closes #770.
Dec 10, 2020
1 change: 1 addition & 0 deletions cucumber-expressions/Makefile
@@ -1 +1,2 @@
LANGUAGES ?= go javascript ruby java
include default.mk
66 changes: 64 additions & 2 deletions cucumber-expressions/README.md
@@ -1,9 +1,71 @@
See [website docs](https://cucumber.io/docs/cucumber/cucumber-expressions/) for more details.
See [website docs](https://cucumber.io/docs/cucumber/cucumber-expressions/)
for more details.

## Grammar ##

A Cucumber Expression has the following AST:

```
cucumber-expression := ( alternation | optional | parameter | text )*
alternation := (?<=left-boundary) + alternative* + ( '/' + alternative* )+ + (?=right-boundary)
left-boundary := whitespace | } | ^
right-boundary := whitespace | { | $
alternative := optional | parameter | text
optional := '(' + option* + ')'
option := optional | parameter | text
parameter := '{' + name* + '}'
name := whitespace | .
text := whitespace | ')' | '}' | .
```

The AST is constructed from the following tokens:
```
escape := '\'
token := whitespace | '(' | ')' | '{' | '}' | '/' | .
. := any non-reserved codepoint
```
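
The token rules above can be sketched as a small tokenizer. This is only an illustration under stated assumptions, not the implementation in this pull request: the names `tokenKind`, `tok`, `kindOf` and `tokenize` are hypothetical, start-of-line and end-of-line tokens are left out, and the choice to merge adjacent text or whitespace runes into one token is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"unicode"
)

// tokenKind and tok are hypothetical names used only in this sketch.
type tokenKind string

const (
	kindWhitespace     tokenKind = "WHITE_SPACE"
	kindBeginOptional  tokenKind = "BEGIN_OPTIONAL"
	kindEndOptional    tokenKind = "END_OPTIONAL"
	kindBeginParameter tokenKind = "BEGIN_PARAMETER"
	kindEndParameter   tokenKind = "END_PARAMETER"
	kindAlternation    tokenKind = "ALTERNATION"
	kindText           tokenKind = "TEXT"
)

type tok struct {
	Kind tokenKind
	Text string
}

// kindOf classifies a single unescaped rune according to the token grammar.
func kindOf(r rune) tokenKind {
	if unicode.Is(unicode.White_Space, r) {
		return kindWhitespace
	}
	switch r {
	case '(':
		return kindBeginOptional
	case ')':
		return kindEndOptional
	case '{':
		return kindBeginParameter
	case '}':
		return kindEndParameter
	case '/':
		return kindAlternation
	}
	return kindText
}

// tokenize splits an expression into tokens. A backslash turns the next
// reserved rune into plain text; adjacent text (or whitespace) runes are
// merged into a single token (an assumption of this sketch).
func tokenize(expression string) ([]tok, error) {
	var tokens []tok
	escaped := false
	for _, r := range expression {
		if !escaped && r == '\\' {
			escaped = true
			continue
		}
		kind := kindOf(r)
		if escaped {
			// Only reserved runes and the backslash itself may be escaped.
			if kind == kindText && r != '\\' {
				return nil, fmt.Errorf("rune %q cannot be escaped", r)
			}
			kind = kindText
			escaped = false
		}
		last := len(tokens) - 1
		if last >= 0 && tokens[last].Kind == kind && (kind == kindText || kind == kindWhitespace) {
			tokens[last].Text += string(r)
			continue
		}
		tokens = append(tokens, tok{Kind: kind, Text: string(r)})
	}
	if escaped {
		return nil, errors.New("expression ends with a dangling escape")
	}
	return tokens, nil
}

func main() {
	tokens, err := tokenize(`three blind\/deaf {animal}(s)`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range tokens {
		fmt.Printf("%-15s %q\n", t.Kind, t.Text)
	}
}
```

Because the escaped `/` is reclassified as text, `blind\/deaf` comes out as a single text token rather than as two alternatives.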

Note:
* While `parameter` is allowed to appear as part of `alternative` and
`option` in the AST, such an AST is not a valid Cucumber Expression.
* While `optional` is allowed to appear as part of `option` in the AST,
such an AST is not a valid Cucumber Expression.
* ASTs with empty alternatives or alternatives that only
contain an optional are valid ASTs but invalid Cucumber Expressions.
* All escaped tokens (tokens starting with a backslash) are rewritten to their
unescaped equivalent after parsing.
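
The first three notes describe checks that run on an already parsed AST. A minimal sketch of such a validation pass follows, assuming a simplified `node` type; the `validate` function and the error values are hypothetical and their messages are not the ones produced by this pull request.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a simplified, hypothetical AST node for this sketch.
type node struct {
	Type  string
	Token string
	Nodes []node
}

// Hypothetical error values; the real messages may differ.
var (
	errParameterInOptional    = errors.New("an optional may not contain a parameter")
	errNestedOptional         = errors.New("an optional may not contain another optional")
	errParameterInAlternative = errors.New("an alternative may not contain a parameter")
	errEmptyAlternative       = errors.New("an alternative may not be empty")
	errOptionalOnly           = errors.New("an alternative may not consist solely of optionals")
)

// validate walks the AST and rejects shapes that the grammar allows
// but that are not valid Cucumber Expressions.
func validate(n node) error {
	switch n.Type {
	case "OPTIONAL_NODE":
		for _, child := range n.Nodes {
			switch child.Type {
			case "PARAMETER_NODE":
				return errParameterInOptional
			case "OPTIONAL_NODE":
				return errNestedOptional
			}
		}
	case "ALTERNATIVE_NODE":
		if len(n.Nodes) == 0 {
			return errEmptyAlternative
		}
		onlyOptionals := true
		for _, child := range n.Nodes {
			if child.Type == "PARAMETER_NODE" {
				return errParameterInAlternative
			}
			if child.Type != "OPTIONAL_NODE" {
				onlyOptionals = false
			}
		}
		if onlyOptionals {
			return errOptionalOnly
		}
	}
	// Recurse so that nested alternations and optionals are checked too.
	for _, child := range n.Nodes {
		if err := validate(child); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// "cuke(s)": text followed by an optional that holds only text.
	valid := node{Type: "EXPRESSION_NODE", Nodes: []node{
		{Type: "TEXT_NODE", Token: "cuke"},
		{Type: "OPTIONAL_NODE", Nodes: []node{{Type: "TEXT_NODE", Token: "s"}}},
	}}
	// "({int})": an optional that wraps a parameter.
	invalid := node{Type: "EXPRESSION_NODE", Nodes: []node{
		{Type: "OPTIONAL_NODE", Nodes: []node{
			{Type: "PARAMETER_NODE", Nodes: []node{{Type: "TEXT_NODE", Token: "int"}}},
		}},
	}}
	fmt.Println(validate(valid))
	fmt.Println(validate(invalid))
}
```

Keeping these checks separate from parsing lets the parser stay close to the grammar while the error for each invalid shape is reported in one place.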

### Production Rules

The AST can be rewritten into a regular expression by the following production
rules:

```
cucumber-expression -> '^' + rewrite(node[0]) + ... + rewrite(node[n-1]) + '$'
alternation -> '(?:' + rewrite(node[0]) + '|' + ... + '|' + rewrite(node[n-1]) + ')'
alternative -> rewrite(node[0]) + ... + rewrite(node[n-1])
optional -> '(?:' + rewrite(node[0]) + ... + rewrite(node[n-1]) + ')?'
parameter -> {
parameter_name := node[0].text + ... + node[n-1].text
parameter_pattern := parameter_type_registry[parameter_name]
'((?:' + parameter_pattern[0] + ')|(?:' + ... + ')|(?:' + parameter_pattern[n-1] + '))'
}
text -> {
escape_regex := escape '^', '$', '[', ']', '(', ')', '\', '{', '}', '.', '|', '?', '*', '+'
escape_regex(token.text)
}
```

## Acknowledgements

The Cucumber Expression syntax is inspired by similar expression syntaxes in
other BDD tools, such as [Turnip](https://github.com/jnicklas/turnip), [Behat](https://github.com/Behat/Behat) and [Behave](https://github.com/behave/behave).
other BDD tools, such as [Turnip](https://github.com/jnicklas/turnip),
[Behat](https://github.com/Behat/Behat) and
[Behave](https://github.com/behave/behave).

Big thanks to Jonas Nicklas, Konstantin Kudryashov and Jens Engel for
implementing those libraries.

The [Tiny-Compiler-Parser tutorial](https://blog.klipse.tech/javascript/2017/02/08/tiny-compiler-parser.html)
by [Yehonathan Sharvit](https://github.com/viebel) inspired the design of the
Cucumber expression parser.
6 changes: 5 additions & 1 deletion cucumber-expressions/examples.txt
@@ -2,7 +2,7 @@ I have {int} cuke(s)
I have 22 cukes
[22]
---
I have {int} cuke(s) and some \[]^$.|?*+
I have {int} cuke(s) and some \\[]^$.|?*+
I have 1 cuke and some \[]^$.|?*+
[1]
---
@@ -37,3 +37,7 @@ a purchase for $33
Some ${float} of cukes at {int}° Celsius
Some $3.50 of cukes at 42° Celsius
[3.5,42]
---
I select the {int}st/nd/rd/th Cucumber
I select the 3rd Cucumber
[3]
1 change: 1 addition & 0 deletions cucumber-expressions/go/.rsync
@@ -2,3 +2,4 @@
../../.templates/github/ .github/
../../.templates/go/ .
../examples.txt examples.txt
../testdata .
147 changes: 147 additions & 0 deletions cucumber-expressions/go/ast.go
@@ -0,0 +1,147 @@
package cucumberexpressions

import (
	"strings"
	"unicode"
)

const escapeCharacter rune = '\\'
const alternationCharacter rune = '/'
const beginParameterCharacter rune = '{'
const endParameterCharacter rune = '}'
const beginOptionalCharacter rune = '('
const endOptionalCharacter rune = ')'

type nodeType string

const (
	textNode        nodeType = "TEXT_NODE"
	optionalNode    nodeType = "OPTIONAL_NODE"
	alternationNode nodeType = "ALTERNATION_NODE"
	alternativeNode nodeType = "ALTERNATIVE_NODE"
	parameterNode   nodeType = "PARAMETER_NODE"
	expressionNode  nodeType = "EXPRESSION_NODE"
)

type node struct {
	NodeType nodeType `json:"type"`
	Start    int      `json:"start"`
	End      int      `json:"end"`
	Token    string   `json:"token"`
	Nodes    []node   `json:"nodes"`
}

func (node node) text() string {
	builder := strings.Builder{}
	builder.WriteString(node.Token)

	if node.Nodes == nil {
		return builder.String()
	}

	for _, c := range node.Nodes {
		builder.WriteString(c.text())
	}
	return builder.String()
}

type tokenType string

const (
	startOfLine    tokenType = "START_OF_LINE"
	endOfLine      tokenType = "END_OF_LINE"
	whiteSpace     tokenType = "WHITE_SPACE"
	beginOptional  tokenType = "BEGIN_OPTIONAL"
	endOptional    tokenType = "END_OPTIONAL"
	beginParameter tokenType = "BEGIN_PARAMETER"
	endParameter   tokenType = "END_PARAMETER"
	alternation    tokenType = "ALTERNATION"
	text           tokenType = "TEXT"
)

type token struct {
	Text      string    `json:"text"`
	TokenType tokenType `json:"type"`
	Start     int       `json:"start"`
	End       int       `json:"end"`
}

var nullNode = node{textNode, -1, -1, "", nil}

func isEscapeCharacter(r rune) bool {
	return r == escapeCharacter
}

func canEscape(r rune) bool {
	if unicode.Is(unicode.White_Space, r) {
		return true
	}
	switch r {
	case escapeCharacter:
		return true
	case alternationCharacter:
		return true
	case beginParameterCharacter:
		return true
	case endParameterCharacter:
		return true
	case beginOptionalCharacter:
		return true
	case endOptionalCharacter:
		return true
	}
	return false
}

func typeOf(r rune) (tokenType, error) {
	if unicode.Is(unicode.White_Space, r) {
		return whiteSpace, nil
	}
	switch r {
	case alternationCharacter:
		return alternation, nil
	case beginParameterCharacter:
		return beginParameter, nil
	case endParameterCharacter:
		return endParameter, nil
	case beginOptionalCharacter:
		return beginOptional, nil
	case endOptionalCharacter:
		return endOptional, nil
	}
	return text, nil
}

func symbol(tokenType tokenType) string {
	switch tokenType {
	case beginOptional:
		return string(beginOptionalCharacter)
	case endOptional:
		return string(endOptionalCharacter)
	case beginParameter:
		return string(beginParameterCharacter)
	case endParameter:
		return string(endParameterCharacter)
	case alternation:
		return string(alternationCharacter)
	}

	return ""
}

// purpose describes what a token type delimits, for use in error messages.
func purpose(tokenType tokenType) string {
	switch tokenType {
	case beginOptional:
		return "optional text"
	case endOptional:
		return "optional text"
	case beginParameter:
		return "a parameter"
	case endParameter:
		return "a parameter"
	case alternation:
		return "alternation"
	}

	return ""
}