You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It appears that special characters are not handled correctly in Filters. For example, changing the name to "Bílbo Bággins" by adding accents here causes the test to fail:
line 1:18 token recognition error at: '"Bí'
line 1:21 extraneous input 'lbo' expecting CompValue
org.apache.directory.scim.spec.filter.FilterParseException: Failed to parse filter: name.givenName EQ "Bílbo" AND name.familyName EQ "Bággins"
at org.apache.directory.scim.spec.filter.Filter.parseFilter(Filter.java:92)
at org.apache.directory.scim.spec.filter.Filter.setFilter(Filter.java:70)
at org.apache.directory.scim.spec.filter.Filter.<init>(Filter.java:55)
at org.apache.directory.scim.spec.filter.FilterBuilderTest.testSimpleAnd(FilterBuilderTest.java:39)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: java.lang.IllegalStateException: failed to parse at line 1:21 due to extraneous input 'lbo' expecting CompValue
at org.apache.directory.scim.spec.filter.Filter$1.syntaxError(Filter.java:81)
at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)
at org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544)
at org.antlr.v4.runtime.DefaultErrorStrategy.reportUnwantedToken(DefaultErrorStrategy.java:377)
at org.antlr.v4.runtime.DefaultErrorStrategy.singleTokenDeletion(DefaultErrorStrategy.java:548)
at org.antlr.v4.runtime.DefaultErrorStrategy.recoverInline(DefaultErrorStrategy.java:467)
at org.antlr.v4.runtime.Parser.match(Parser.java:208)
at org.apache.directory.scim.spec.filter.FilterParser.filterExpression(FilterParser.java:459)
at org.apache.directory.scim.spec.filter.FilterParser.filter(FilterParser.java:128)
at org.apache.directory.scim.spec.filter.Filter.parseFilter(Filter.java:86)
... 6 more
The text was updated successfully, but these errors were encountered:
I only had a minute to poke around at this, but my guess is it's related to the ANTLR grammar, the def for STRING looks like it's only parsing ASCII, and would need to be improved.
STRING
: '"' (ESC | SAFECODEPOINT)* '"'
;
fragment ESC
: '\\' (["\\/bfnrt] | UNICODE)
;
fragment UNICODE
: 'u' HEX HEX HEX HEX
;
fragment HEX
: [0-9a-fA-F]
;
fragment SAFECODEPOINT
: ~ ["\\\u0000-\u001F]
;
I did a quick replacement in src/main/antlr4/imports/Json.g4 and it seemed to fix the parsing error, but test failed still because of an equality check of a filter with B\u00EDlbo and Bílbo. (my guess is test needs to be tweaked a bit 🤷, but I don't have time to dig into it more today)
Anyway, great find, hopefully the above info helps!
- Updated ANTLR4 grammar to support unicode chars
- Added test
- Removed logic that escapes/unescapes JSON values, should is handled by at the JSON parser (jackson)
Fixes: #564
It appears that special characters are not handled correctly in Filters. For example, changing the name to "Bílbo Bággins" by adding accents here causes the test to fail:
https://github.com/apache/directory-scimple/blob/develop/scim-spec/scim-spec-schema/src/test/java/org/apache/directory/scim/spec/filter/FilterBuilderTest.java#L33-L40
The text was updated successfully, but these errors were encountered: