Support ECMAScript unicode-mode RegExp usage for 'pattern' and 'patternProperties' #353

Closed
djgoku opened this issue Nov 14, 2023 · 7 comments · Fixed by #511

Comments

@djgoku
Contributor

djgoku commented Nov 14, 2023

Rather than duplicate everything here, I'll link to the issue that describes my problem.

If you want me to copy anything for this issue I can.

awslabs/amazon-ecs-intellisense-schema#8

I am looking to see how I can support Python Unicode regex so I can use this json schema.

@sirosen
Member

sirosen commented Nov 14, 2023

If I have understood the issue correctly, this is a matter of a schema using an ECMA regex syntax which python (the language) does not support.

Luckily, we're already using regress here to provide ECMA-compatible regex support for format. That means I need to know what type of match is failing, and then work through implementing the right support.

@djgoku
Contributor Author

djgoku commented Nov 15, 2023

Thanks for looking into this. I am open to helping if I can. I'll try to check out the errors again tomorrow.

@sirosen changed the title from "Python Unicode regex support" to "Support ECMAScript unicode-mode RegExp usage for 'pattern' and 'patternProperties'" on Dec 6, 2023
@sirosen
Member

sirosen commented Dec 6, 2023

To put all the information for this in one place (with some fun emojis):

@Rojax

Rojax commented Oct 17, 2024

I'm having a similar issue described here: usnistgov/metaschema#770.

In short:
Having this

"pattern": "^(\\p{L}|_)(\\p{L}|\\p{N}|[.\\-_])*$" 

results in this

Error: schemafile was not valid: '^(\\p{L}|_)(\\p{L}|\\p{N}|[.\\-_])*$' is not a 'regex'
Failed validating 'format' in metaschema['properties']['definitions']['additionalProperties']['properties']['pattern']:
    {'type': 'string', 'format': 'regex'}
On schema['definitions']['TokenDatatype']['pattern']:
    '^(\\p{L}|_)(\\p{L}|\\p{N}|[.\\-_])*$'
SchemaError: '^(\\p{L}|_)(\\p{L}|\\p{N}|[.\\-_])*$' is not a 'regex'
Failed validating 'format' in metaschema['properties']['definitions']['additionalProperties']['properties']['pattern']:
    {'type': 'string', 'format': 'regex'}
On schema['definitions']['TokenDatatype']['pattern']:
    '^(\\p{L}|_)(\\p{L}|\\p{N}|[.\\-_])*$'
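For anyone wanting to see the underlying mismatch in isolation, here is a minimal sketch using only Python's standard library. It checks whether the built-in `re` module accepts the pattern above. The helper name `compiles_with_python_re` is made up for illustration and is not part of check-jsonschema; the behavior described is what I'd expect on current CPython releases, where `\p{...}` is not a recognized escape.

```python
import re

# The pattern from the schema above, as it looks after JSON decoding
# (each doubled backslash in the JSON source becomes a single backslash).
PATTERN = r"^(\p{L}|_)(\p{L}|\p{N}|[.\-_])*$"


def compiles_with_python_re(pattern: str) -> bool:
    """Return True if Python's built-in `re` module accepts the pattern.

    Hypothetical helper for illustration only.
    """
    try:
        re.compile(pattern)
    except re.error:
        # Unknown escapes made of ASCII letters (such as \p) are rejected
        # by `re` rather than being treated as literals.
        return False
    return True


if __name__ == "__main__":
    print("unicode property pattern accepted:", compiles_with_python_re(PATTERN))
    print("plain ASCII pattern accepted:", compiles_with_python_re(r"^[A-Za-z_][A-Za-z0-9._-]*$"))
```

The pattern is valid for an ECMAScript engine running in unicode mode, so a failure here points at the regex dialect rather than at the schema.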

@sirosen
Member

sirosen commented Oct 20, 2024

I took some time today to do the internal restructuring I've been meaning to do, in order to make this possible. It's most of the way there, but I've hit a bit of a strange case with custom validators which I need to sort out. And I still need to put together a good test case to verify my new work.

I think this will not work with arbitrary custom validators, at least for the initial version. In order to attach the alternate pattern validation to a validator class, I'm using the extend API. In theory, a custom validator class could have changes which that API will not preserve. I need to work out how to document this, since it's subtle.

And the other notable thing here is that this is a change to pattern but not patternProperties, at least at the moment. The two are different and each requires its own implementation, though they can share some bits.

@sirosen
Member

sirosen commented Jan 8, 2025

I've just released v0.31.0, which uses unicode-mode JS regexes by default! 🎉

You can control the behavior with the (new) --regex-variant flag, e.g., --regex-variant nonunicode or --regex-variant python.
As always, I hope everyone gets benefit from the new feature and lets me know if they see any issues!
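As a rough sketch of how the flag might be wired into a script, the snippet below only assembles the command line and prints it; it does not run anything. The `build_command` helper and the file names `schema.json` and `instance.json` are placeholders invented for this example. The variant names are the two mentioned above.

```python
import shlex
from typing import List, Optional


def build_command(
    schemafile: str,
    instancefile: str,
    regex_variant: Optional[str] = None,
) -> List[str]:
    """Assemble a check-jsonschema invocation as an argument list.

    Hypothetical helper; when regex_variant is None the flag is omitted,
    so the tool's default regex behavior applies.
    """
    cmd = ["check-jsonschema", "--schemafile", schemafile]
    if regex_variant is not None:
        # e.g. "nonunicode" or "python", as named in the release note above
        cmd += ["--regex-variant", regex_variant]
    cmd.append(instancefile)
    return cmd


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(shlex.join(build_command("schema.json", "instance.json", "python")))
```

Passing the resulting list to `subprocess.run` would execute it, assuming check-jsonschema v0.31.0 or later is installed.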

@Rojax

Rojax commented Jan 8, 2025

Just tested v0.31.0 and can confirm that my issue described in #353 (comment) is now gone. Thank you!
