Slow Regex HTML_BLOCK_ELEMENT_R because of issue with self closing tags #546
Comments
Could you check again with the latest code? There was a sorting issue in the rules that might have contributed to this problem. I did perf-test this particular change and got inconclusive results: https://jsperf.app/joribi/1/preview
Thank you very much for looking into it. It seems the sorting fix addresses our main concern. The regex's performance improvement was only visible in large examples with a lot of text after the self-closing element. I tested again and can no longer see any performance difference.
Hey, here is a repro: https://regex101.com/r/ac4mJP/1 Apparently, self-closing tags cause a runaway regex that eventually times out if there is enough content. The fix proposed here does resolve this. Would you care to reopen the issue?
@Goues, if you run the adjusted regex against the unit tests it bails too early, but it is a lot faster. Working on finding a happy medium. Worth noting that the OP regex is not current (there's no /^ *(?!<[a-z][^ >/]* ?\/>)<([a-z][^ >/]*) ?([^>]*)>\n?(\s*(?:<\1[^>]*?>[\s\S]*?<\/\1>|(?!<\1\b)[\s\S])*?)<\/\1>(?!<\/\1>)\n*/i
Ok I found a variation that works better
Thanks all, will get this into v7.
Closes #546 Thank you @devbrains-com for contributing the basis of this fix!
We found that the following regex is very slow: it takes up to 50 ms with a single self-closing tag on the page.
const HTML_BLOCK_ELEMENT_R = /^ *(?!<[a-z][^ >/]* ?\/>)<([a-z][^ >/]*) ?([^>]*)\/{0}>\n?(\s*(?:<\1[^>]*?>[\s\S]*?<\/\1>|(?!<\1)[\s\S])*?)<\/\1>\n*/i
The reason seems to be a non-working check for self-closing tags: \/{0} matches exactly zero slashes, i.e. the empty string, so it never rejects anything. The final regex would be:
const HTML_BLOCK_ELEMENT_R = /^ *(?!<[a-z][^ >/]* ?\/>)<([a-z][^ >/]*) ?((?:[^>]*[^/])?)>\n?(\s*(?:<\1[^>]*?>[\s\S]*?<\/\1>|(?!<\1)[\s\S])*?)<\/\1>\n*/i
Thank you very much
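To make the difference concrete, here is a minimal sketch that isolates the opening-tag part of both patterns and runs them on a self-closing tag with attributes. The names OLD_OPEN_TAG_R and NEW_OPEN_TAG_R and the sample inputs are made up for this illustration; only the regex text is taken from the issue, and this is not necessarily the variant that ended up in the library.

```javascript
// Opening-tag fragments lifted from the two regexes in this issue.
// Original: \/{0} means "exactly zero slashes", so it always matches the
// empty string and filters nothing.
const OLD_OPEN_TAG_R = /^<([a-z][^ >/]*) ?([^>]*)\/{0}>/i
// Proposed: the attribute text, if present, must not end in "/".
const NEW_OPEN_TAG_R = /^<([a-z][^ >/]*) ?((?:[^>]*[^/])?)>/i

// Full proposed regex, copied verbatim from the issue.
const HTML_BLOCK_ELEMENT_R = /^ *(?!<[a-z][^ >/]* ?\/>)<([a-z][^ >/]*) ?((?:[^>]*[^/])?)>\n?(\s*(?:<\1[^>]*?>[\s\S]*?<\/\1>|(?!<\1)[\s\S])*?)<\/\1>\n*/i

// Hypothetical sample inputs, not taken from the issue.
const selfClosing = '<img src="a.png" />'
const opening = '<div class="x">'
const block = '<div class="x">hi</div>\n'

// The old fragment treats the self-closing tag as an opening tag, which is
// what sends the full regex hunting for a closing tag that never comes.
const oldAcceptsSelfClosing = OLD_OPEN_TAG_R.test(selfClosing) // true
// The new fragment rejects it immediately.
const newAcceptsSelfClosing = NEW_OPEN_TAG_R.test(selfClosing) // false
// Ordinary opening tags are still accepted.
const newAcceptsOpening = NEW_OPEN_TAG_R.test(opening) // true
// The full proposed regex still matches a normal block element.
const blockMatch = HTML_BLOCK_ELEMENT_R.exec(block)

console.log({
  oldAcceptsSelfClosing,
  newAcceptsSelfClosing,
  newAcceptsOpening,
  tag: blockMatch && blockMatch[1],
  content: blockMatch && blockMatch[3],
})
```

The leading negative lookahead in the full regex only catches self-closing tags without attributes, such as <br />, so a tag like the img example above is where the attribute check matters.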