User-Agent sniffing broken by Chrome 100 and Firefox 100 #707
Comments
Thanks for reporting! I was considering pinning an older version of werkzeug, but that may not be the best solution given this issue.
The werkzeug developers suggest using https://github.com/ua-parser/uap-python instead (see pallets/werkzeug#2078). I was also thinking about simply inverting the browser sniffing check, so that object proxies are used by default and the non-proxy rewrites are used only on browsers known not to support them (MSIE, Chrome < 49, Firefox < 44, etc.).
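A minimal sketch of the inverted check suggested above, for illustration only. The function name `allows_obj_proxy`, the regular expressions and the `MIN_PROXY_VERSION` table are hypothetical stand-ins rather than pywb's or uap-python's actual code; the only values taken from this thread are the version thresholds (MSIE, Chrome < 49, Firefox < 44).

```python
import re

# Minimum major versions assumed to support JS object proxies, taken from
# the thresholds mentioned in this thread (Chrome < 49, Firefox < 44).
MIN_PROXY_VERSION = {"chrome": 49, "firefox": 44}

# Hypothetical, simplified patterns. A real parser such as uap-python
# covers far more browsers and edge cases than these two regexes.
_BROWSER_RE = re.compile(r"(Chrome|Firefox)/(\d+)", re.IGNORECASE)
_MSIE_RE = re.compile(r"MSIE |Trident/")


def allows_obj_proxy(ua_string):
    """Return False only for browsers known not to support object proxies."""
    if not ua_string:
        return True  # no user-agent: default to proxy-based rewriting
    if _MSIE_RE.search(ua_string):
        return False  # Internet Explorer never supported Proxy
    match = _BROWSER_RE.search(ua_string)
    if match is None:
        return True  # unknown browser: default to proxy-based rewriting
    family = match.group(1).lower()
    major = int(match.group(2))  # integer comparison, so 100 > 99
    return major >= MIN_PROXY_VERSION[family]
```

With this shape, an unrecognised or future version string falls through to the modern code path instead of the legacy one, so a parser that stops recognising new releases would no longer downgrade current browsers.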
Yep, was thinking the same thing, working on a fix.
…ency Update (2.6.6) (#708)
* js rewriting: default to modern js-proxy based rewriting, use legacy rewriting only if browsers are older than the minimum, as suggested in #707
* user-agent detection: use ua_parser for user-agent detection instead of the obsolete werkzeug.useragent, which also did not support browsers >=100
* tests: additional tests for rewriting with various user-agents, defaulting to new-style rewriting for unknown browsers
* dockerfile: update Dockerfile to use py3.8
* tests: skip s3 tests dependent on commoncrawl data (for now, need better s3 tests)
* bump to 2.6.6, update CHANGES
Fixed in the 2.6.6 release!
Retested our problem sites with pywb 2.6.6 and they are indeed fixed. Thanks very much. :-)
Describe the bug
Pywb sniffs the user-agent in RewriterWithJSProxy.ua_allows_obj_proxy to decide whether it can use object proxies for JS rewriting. This appears to have broken with Chrome 100 and Firefox 100:
$ curl -s -H'User-Agent: Chrome/99' http://localhost:8080/test/20220411050413mp_/http://example.org/test.html | grep 123
foo.location = 123;
$ curl -s -H'User-Agent: Chrome/100' http://localhost:8080/test/20220411050413mp_/http://example.org/test.html | grep 123
foo.WB_wombat_location = 123;
$ curl -s -H'User-Agent: Firefox/99' http://localhost:8080/test/20220411050413mp_/http://example.org/test.html | grep 123
foo.location = 123;
$ curl -s -H'User-Agent: Firefox/100' http://localhost:8080/test/20220411050413mp_/http://example.org/test.html | grep 123
foo.WB_wombat_location = 123;
This causes many pages in the wild to redirect to a bogus URL on load, including:
https://www.climate200.com.au/
https://www.news.com.au/
https://www.abc.net.au/news/
Steps to reproduce the bug
Expected output: OK
Actual output: URL Not Found http://example.org/123
Environment
Working: Linux Chrome v99.0.4844.84
Not working: Linux Chrome v100.0.4896.60
Not working: Linux Chrome v100.0.4896.75
Working: Linux Firefox 99.0
Not working: Linux Firefox Developer Edition 100.0b2
Additional Context
The user agent parsing is done by werkzeug.useragents.UserAgent, which appears to be deprecated and has been removed in newer versions of werkzeug.