Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewriting of local variable 'location' can incorrectly introduce a redirect #684

Closed
ato opened this issue Nov 26, 2021 · 2 comments
Closed

Comments

@ato
Copy link
Contributor

ato commented Nov 26, 2021

Expected behavior

Replaying https://anzac100.initiatives.qld.gov.au/legacy-projects/bcc/index.aspx should not redirect to https://anzac100.initiatives.qld.gov.au/.

What actually happened

The script at https://anzac100.initiatives.qld.gov.au/assets/v3/script/qgov-environment.js contains the following code (reformatted for readability):

var productionSites = ['http://anzac100.initiatives.qld.gov.au','http://www.anzac100.initiatives.qld.gov.au'];
var swe = {isProduction: (function (location) {
    var i;
    location = location.protocol + '//' + location.hostname;
    for (i = 0; i < productionSites.length; i++) {
        if (productionSites[i] === location) {
            return true;
        }
    }
    return false;
}(window.location)}

Note that while the location parameter is a reference to window.location, the location = assignment to it is not a redirect, it just changes the value of the local variable.

Pywb however rewrites the location assignment line to this double assignment in which the .href assignment triggers a redirect to the base hostname:

location = ((self.__WB_check_loc && self.__WB_check_loc(location)) || {}).href = location.protocol + '//' + location.hostname;

We've worked around this by blocking access to the script in question as it's not needed for the site to function (it's just for analytics). I'm reporting this not because I expect we can fix it but as another interesting example we ran into in the wild that looks like it would be hard to rewrite correctly without a full JavaScript parser that can work out that 'location =' in this context is a local assignment.

Browser

  • OS: Linux
  • Browser Firefox
  • Version 94.0.2
@ikreymer
Copy link
Member

Thanks for sharing this. Yeah, this is a tricky edge case, luckily not something that happens very often. The location = assignment triggers an assignment only when the location is a global variable, even if the object is the same as the global one, and there's not really a definitive way to determine that.

Eg.

function no_reload (location) {
  location = location.href;
}
no_reload(location);

function reload() {
  location = location.href; // this will cause a reload
}
reload()

The best workaround I've found is to pass the arguments in to __WB_check_loc(location), and then skip the assignment if the one of the arguments matches the location. This will at least address this particular issue, but not generally, if the location is bound to a different closure, but hopefully that is less common..

ikreymer added a commit to webrecorder/wombat that referenced this issue Nov 26, 2021
…object is in the arguments array and skip to avoid some false positive location assignments. partial fix for webrecorder/pywb#684
ikreymer added a commit that referenced this issue Nov 26, 2021
…to allow skipping assignment for local 'location's, partial fix to #684
ikreymer added a commit that referenced this issue Nov 26, 2021
ikreymer added a commit to webrecorder/wombat that referenced this issue Nov 27, 2021
* location assignment check: when assigning to location, check if same object is in the arguments array and skip to avoid some false positive location assignments, partial fix for webrecorder/pywb#684

* test: add overrides test to ensure local assign via __WB_check_loc() does not trigger redirect

* bump to 3.3.5
ikreymer added a commit to webrecorder/wabac.js that referenced this issue Nov 27, 2021
…cation that is a local var, supported via wombat 3.3.5

- add 'let arguments;' to top-level override if used in top-level scope. partial fix related to issue in webrecorder/pywb#684
ikreymer added a commit to webrecorder/wabac.js that referenced this issue Nov 28, 2021
* refactor request parsing and post-to-get conversion into separate ArchiveRequest object

* loading improvements:
- blockloaders: httploader: support retrying fetch if 429 received, up-to 20 retries, with 2sec increased backoff for each retry
- web worker loading fix: various fixes to ensure loading in separate web worker (as opposed to the main service worker) is still working.
- warcloader progress update: send update on total number of records loaded, support extra message from progress update
- dependency: update to warcio 1.5.0, faster WARC parsing

* rewrite fix: pass 'arguments' to __WB_check_loc() to guard against location that is a local var
- add 'let arguments;' to top-level override if used in top-level scope. partial fix related to issue in webrecorder/pywb#684
- dependency: update to wombat 3.3.5

* proxy origin mode support: if service worker query param 'proxyOriginMode' and root collection are set, the service worker operates in proxy mode on a single origin, without rewriting and loading from root collection. Relative URLs on current origin are rewritten to absolute urls using the 'extraConfig.proxyOrigin' setting, which must be provided for the root collection. Allows hosting a web archive on an existing URL (experimental)

* bump to 2.10.0-beta.0
ikreymer added a commit to webrecorder/replayweb.page that referenced this issue Nov 28, 2021
- use dedicated web worker for loading, remove when done
- update to warcio 1.5.0 for faster loading of large WARCs
- show WARC records loading for better progress update

* fidelity: fix potentially invalid location redirect via wombat 3.3.5 (see: webrecorder/pywb#684)

* dependencies: wabac.js 2.10.0-beta.0 (includes warcio 1.5.0, wombat 3.3.5)

* bump to 1.5.6, update CHANGES
ikreymer added a commit that referenced this issue Dec 23, 2021
* template/custom env var fix:
- ensure pywb.host_prefix, pywb.app_prefix and pywb.static_prefix set for all requests via prepare_env()
- ensure X-Forwarded-Proto is accounted for in pywb.host_prefix
- call prepare_env() in handle_request(), and also in rewriterapp (in case using a different front-end app).

* update wombat to 3.3.6 (includes partial fix for #684)
* bump version to 2.6.3
ikreymer added a commit that referenced this issue Dec 23, 2021
location rewrite: pass 'arguments' to rewrite func to guard against rewriting local 'location' in some cicrumstances, partial fix for #684
ci: add automated docker push on new v-* tag
ikreymer added a commit that referenced this issue Dec 23, 2021
CHANGES: update changes for 2.6.3

location rewrite: pass 'arguments' to rewrite func to guard against rewriting local 'location' in some circumstances, partial fix for #684

ci: add automated docker push on new v-* tag
@ikreymer
Copy link
Member

This is now fixed in 2.6.3, as well as changes in wombat and also in wabac.js
It is not a general fix, but fixes this particular case, namely by passing arguments, eg. __WB_check_loc(location, arguments)`.

This way, that function can check if the location variable is a passed in argument, and in that case, no navigation happens. This is consistent with assigning to a local variable above.

Of course, this only works if the location is in the immediate function, but not if it is not part of a parent closure...
But, at least it addresses this particular issue.

This can be tested with loading localhost:8080/live/https://anzac100.initiatives.qld.gov.au/legacy-projects/bcc/index.aspx with pywb 2.6.3.

rebeccacremona pushed a commit to rebeccacremona/replayweb.page that referenced this issue Jun 10, 2022
- use dedicated web worker for loading, remove when done
- update to warcio 1.5.0 for faster loading of large WARCs
- show WARC records loading for better progress update

* fidelity: fix potentially invalid location redirect via wombat 3.3.5 (see: webrecorder/pywb#684)

* dependencies: wabac.js 2.10.0-beta.0 (includes warcio 1.5.0, wombat 3.3.5)

* bump to 1.5.6, update CHANGES
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants