-
Notifications
You must be signed in to change notification settings - Fork 7.3k
vm module correction #3042
Comments
Thanks for the detailed writeup, Isaac. I did some work on this a couple of weeks ago, and made some good progress. It shouldn't be too hard to realize what you're asking for here (I think my current implementation meets most of these requirements). My work is here: https://github.com/brianmcd/node/commits/vm-contexts . I'm a student trying to finish my thesis, so I've just been swamped lately and haven't had time to clean up the code, check all the edge cases, etc. I will dedicate some time to this on Tuesday or Wednesday this week, and see if I can clean up my work and get it into something worth discussing. |
If security is important, then VM contexts are the wrong approach, it's not what they're designed for. You need process isolation plus something like seccomp. |
@bnoordhuis Yes, that is true. But, just because VM contexts are not an entire security solution, doesn't mean that they aren't part of a valid security solution. In these troubled times, when it seems the entire news industry has abandoned the purity of HTML markup, if you want to scrape a lot of websites, you have to evaluate their javascript in order to get the actual content. If the vm module doens't work for jsdom, it doesn't work. |
@brianmcd Thanks for the link. If you feel like working on this, of course, it'd be most welcome, but my main goal with this issue is to delineate what needs to be done, not to push too hard for you to do it if you've got other stuff going on. There's not a huge rush to get this finished. |
Much of the criticism around node is due to its single threaded nature. Though some problem is solved by "cluster" module, there are many issues around leaks due to gc not getting time to run. |
Isolates are gone, and there is no plan for them to return. Let's try to keep this issue single-threaded. |
Contextify actually meets most/all of these requirements as is. I was able to meet the above requirements and replicate the vm module (sans Script) using Contextify's .node. The js bindings it has are what introduce the leaking of its functions but that's unnecessary. In this example I've used a WeakMap which ensures proper garbage collection, but it's possible to mostly meet the needs of this using two arrays, or a C++ land makeWeak. https://gist.github.com/2317834 Caveat: to make vm.dispose work full would require help from C++ land or use of a Proxy. Actually, A lot of these problems become solvable entirely in JS with proxies... All that's need from C++ land is the ability to turn arbitrary objects into the global variable record for a chunk of code. (you can actually do this right now using |
Also as a fork of Contextify. https://github.com/Benvie/contextify/blob/vm-module/lib/contextify.js |
I realized I misread the first requirement and have made an alternate version. In this one the sandboxes global is never actually returned by createContext, just the input sandox (or newly created {}). runInContext is also modified so that it simply takes any object and either creates a context based on it or uses an existing one. https://gist.github.com/2324128 First off this means vm.dispose(sandbox) works without any help from C++ land. The one reference to the context wrapper with Mirroring it so that references to the global, such as in vm.runInContext('({ a: this })', sandbox) are translated requires C++ land help (or Proxies). Two or more contexts at once will have references to the containing object so it can't simply be modified in place. It looks like briandmcd's recent work addresses this. |
This would properly also fix this issue #706 var vm = require('vm');
var fn = function (check) { return check instanceof RegExp; };
vm.runInNewContext("fn(/hallo/)", {fn: fn});
// false
vm.runInNewContext("/hallo/ instanceof RegExp", {});
// true |
Not getting to this in 0.10, unfortunately. Some day :) |
@isaacs I would talk to the SES guys, they seem to have this pretty well covered. May need to figure out how to drop the strict mode etc. |
Let's get to this in 0.11!??!! jsdom would love you forever... |
@domenic Patches welcome ;) |
The vm module has a lot of problems: https://github.com/joyent/node/issues/search?utf8=%E2%9C%93&q=vm
Furthermore, while it's useful in some scenarios, it's not suitable for what I consider a primary use case for a JavaScript virtual machine, what the JavaScript language was literally designed for: building sandboxes to run arbitrary code in a suitably low-risk manner1, and communicate over proscribed channels with this arbitrary code (eg, via the DOM api.)
Requirements:
I consider these the requirements of a correct JavaScript vm implementation. Our current one is broken.
A context should be indistinguishable from the object it's based on.
Currently:
Both should return true. This makes it unnecessarily difficult to have links to the global context within the context. As far as I can tell, this is the only way:
However, note that this breaks the link from outside. So, this blocks communication with the child context about any links that should point to the context object.
This prevents easy simulation of either node.js or DOM environments.
Items added to the context from anywhere, should be immediately visible to all programs that can access that context. Example:
This should print "baz" a few times, and then start printing "bloo" once it's updated in the parent context.
As mentioned in Fix issues with v8 contexts in vm module #3039, the vm internal architecture has some problems which cause issues which the Locker object exposes when in debug mode. We need a cleaner way to know when to dispose of the context objects.
@brianmcd's
contextify
module exhibits the correct behavior in the cases that I've tested. However, thecontextify
API is not ideal: it adds properties onto the object, and requires user interaction to manually dispose of the context, making it not quite as easy to use as it could be. Also, it leaks the contextify methods into the child context:We should try to stay as close as we can to the current JavaScript API, but with correct semantics and resource management. Also, more of the logic should be moved out of C++ and into lib/vm.js.
Creating a new context:
var c = vm.createContext({console:console})
Creating a link to the global in the context:
c.global = vm.getGlobal(c)
Iterating over keys in a context:
for (var i in c) console.log(i); // prints "console" and "global", only
Any changes to the global object in the child context, or to c itself in the parent context, is immediately seen in the other.
Disposing of a context happens automatically when the Context object is gc'ed, but can be called explicitly by
vm.dispose(c)
Cyclical links back to the root object should be detected at the outset, and replaced with
vm.getGlobal(obj)
Not sure how we'd handle cases where the supplied object is already the instanceof some other class. It would be acceptable for
vm.createContext(obj)
to throw ifObject.getPrototypeOf(obj) !== Object.prototype
, and simpler to deal with than trying to paper over the discrepancy. The understanding should be that it will modify the existing object in-place.Not sure how we'd handle functions that return the context object directly, when called from inside the context. Consider:
There are probably much more I'm not thinking of right now. Any fix for this should come along with a very thorough set of this-assignment edge case tests.
/cc @laverdet @piscisaureus @bnoordhuis @brianmcd @tmpvar
The history of browsers has shown us that, in fact, this is much harder than only removing access to various system APIs, but also usually requires isolating potentially nefarious code even further so that its process can be killed if it behaves badly.
The text was updated successfully, but these errors were encountered: