Userland access to internalBinding
(at one's own risk)
#27061
Comments
i'm very -1 on exposing these bindings, the entire point of adding internalBinding was to prevent userland from accessing them. perhaps we can find another solution. what big picture functionality was process.binding giving you? |
-1 I agree it would be better to focus on actual needs rather than just blanket access to internals. |
@devsnek @mscdex Our isolation layer sits at low level in order to better interpose on APIs. We evaluate code from We're already shipping a product that uses To reiterate, I'm not suggesting that anything added here would be any more supported than |
why doesn't just replacing process.binding and the public interfaces work? the only way internal bindings are called is via the public interfaces anyway. |
@devsnek We interpose at the low level in order to implement our fine-grained policies with less maintenance and performance overhead. Many of our policies are considerably easier to implement at that level due to normalization and the way some parts of the public API interact with others internally. Either performing those normalizations ourselves or re-implementing the interactions that already happen within the |
how exactly do you intercept these calls? if you're using stack traces/context tracking then some hooks for things like "on fs read" and "on socket open" might work. |
As far as I remember the main issue we wanted to prevent was un-audited access to internals by 3rd party modules. That's why the only way to enable access was with a CLI flag. If you could think of a different way to give the process owner exclusive control, I think that is the main blocker. |
There was also the issue of whenever we changed parts of the undocumented api, huge modules with millions of dependents would break and topple the ecosystem. |
@devsnek We intercept these calls by construction: we re-evaluate |
@bengl without more information on "isolation contexts" and "intercepting any function calls" and all of that i'm not sure how much help i personally can provide. is it possible to provide some more technical detail? (or even better, code) |
Yes, that's what I meant with un-audited. The users of those modules (i.e. those who own the process) were not aware of the risk the module authors were taking on their behalf. As I see it that's the problem with "at one's own risk", who is one ¯\_(ツ)_/¯ |
Another idea that comes up once in a while is a drop-in modular stdlib. Write your own stdlib files, place them in a special place and the runtime will pick them up instead of the bundled ones. Again the key here is that only the owner can do that, and is aware of the risks and trade-offs. |
Unfortunately I can't show code, since it's not an open source product. That being said, we use isolated JavaScript environments that do not include At this point, our (Intrinsic's) options are to either somehow have access to @refack Unfortunately owner-only solutions aren't really enough for us because we support environments where the people deploying Intrinsic aren't the process owners (e.g. serverless). I certainly understand the reasoning behind introducing |
@bengl is an environment a v8 isolate? a v8 context? something else? the more information we have the better we can help. if you're using isolates, perhaps a new api on worker threads would work?
new Worker(path, {
  hooks: {
    // names and granularity tbd obviously
    fsOpen: (path) => { /* return true or false to allow operation */ },
    fsRead: (path) => {},
    // ...
  },
});
we've also got policies in development which might internally need hooks that intrinsic could take advantage of (cc @bmeck on that) |
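For comparison, an allow/deny hook of the kind suggested above can be approximated in userland by wrapping functions on a public API object. The sketch below is only an illustration under stated assumptions: `installHooks` and `fakeFs` are hypothetical names invented here, not Node.js APIs, and nothing in it reflects how a Worker `hooks` option would actually behave. It also wraps the public surface only, which is the approach described earlier in the thread as carrying more maintenance overhead than interposing at the binding level.

```javascript
'use strict';

// Hypothetical helper (not a Node.js API): wrap the named functions on
// `target` so that a policy hook runs before each call.
function installHooks(target, hooks) {
  const originals = {};
  for (const [name, allow] of Object.entries(hooks)) {
    const original = target[name];
    originals[name] = original;
    target[name] = function hooked(path, ...rest) {
      if (!allow(String(path))) {
        const err = new Error(`${name} denied by policy: ${path}`);
        err.code = 'EPERM';
        throw err;
      }
      return original.call(this, path, ...rest);
    };
  }
  // Return a function that restores the original implementations.
  return function uninstall() {
    Object.assign(target, originals);
  };
}

// In-memory stand-in for a public API such as `fs`, so the sketch runs
// without touching the disk.
const fakeFs = {
  files: { '/app/data.txt': 'hello', '/etc/secret': 'hunter2' },
  readFileSync(path) {
    if (!(path in this.files)) throw new Error(`ENOENT: ${path}`);
    return this.files[path];
  },
};

const uninstall = installHooks(fakeFs, {
  // Allow reads only under /app/.
  readFileSync: (path) => path.startsWith('/app/'),
});

console.log(fakeFs.readFileSync('/app/data.txt'));
try {
  fakeFs.readFileSync('/etc/secret');
} catch (err) {
  console.log(err.code);
}
uninstall();
```

Returning an `uninstall` function keeps the wrapping reversible, which matters if only part of a program should run under the policy.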
Just thinking, do you have access to the top level |
@devsnek We don't use Worker threads. You can think of our isolation contexts as being similar to v8 Contexts. A global hook system (that's implemented at suitable places for our needs) seems helpful and may actually present a viable path, but there are some messy difficulties that would need to be dealt with like:
And I'm not sure these would end up being finished in a Node 12 timeframe. @refack Read access for sure. We can instruct our customers to modify |
From what I can tell from the use cases, the requirements are:
I don't think exposing I am wondering, is it possible to achieve what you need by using something similar to the Lines 449 to 455 in d5a5b99
The scripts are environment entry points that has access to node/lib/internal/main/worker_thread.js Lines 20 to 23 in d5a5b99
And this is what currently available to these special scripts (note that the Lines 374 to 394 in d5a5b99
An idea I have is to accept a function or a string containing script sources in |
I'm happy to see discussion happening rather than stopping with the initial -1's. The ability to intercept at a low level is important and necessary. Yes, it's also necessary that we started hiding things behind internalBinding to help stem off userland encroaching further into Node.js core internals. I like @bengl's original idea of a native API for installing an internal binding hook. |
as long as this hook doesn't enable another library that exposes node internals, it seems fine. I'm just worried about a natives 2.0 module with like |
I think this specific information was missed in some of the conversation above. @bengl isn't requesting that js have access to internal bindings, but that there be a C++ API to do so. Also, he's OK if the API is unstable, ie, if it changes unexpectedly, as long as its usable and they can rework their own C++ to adapt to the various node.js versions. I don't have comment on the particular form of that API, but I support exposing node in ways that allow innovative modules to do unexpected (by us) things, without forcing people to rebuild and redistribute their own node variant. |
@joyeecheung This seems quite promising! It seems from what you're suggesting (please correct me if I'm wrong) that this would require that we create a new isolate/thread, initialize a |
I forgot to mention: The way we compile our addon code, we can use headers in the |
@bengl You can gain access to those just fine if the Environment is created with the |
@bengl If you have access to the internal headers (I believe it's also possible for the npm packages to do that if they download the correct headers from this repo and turn on NODE_WANT_INTERNALS though), I believe you can already try experimenting with |
@joyeecheung If I'm understanding correctly, this still involves a separate isolate, right? Our customers in serverless environments might be impacted if we need to serialize in and out of the separate isolate due to the platform-provided entrypoint for inbound requests. Cold-start times would also likely be impacted. Also, is any of this possible in Node 10? Right now we are working around it in Node 10 by re-implementing things. How are people feeling about the idea of having some function exposing the internal bindings that isn't available in JS or in |
Technically, you can create a new Environment from an existing isolate that's associated with another Environment, but I don't think we have ever tested this (cc @addaleax ) I think it is possible to run all your user code in a new Environment on another thread, while keeping the primary one for yourself and not running any user code there, then you can avoid context switching because the first isolate only exists to launch the other one, the feasibility of this depends on how your package is used though:
I do not think so, Node 10 is fairly old by master's standards, and
I think it is possible to store the internalBinding loader as a property behind a |
@joyeecheung First of all, thanks for your help here (same goes to others who have weighed in)!
Our package is something that gets required by users explicitly, and we'd like to keep it that way if at all possible. In serverless environments (e.g. Lambda), we don't even get to run before the platform (e.g. Amazon's code) has already set up inbound request handling on the main thread, so that's why I'm worried about serializing requests over to another thread/isolate. For Lambda, this may change with the Lambda Runtime API, but we also have other serverless platforms to support.
Whatever solution we land upon, if it doesn't make it into Node 10, that's probably okay, since we're already doing workarounds for Node 10, though they're not particularly ideal.
This would be a great solution for us! Other folks in this thread: how does this sound? If no one objects to this, I can start working on a PR.
If it gets broken, we (Intrinsic) would find out fairly quickly and submit a PR to fix it. That being said if there are other people doing similar things to us (I'm not aware of any, but who knows!), it may be worth having a test for it. |
I'm in favor of having some way to do this as long as it has a higher barrier to entry than process.binding(). |
I'd be wary of anything that can be accessed just by a simple module install wrapper, even this c++ binding is a simple module install to access. |
@bmeck To be fair..if they have the ability to access src headers with NODE_WANT_INTERNALS, then they already have access to env->internal_binding_loader() at least on master (and tons of other internals) |
If that was sufficient, wouldn't we not need to do anything here? |
@bmeck this is asking for v10 where |
Although, this reminds me...at least on master, one can already instantiate a new function with |
It seems like perhaps this should be closed. Feel free to re-open (or leave a comment requesting that it be re-opened) if you disagree. I'm just tidying up and not acting on a super-strong opinion or anything like that. |
At Intrinsic, we make use of process.binding('natives') in order to re-evaluate Node.js core modules inside our isolation environment. In order for these modules to work, we need to have the binding layer accessible. This is similar to how the natives module works.
The introduction of internalBinding was of no issue on its own, since we could just use process.binding to replicate its behaviour. However, there are now modules such as string_decoder that have binding parts that are inaccessible to userland. We can't simply require that users run with --expose-internals as our customers aren't always in control of CLI arguments. Also, our product is delivered as a Node.js module, rather than a separate binary, and we'd like to keep it that way.
We do have some workarounds involving re-implementing the exposed APIs, but we find this to be prone to errors and subject to extra maintenance for new versions of Node.js.
What we'd like to do is introduce a way of accessing internalBinding-provided code from userland. I know this seems counterintuitive, given the purpose of internalBinding, so it would make the most sense for it to be provided in C++ only, requiring some native code to actually get access to it. The difficulty in accessing it is an acknowledgement that we're not expecting the same level of support that the normal user-facing API has. (We would expect basically no support for the actual internal bindings, but we wouldn't expect this access to disappear.) We're certainly open to other suggestions, but we'd like a solution that's maintainable and sustainable.
What do folks think?