gas/metering, terminate runaway vats #516
Comments
Metering gets even more interesting when we consider replicated consistency (chain-based swingset machines). As discussed in today's meeting, we're likely to land in one of two worlds:
|
For reference, pack of watchdogs paper is at https://medium.com/@erights/a-pack-of-watchdogs-is-cheaper-than-gas-7e118edfb4cc |
I do not know if this belongs here or not, but here goes: Control transfer upon a watchdog bark has traditionally been What this src2src transform does is equivalent to down-ticking a watchdog As a src2src transform is being done anyway, why not instrument all assignments to capture state updates done during an event-loop turn? Hope this gave some insight. |
For an initial, hacky implementation in SwingSet that would at least prove the concept, I propose adding an endowment to all static vats that allows swapping the global meter, and doing the resetting of the meter in the kernel. This would not be any worse than the status quo (static vats can cause the kernel to hang), and dynamic code would be explicitly loaded into a metering-enabled environment that does not have access to this endowment. This mechanism will soon be replaced by the proper implementation: metering as an option to dynamic vat creation (still using the same kernel modifications, but not the endowment), and removal of support of the hack from Zoe and Spawner as they change to use dynamic vats. |
Updates on our metering plans:
Background
@michaelfig's The companion (the method isn't actually named If either The intended pattern is:
Current (old-SES) implementation
In current trunk, each Vat is given a special
Initial New-SES implementation
My plan for the first phase of metering under new-SES (whose goal is "do the minimum amount of work that doesn't make things noticeably worse", and yes I'm relying upon some things to escape notice) will instead provide These should be passed into the Other than passing these through, Vats will be nominally unaware of meters. They can create new metering domains with
In this phase, the unmetered top-most static vat code does not have a way to reset the "globalMeter" when it obtains control. If it calls into metered code and then gets control back again, the builtins will still be operating on the guest code's meter, which might expire at a surprising time. The current old-SES implementation allows this code to call
In this phase, Vats are not killed when a meter expires. The only consequence of a meter expiring is that exceptions are thrown by that code until the meter is refilled (which happens only, and always, at the end of the crank). Execution within the metering domain might suddenly stop at any time during a crank, without the metered code being aware of it. This could leave broken invariants lying around. It can only be caused by the execution time/etc of the metered code; however, code from some other same-vat metering domain could deliberately call into this one multiple times in a single crank, enough to provoke an underflow at some critical location. In this phase, we tolerate this possibility.
meter-per-Vat implementation
The next step is to add metering to the top-level code of all dynamic vats, and maybe also specially-marked static vats (the configuration object could have a flag to enable metering, and we'd activate it on the Zoe vat until we move to split-Zoe vat-per-contract-instance). We don't necessarily want to impose metering on vats that don't need it, because of the performance hit (which we still need to measure).
We'll probably start with a "bottomless" top-level vat meter. It would never expire, but the fact that it's Then, we'll make this into a normal meter: it expires, but the kernel refills it between cranks. In this mode, the initial vat code might be halted at surprising (invariant-violating) times, just as above.
We may then decide to react to exhaustion of the vat's top-level meter by killing the vat entirely ("death before confusion"). In the future, this may be relaxed to merely rewind the vat upon meter exhaustion (assuming some keeper/handler mechanism to avoid walking into the same hole twice), which removes the confusion problem.
Any code that wants to protect itself against exhaustion-based confusion will need to run in its own vat, enable terminate-on-exhaustion or rewind-on-exhaustion, and not share its vat with any other metering domain.
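The three top-level meter modes described above (bottomless, refilled between cranks, and kill-on-exhaustion) can be sketched roughly as follows. This is only an illustration under stated assumptions: `makeVatMeter`, `makeVat`, `runCranks`, and the "one unit per character" cost model are all hypothetical names invented here, not the SwingSet kernel API.

```javascript
// Hypothetical sketch; none of these names exist in SwingSet.
function makeVatMeter({ capacity, bottomless = false }) {
  let remaining = capacity;
  let used = 0; // a bottomless meter still records usage
  return {
    charge(cost) {
      used += cost;
      if (bottomless) return;
      remaining -= cost;
      if (remaining < 0) throw new RangeError('vat meter exhausted');
    },
    refill() { remaining = capacity; },
    getUsed() { return used; },
  };
}

function makeVat(meter) {
  return {
    meter,
    dead: false,
    // Assumed cost model: each message costs one unit per character.
    dispatch(msg, m) { m.charge(msg.length); },
  };
}

// One crank per message; the kernel refills the meter between cranks.
function runCranks(vat, messages, { killOnExhaustion }) {
  const log = [];
  for (const msg of messages) {
    if (vat.dead) {
      log.push(`${msg}: dropped`);
      continue;
    }
    try {
      vat.dispatch(msg, vat.meter);
      log.push(`${msg}: ok`);
    } catch (err) {
      if (!(err instanceof RangeError)) throw err;
      if (killOnExhaustion) {
        vat.dead = true; // "death before confusion"
        log.push(`${msg}: killed`);
      } else {
        log.push(`${msg}: halted`); // vat survives, invariants may be broken
      }
    }
    vat.meter.refill();
  }
  return log;
}
```

With a capacity of 5, the message `'a-long-message'` exhausts the meter: in refill-only mode the vat keeps receiving later messages, while in kill mode every later message is dropped.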
future meter-within-Vat implementation
For now, we're mainly trying to prevent runaway (misbehaving) contract code from denying service to other (well-behaving) contracts/instances. But sooner or later we'll want to incentivize something by "charging" more for expensive computation, and we'll be interested in measuring resource usage of execution.
At that point, vats need to become more aware of meters and their current contents. Zoe contract instance vats, in particular, will have a "ZCF" component (that acts a bit like a supervisor object), that sits next to the metered guest code. @Chris-Hibbert points out that it might seem rude to charge contract operators for the time spent by the supervisor we imposed upon them. So we may want to measure the time/space/etc consumed by each piece separately.
To support that, I'm thinking we should rearrange the Meter object a bit. The current I think we should reshape it into an object which lets you observe the level, and pass it into Some Meters could be bottomless, and/or refilled by the kernel between cranks. Others would be more bounded, and not refill automatically. The ZCF component could be given an unbounded meter to operate with, from which it can create bounded ones for the contract's metering domain.
pre-paid/post-paid messages
All of the above looks at metering applied to certain pieces of code, delimited by In Ethereum, all the gas for each transaction must come from the initial (private-key-signing) sender. It's like a clockwork vending machine with no springs: you must push the button hard enough to provide all the kinetic energy necessary to complete the computation. This makes it difficult to build interestingly complex multi-step systems. In an auction, the last bid will cause very different actions than the previous ones (maybe sending out payments and refunds), which might need more gas, but the submitter of that bid might not even know what they're about to trigger, and might not provide enough.
Ethereum contracts have dealt with this by moving to a "withdrawal" model: the bids cause local state updates (but never send funds to anyone else), and all participants are obligated to come back later to withdraw their winnings/refunds (with enough gas to cover their own needs).
I think we want to enable "spring-loaded computation", where the vending machine uses stored energy to decouple the individual event trigger's contribution from the overall machinery being activated. However, we cannot let this become a denial-of-service vector. The attacker should not be able to push the button so frequently that the mere "are we done yet" check causes our stored execution tokens to become exhausted. Exposing an object to the rest of the world means sharing some authority with the rest of the world. But you should be able to share useful authorities without also sharing a "deplete all my stored execution tokens" authority.
It may help to have the initial execution paid for by the submitter, but then allow subsequent execution to run off stored tokens. In the model above, this would be implemented by having the Vat's externally-visible objects run in a metering domain whose Meter was empty and not automatically refilled by anybody. These bastion objects would hold closely-held references to objects in a second metering domain which has a fully-charged Meter, and would forward only the requests they like. Incoming messages from other vats would have to carry purses with sufficient execution tokens to power the bastion object long enough to pass judgement and pass along the request.
Somewhere in this, we need to enable the second metering domain to protect itself against confusion. The bastion object might need to inspect the second Meter to check that it has enough tokens left to complete the action. Or the inner domain might be put into a separate vat entirely, where we can eventually use state rollback to prevent early-termination confusion.
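As a rough illustration of the bastion pattern just described, here is a minimal sketch. Everything in it is an assumption made for the example: `makeMeter`, `makeBastion`, the `request(kind, tokens)` shape, and the two cost constants are hypothetical, and a plain number stands in for a purse of execution tokens.

```javascript
// Hypothetical sketch; not an Agoric or SwingSet API.
function makeMeter(initial) {
  let level = initial;
  return {
    getLevel: () => level,
    add(tokens) { level += tokens; },
    charge(cost) {
      if (cost > level) throw new RangeError('meter exhausted');
      level -= cost;
    },
  };
}

const CHECK_COST = 1;   // assumed cost of judging one incoming request
const ACTION_COST = 50; // assumed cost of the spring-loaded inner action

function makeBastion(innerMeter) {
  // The outer domain starts empty and is never refilled automatically.
  const outerMeter = makeMeter(0);
  let settled = false;
  // Closely-held inner object, running on the stored (pre-charged) meter.
  const inner = {
    settle() {
      innerMeter.charge(ACTION_COST);
      settled = true;
      return 'settled';
    },
  };
  return {
    request(kind, tokens) {
      outerMeter.add(tokens);        // sender pre-pays for the check
      outerMeter.charge(CHECK_COST); // throws if the sender under-paid
      if (kind !== 'settle' || settled) return 'rejected';
      // Inspect the inner meter first, so the action cannot stop halfway.
      if (innerMeter.getLevel() < ACTION_COST) return 'deferred';
      return inner.settle();
    },
  };
}
```

In this sketch, a caller who sends no tokens fails inside the bastion's own domain, and a caller who sends junk requests only spends their own tokens; in neither case is the inner meter touched.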
Then we might have the second vat run its computation entirely on its own meter, relying upon the limited access to that vat to protect it against attack. And, somehow, all of this needs to be folded into an escalator scheduler, where additional tokens are spent to bid for execution priority slots. These tokens probably won't be seen by the vat at all, but rather are consumed by the scheduler. |
this is basically done |
Once #398 is implemented, we'll have a source-to-source transformation that will instrument user-supplied code with meter checks. The resulting API is still under development, but will probably involve adding a meter endowment (an integer) to the Compartment in which the rewritten code is evaluated, and watching for a RangeError to be thrown when executing the code.
We'll need to define how this is managed in the SwingSet world. We have some interesting source material to work with (KeyKOS, Meters, Keepers, Ethereum's "gas"), and some new constraints (I don't think we can synchronously invoke a Keeper while the overrunning code is paused, waiting for a decision). It's pretty cheap for us to either terminate the Vat, or allow it to run to completion. But if the keeper wants to pause it for later, we must in fact terminate it, and then reload it from a previous checkpoint (which currently requires us to replay the entire transcript, which is super expensive).
We have a lot of decisions to make about user-level control of metering questions. But the simplest place to start is a fixed number of computrons for each message delivery, where the only goal is to catch a runaway loop. We'll respond to this by unconditionally terminating the vat (#514). Higher-level code like Zoe/ContractHost will attempt to put the untrusted user-supplied code into a new Vat, so a runaway contract won't threaten Zoe's ability to maintain offer safety (in particular refund safety).
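A minimal sketch of that starting point, under stated assumptions: the instrumentation is written by hand here rather than produced by the transform from #398, the meter is a small object rather than the integer endowment mentioned above, and `makeMeter`, `deliver`, and `COMPUTRONS_PER_DELIVERY` are hypothetical names.

```javascript
// Hypothetical sketch; the real API is still under development.
const COMPUTRONS_PER_DELIVERY = 1000; // assumed fixed budget per message

function makeMeter(budget) {
  let remaining = budget;
  return {
    // Stands in for the check a transform would insert into each loop body.
    charge(cost = 1) {
      remaining -= cost;
      if (remaining < 0) {
        throw new RangeError('meter exhausted');
      }
    },
  };
}

// Roughly what instrumented `while (true) {}` might look like.
function runawayGuest(meter) {
  while (true) {
    meter.charge();
  }
}

function wellBehavedGuest(meter) {
  let sum = 0;
  for (let i = 0; i < 10; i += 1) {
    meter.charge();
    sum += i;
  }
  return sum;
}

// Deliver one message with a fresh budget and report whether the
// vat should be terminated.
function deliver(guest) {
  const meter = makeMeter(COMPUTRONS_PER_DELIVERY);
  try {
    return { terminated: false, result: guest(meter) };
  } catch (err) {
    if (err instanceof RangeError) {
      return { terminated: true, result: undefined };
    }
    throw err;
  }
}
```

Note that this sketch treats any RangeError as exhaustion, so a guest that throws its own RangeError would also be terminated; a real implementation would need a way to tell the two apart.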