-
Notifications
You must be signed in to change notification settings - Fork 217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement kernel-side escalator scheduler #23
Comments
Fix Agoric#22: Remove domain property from promise
comments within 80 columns when possible
At today's kernel meeting, we tried to figure out what the simpest possible metering/scheduling mechanism might be. Some of the goals are:
The basic proposal we sketched out (which does metering but significant scheduling) is:
Vats would maintain a fairly deep stack of meter capacity, to avoid an expensive (or fatal) underflow. But they'd check on the backstop meter, and if it ever gets used, they react somehow (deny service to the calling client?). SchedulingOn top of this, we'd add the escalators. When sending a message, vats would need to indicate which meter the message would use, and the start+slope of the escalator. The most common path is to use the same meter that was used by the message which initiated the crank, but vats must be able to switch to a different one for this to be at all useful. Escalator fees are for priority, not execution cost. Each message is bidding against the other messages for the opportunity to consume one of the limited execution slots. The userspace API for this is not clear: @dtribble 's Flow scheme would provide a nice UI, but 1: we haven't implemented it here yet, and won't be able to before we need the escalators, and 2: its primary purpose is to inform ordering constraints, and we'd need to think through how to manage that when the ordering constraint says A must be delivered before B, but the escalator priorities cause B to be runnable first (does the entire Flow block until both constraints are met? Does the blocked message continue to climb the escalator and increase its bid? Probably each message should not even join the escalator until the other constraints allow it to run) A high-level question raised by @dtribble was "should fees be additive or multiplicative"? I'll defer the exact definition to him, but my understanding was that it relates to chains of messages (the customer sends a message to vat A, which spawns a dozen messages between other vats before the economic operation is complete). If the customer pays for expidited service, so their first message is delivered quickly, but the other dozen messages (not created by them, but created in response to their request) are low-priority and get delayed, then they won't have gotten the expidited service they paid for. @erights pointed out that the vats involved can and should be aware of the priority and take steps to honor the request properly: if I buy next-day parcel delivery, the service I hired is not going to sub-contract to a week-long shipper (if they want to keep my business). I think @dtribble 's "multiplicative" option means that assigning a similar high-priority to each additional "internal" message causes the overall costs to be multiplied by the number of those messages. An "additive" option should mean (I think, there was some disagreement here) that splitting a task into multiple messages (perhaps across multiple vats) should not significantly increase the overall cost over doing it all in a single message. @erights pointed out that messages crossing between distinct swingset machines (e.g. multiple chains in a pegged-currency trade) should incur extra costs, as those messages are not as cheap as intra-swingset or intra-vat ones. One approach that might achieve "additive" costs would be for meters to ride the escalators, not individual messages (so each meter could hold a number of messages). When the meter reaches the top, we run multiple messages (perhaps to different vats), adding their costs together instead of requiring each one to climb the escalator separately. Another would be to allow vats to place some number of outbound messages at the front of the run-queue (maybe they add the message to the current/inbound meter, and since that meter is still at the top of the escalator, the message runs right away). Or to give some amount of escalator credits to messages being created by the current vat, if they're small, or if the current vat hasn't run for too long. @dtribble made an analogy with Unix-style kernel schedulers (in particular KeyKOS, I think), where you bid for CPU time and "win" a 4ms slice. If the process finishes early and returns to the kernel quickly (e.g. it processed a message and returns to waiting for IO), the process gets credit for being polite, in the form of an elevated priority. The next time IO is ready and the process can run again, it will be at the top of the list. In contrast, a process which consumes its entire 4ms time slice is probably CPU bound, and is likely to do the same again next time, so it gets a lower priority. In our system, the "don't nickle-and-dime developers" goal means we don't really want to do fine-grained usage measurements. But if we had that, we could conclude "message X bought a Medium slice but only used Small", and then maybe we let new messages it creates travel on the coattails of the initial bid and execute quickly. Ideally, if all dozen related messages add together to less than Medium, they should all run immediately, rather than going back to the escalators. |
Notes from today's kernel meeting (28-oct-2020):
Some vats are "instance contracts": closely-held (typically by two-ish parties), short-lived, single-purpose. These are likely to be given a decent-sized meter to pay for their operation. Every entry point to the contract will start by switching from the delivery meter to the operational one. The analogy is that you pay up front for all the postage necessary to do business, so you don't have to provide money with each individual message. This is less prone to abuse because the only parties who can deliver messages are the participants that were known when the contract was created. Other vats are "service contracts": long-lived, widely-held. Here, we cannot rely upon the good behavior of the two-ish participants. Callers to these objects are expected to pay for their requests, by including a meter with any delivery that might not be completable within a It should be possible to run synchronous code on a separate meter. The API should look like a lexical block, maybe The meter used for sending messages, however, should be associated with the Promise that wraps the Presence for the remote object. We only have one Presence per remote object, but we can have arbitrary numbers of Promises for each one. We're thinking that each Promise has a prioritization/delivery meter associated with it, such that all messages sent through that Promise will use that meter. Tentative API:
If contract authors don't do anything special (they never call
If the contract authors don't do
For "instance contracts" which are pre-paid for their (small number of) clients, either the fallback meter is used all the time, or all entry points should For "service contracts" that need to do non-trivial work for new clients all the time, the exposed objects should accept a |
Notes from today's meeting, @dtribble and @erights describing their recent conversation:
We didn't come up with a syntax for how vat code should switch to a different meter. Ideally use of a given meter is expressed in a lexical scope. But the syscalls (or syscall arguments) needed to tell the kernel about the meter selection is dynamically scoped. We speculated about some sort of source transform: function handler(args) {
run_with_meter(meter1, {
send_thing();
send_other_thing();
});
run_with_meter(meter2, {
do_third_third();
});
} I could imagine a source-to-source transform which rewrites that to: function handler(args) {
set_meter(meter1);
send_thing();
send_other_thing();
set_meter(meter2);
do_third_third();
} but I can't imagine how it would deal with e.g. We also said that the execution costs of a message should be charged against the same meter which was used for the prioritization costs. |
@warner and @kriskowal and I looked at ...
in the context of this issue. The one conclusion I recall clearly is that no, we don't need to deal with this for a while; we should stay focused on getting the XS vat worker with snapshots is working. (#511, #2145, ...) |
Notes from a conversation with @dtribble today:
|
Notes from today's kernel meeting: Execution AllowanceAll cranks get a "small" unit of capacity, and by default are terminated if they exceed it
Price SelectionThe economics of price selection are still an open question.
Authority LayoutEach Presence is mapped to a Meter. Each Meter is mapped to a 2-tuple of Escalator, and Initial Bid. Each Escalator is defined by a 2-tuple of Slope (price increase per unit time/crank/usage/not-sure) and Maximum Initial Bid. By limiting the Meter's ability to chose an infinite Initial Bid, all messages will eventually reach the top (the "fairness" property). A reasonable starting point would be three meters, whose slopes are 1, 100, and 10000. Each message is sent to a specific Presence, which causes it to be added to an Escalator at a particular starting price. Its price then climbs over time. The actual implementation will have a separate priority queue for each Escalator, rather than tracking every message individually. Meter Balance limits the BidEach message will climb an Escalator until its price is the highest, then it gets delivered. However, the Meter associated with that message may not have the funds to pay for the claimed price. One option is to freeze that bid at the Meter's balance. However this would probably break the Another option is to invoke that Meter's "Keeper" and let it decide whether to supply additional funds, switch to a different meter, or abandon the message. Facets for different Meters on the same RemotableThere are many advantages to baking the Meter into the Presence, however two drawbacks came up:
The example we came up with was a high-priority contract message that pays to get to the top of the queue, executes, and then discovers that it needs a Brand verification or balance lookup to continue. If the contract has an Issuer Presence which has the low- or medium- priority escalator baked into it, the intermediate message will take a long time, slowing down the overall operation despite the user's expressed willingness to pay for high priority. If the contract could make a particular message use a particular Escalator, it could submit the Brand check at high-priority. If it gets multiple Presences for the Issuer, each with a different escalator, it could submit the check via the one that matches the priority level. But, in that case, it would have three different Issuer Presences, which wouldn't compare as EQ, which is necessary for other reasons. @erights suggested that we need to find an object-capability way to express the combination of priority and target. I don't know how to do that (and still maintain some useful form of EQ), but it sounds like the right goal to me, especially when you consider that being able to use a particular priority is an authority all by itself, and should be something that can be shared/attenuated through our usual patterns. Maybe something where there's a primary Issuer Presence (which you use for EQ), and a bunch of priority-bearing facets (which you use for messages), and the primary will tell you if an alleged priority facet is legitimate (so the primary and the facets have the same relationship as an Issuer and its Purses). What to Charge ForWhen a message makes it to the top of the queue and gets delivered, it had a winning price of some sort. It will have some maximum usage limit (perhaps always just one "small" unit, but extensible if the vat requests it). There will be some actual amount used, which we measure to accumulate and size the block. What should we deduct from the message's Meter?
We might charge them the full usage limit (because they had the ability to consume it all, before anyone else, and that's a scarce resource). We might charge them only for what they actually used, because the remaining time can still be used by others. We might charge them for the full limit but then buy back the remaining time at a discount. Or we might just charge a fixed amount and leave the economics to the priority payment. The core question is: what does the bid buy you? We know it should get you at least one "small" unit of computation, and it make sense to allow up to one "large" unit if the vat requests it right away. We know the vat shouldn't get to run forever (starving out everyone else), so there must be an upper limit to what it can get without submitting a new message and waiting for it to get back up to the top. If the vat can buy a "large" unit at the start of the crank, it wouldn't be any less fair to allow it to instead submit several ( End-to-End StoryI wrote up a set of questions for us to figure out:
Buying a house might be more important to you than buying a cryptokitty is to someone else. We can imagine the UI having a slider (like many BTC/ETH wallets) to increase priority and fees. If each Presence is bound to a Meter, the UI will implement this by going to the normal-priority endpoint first (taking however long it takes) to get an Invitation that is bound to a specific escalator. Everything that happens after the Invitation Presence uses the requested priority.
The message would be normal, but the target to which it is sent would be bound to a high-priority Meter.
The contract that created the high-priority Invitation would be responsible for acquiring high-priority Presences for anything else it needs to do the requested job at the right priority level: Issuers, Purses, other contracts, etc. I don't yet know how promise resolutions should work. The easiest answer is that all resolutions sent during a crank use the same Meter that the crank itself used, which means they'll be enqueued at the bottom of that Escalator and wait their (faster-priority) turn.
Maybe something like: const meter2 = vatPowers.makeMeter({ escalator: 'fast', initialBid: 12 };
meter2~.deposit(meter1.getPayment(45));
return Far(iface, behavior, { meter: meter2 });
Yes.
No, not directly, however they might have to buy the high-priority Presence ahead of time, which will cost them whatever the Meter holder wants to charge.
The sender does not influence this, nor is there anything associated with the message, Remotable, or Meter to suggest how long a message might take to execute. All messages get one "small" unit, but the receiver's code can ask for more, up to some limit, after the crank begins. DoS Resistance AnalysisWe'll need to do a careful analysis of how our approach mitigates spam attacks. We'll rely upon something at the Cosmos level to prevent attackers from introducing an unbounded number of messages into the kernel from the outside (messages aimed at the comms vat, which will parse their contents and emit syscalls with various Meters and Escalators). That layer should be safe if each such message costs some execution tokens to get put into a block. The next-earlier resource to protect is the mempool, which we might handle with @michaelfig 's N-per-sender refundable tickets idea, which limits the mempool consumption to |
Per-Presence "Static" Escalator Assignment@dtribble and I brainstormed today about an approach in which:
I can imagine building that system into the kernel and liveslots. I suspect (but haven't fully thought through) that it would satisfy the "priority contagion" property that we need: most of the work done on behalf of a high-priority starting message will also be performed at high priority. One sharp edge to watch out for is any pre-existing Presence from some other vat, which will be on its own pre-selected scheduling. We tried to walk through how this would work from the user's point of view, and I don't think we reached a satisfactory picture. The starting point is a "Enable Premium Service" switch, which sends a normal-priority message to the dapp vat, asking for a high-priority facet (and establishing payment somehow). The dapp vat would bind this facet to the higher-priority scheduling and return it to the user's wallet. Then, later, when the user initiates an action (e.g. an AMM swap), they'd have a "Boost" button that causes their request to use the high-priority facet rather than the standard one. The question is: what happens next?
So, I like the simplification this brings, and I think I know how to build it on the kernel side, but I don't yet see how we could use it successfully at the next higher layer. Still more to think about. |
The kernel currently holds a queue of messages to deliver, and executes them in strict order of their submission. In the longer run, we want messages to be able to pay for faster execution, using the "meter" and "keeper" mechanisms from KeyKOS.
The first step will omit the whole financial aspect and just implement the scheduling algorithm. Each pending delivery, once it is unblocked on any dependency it might have (none for now, but Flows will add some sooner or later), gets put on an escalator with some starting offer price and delta-price/delta-time slope. Nothing executes while its offer price is below zero (this enables timed delivery, more or less). As time passes, all messages have their price increased by their slope. The highest offer gets to execute.
In our isolated environment, "time passing" does not have a lot of meaning. But we should be able to simulate it somewhat. It will become more relevant when we're limited by how much computation we can execute at once, e.g. if the Vats are living in a blockchain and the limit is how much computation can go into each block.
A later phase will be to implement Meters and have some sort of tokens being spent. For now, userspace should just be able to specify the initial price and slope to be anything they like, without concern for how it will pay for the prioritization it wants.
We need to think through the userspace interface for this. Most of the time, each outbound message should use the same Meter (and scheduling parameters?) as the triggering inbound message was using. But it should be possible to fork off an separate scheduler as an option when making outbound calls. Maybe a dynamic-scope tool (what would be a "context manager" in Python) that causes all message sends within that scope to use the alternate settings.
Dean's Flow work should cover this: any messages sent within the context of a Flow will use the same Flow (and will be dependent upon all previous messages in that Flow), but there's a
.fork()
operation that gives you a new Flow, maybe with different parameters. We'll need to make the kernel aware of these Flow objects.The text was updated successfully, but these errors were encountered: