Constant-complexity dispatcher idea #12650
Comments
I will write a detailed specification; I hope you like the idea and want to implement it. I assume dispatching can cost 100-150 gas for up to 512 methods, and the bytecode size could be smaller than it is now. |
I actually spent some time months ago to do something similar. |
Ah, maybe I misread your algorithm: Why do you need
Here, |
If 2 selectors are multiples of N, mulmod with any salt X will lead to a collision. |
Ok, then it looks like we need to throw some xor and add operations into the mix :) |
XOR could help in some cases. In general we don't have any proof that a solution will exist for some arbitrary form of hash function. I consider keccak a safe and reliable hash to use. But we should consider some other functions; maybe we can find something better. A case in which XOR does not work:
|
What about fallback? This looks like it would map any other function to one of those N. I think this would need an extra check before the jump to ensure that the input selector actually matches the destination selector. If not, we should jump to fallback instead. |
What we are trying to build is this: https://en.wikipedia.org/wiki/Perfect_hash_function |
O(1) would be interesting, but what would be the magnitude of the constant cost? Would it be less than the current costs? It would be nice to see some numbers. I played around a bit with @k06a's script yesterday. For ~10 selectors it works well, but with any more than that the brute-forcing can take a long time. I think most contracts have more than 10 selectors, so splitting into buckets and doing multiple dispatches would likely be necessary. Not sure if it needs to be done with multiple keccak like @k06a suggested; perhaps the cheapest approach is a combination of binary search until reaching small enough buckets that we can generate and use an O(1) dispatch table. |
@frangio, yes and formula is |
@frangio I think we can do a one-step solution for 50 gas, a 2-step for 100, etc. The current dispatcher gas is around 150 for a 10-method contract and 350 for an 80-method contract, AFAIR. |
Updated gist with code to support buckets: https://gist.github.com/k06a/737fd1f389bb6f2e66f0ef76ea73729b
//
// N = number of selectors
// B = number of buckets
//
// All possible combinations: B^N
// Valid combinations: N! / ((N/B)!)^B
// Combinations per valid: B^N * ((N/B)!)^B / N!
// Combinations per valid when B=N: N^N / N!
//
// Splitting 6 selectors into 6 buckets: 64 combinations
// Splitting 7 selectors into 7 buckets: 163 combinations
// Splitting 8 selectors into 8 buckets: 416 combinations
// Splitting 9 selectors into 9 buckets: 1,067 combinations
// Splitting 10 selectors into 10 buckets: 2,755 combinations
// Splitting 11 selectors into 11 buckets: 7,147 combinations
// Splitting 12 selectors into 12 buckets: 18,000 combinations
// Splitting 13 selectors into 13 buckets: 48,000 combinations
// Splitting 14 selectors into 14 buckets: 127,000 combinations
// Splitting 15 selectors into 15 buckets: 334,000 combinations
//
// Splitting 40 selectors into 5 buckets of equal size: 1,187 combinations
// Splitting 40 selectors into 8 buckets of equal size: 70,000 combinations
// Splitting 100 selectors into 5 buckets of equal size: 7,204 combinations
// Splitting 100 selectors into 4 buckets of equal size: 996 combinations
// Splitting 100 selectors into 10 buckets of equal size: 42,000,000 combinations
// |
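To make the formula concrete, here is a small JavaScript sketch (mine, not from the gist) that reproduces the "combinations per valid" numbers above:
function factorial(n) {
    let r = 1n;
    for (let i = 2n; i <= BigInt(n); i++) r *= i;
    return r;
}
// Expected random assignments per valid (balanced) split: B^N * ((N/B)!)^B / N!
function triesPerValid(n, b) {
    const total = BigInt(b) ** BigInt(n);                        // B^N
    const valid = factorial(n) / factorial(n / b) ** BigInt(b);  // N! / ((N/B)!)^B
    return total / valid;
}
console.log(triesPerValid(8, 8));     // ~416
console.log(triesPerValid(40, 5));    // ~1,187
console.log(triesPerValid(100, 10));  // ~42,000,000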
Whatever we do, we need at least one full equality comparison per function, even if the contract does not have a fallback function, because there are always "false positives". |
Sure, it's just that keccak is relatively expensive and if there is a cheaper option, I would use the cheaper option. |
@chriseth I agree keccak256 is expensive, especially on the host side; in Solidity it is relatively cheap. I will try to produce another function. But we can start with keccak256. |
The probability of several selectors having the same remainders modulo multiple prime numbers is much lower. Moreover, we can select these prime numbers exclusively for a specific selector set. Maybe the following formula will work:
|
At that point I don't know if it will be cheaper than keccak. |
Just played with this idea of multiple remainders - it has so many collisions that it is not possible to find such a SALT in practice for 5-10 selectors. I'm afraid multiplication of remainders produces more ways to end up with the same remainder after MOD N. |
We could try dividing by the largest power of N that's a factor in any selector. I'd expect |
Found something interesting: http://stevehanov.ca/blog/?id=119 |
@k06a Great reference. However, reading it made me realize that we're completely overthinking this. The function selector is already a random number. If we take |
@frangio we could brute-force some simple formula to ensure the distribution is good enough, for example that each bucket has at most 2 selectors. I think this can be achieved pretty fast even for a huge number of selectors. |
@frangio splitting selectors into N buckets by just using the remainder works as follows for some 78-selector contract:
I see it has only 49 non-empty slots. I also tried spending 1000 iterations to find a SALT whose XOR leads to a good distribution among N/2 slots:
Script used:
let min = selectors.length;
for (let salt = 1; salt < 10000; salt++) {
    let dict = {};
    let max = 0;
    for (let s of selectors) {
        // Bucket index: xor the selector with the candidate salt,
        // force to unsigned 32-bit, then take the remainder by N/2 buckets.
        let key = ((s ^ salt) >>> 0) % Math.trunc(selectors.length / 2);
        dict[key] = (dict[key] || []).concat([s]);
        max = Math.max(max, dict[key].length);
    }
    // Track the best (smallest) maximum bucket size seen so far.
    min = Math.min(min, max);
    if (max <= 3) {
        console.log('Found salt =', salt, ' max =', max, ' dict =', dict);
        break;
    }
}
console.log('min =', min); |
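For reference, the script above assumes a selectors array of 4-byte selectors given as unsigned 32-bit integers; a hypothetical way to generate test input would be:
const selectors = Array.from({ length: 78 }, () => (Math.random() * 2 ** 32) >>> 0);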
That's great! I wasn't considering that empty buckets are a problem. Using xor for salting makes a lot of sense. |
Got advice from @snjax that it makes much more sense to use:
And I would not suggest using MUL instead of XOR, because it could keep the remainder the same in some cases. |
Has there been any progress on this in the last few days? :) |
@snjax I agree that adding |
I suspect the best solution we have right now for N selectors is the following:
I am trying to count the number of cases that satisfy the conditions, in order to compute the probability and complexity of the brute-forcing task. |
This conversation reminds me of the dispatcher in Objective-C, which has to solve some similar issues. In Objective-C methods are invoked dynamically by runtime lookup using "selectors" which are themselves essentially based on the function signature. Obviously method dispatch in a raw CPU context vs dispatch in the EVM is very different. However, the dispatcher has been worked on since the 90s and may contain relevant prior art. Just in case it's of interest, here's where you can find the source code for the message dispatch function: https://github.com/apple-oss-distributions/objc4/tree/main/runtime/Messengers.subproj Extracts from a related conversation (I'd credit the contributors but it's a private email list):
EDIT: Included some items from a conversation on a private email list I pinged.
Checked the following numbers:
Hash function: (selector XOR salt) MOD 65537 MOD m
After brute-forcing salts for 10k different sets of 1000 randomly generated selectors, I have the following stats:
I believe we could use something like this:
let ptr := mod(mod(xor(selector, SALT), PRIME), TABLE_SIZE)
codecopy(0, add(TABLE_OFFSET_IN_BYTECODE, mul(ptr, 2)), 2)
jump(shr(240, mload(0)))
and on the destinations we can have 3 variations:
Or we could use the first table to point to 1-2 selectors in the second table, and use a single piece of code to check those selectors. https://gist.github.com/k06a/5b6c06faa235e656cc391cb444ea71e8 |
Pseudocode of a single dispatcher with 2 tables:
let ptr := mod(mod(xor(selector, SALT), PRIME), TABLE_SIZE)
// read a 2-byte entry from table 1; zero means the slot is empty (unknown selector)
codecopy(0, add(TABLE_OFFSET_IN_BYTECODE, mul(ptr, 2)), 2)
let idx := shr(240, mload(0))
if iszero(idx) {
    mstore(0, UNKNOWN_SELECTOR_ERROR)
    revert(0, 4)
}
// read two 8-byte records (4-byte selector + jump destination) from table 2
codecopy(0, add(TABLE_2_OFFSET_IN_BYTECODE, mul(and(0x3FFF, idx), 8)), 16)
if and(0x8000, idx) {
    let s := shr(224, mload(0))
    if eq(selector, s) {
        jmp(shr(224, shl(32, mload(0))))
    }
}
mstore(0, shl(64, mload(0)))
// Same code as previous, except mload(0) was shifted to the next 8-byte record
if and(0x8000, idx) {
    let s := shr(224, mload(0))
    if eq(selector, s) {
        jmp(shr(224, shl(32, mload(0))))
    }
} |
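As an illustration of how the SALT for such a two-table layout could be found at compile time, here is a JavaScript sketch of mine (not the gist's code), reusing the hash form mod(mod(xor(selector, SALT), PRIME), TABLE_SIZE) and the at-most-2-selectors-per-slot assumption:
function findSalt(selectors, prime, tableSize, maxTries = 1000000) {
    for (let salt = 1; salt <= maxTries; salt++) {
        const counts = new Map();
        let ok = true;
        for (const s of selectors) {
            const slot = (((s ^ salt) >>> 0) % prime) % tableSize;
            const c = (counts.get(slot) || 0) + 1;
            counts.set(slot, c);
            if (c > 2) { ok = false; break; }   // more than 2 selectors in one slot
        }
        if (ok) return salt;
    }
    return null;   // no suitable salt found within maxTries
}
// e.g. findSalt(selectors, 65537, Math.ceil(selectors.length / 2))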
Maybe you could introduce the |
This should be a separate issue for discussion, and I think there are a bunch of overlapping ones already. One (small) downside I see is that in that case a contract could not have any function marked |
This discussion IMHO motivates a set of instructions that expect inputs assumed to come from a certain value distribution (hash-based keccak prefix for now) plus well-formed data structures, and emit results ... in this case a computationally efficient lookup. I've been working on non-EVM projects for a while now and don't recall, but I don't think the EVM has a "BIOS", and if not, maybe it should have one. I.e., "long jumps" into functions that are actually implemented "outside" the EVM in a CPU- and gas-efficient manner as part of the EVM itself. Clearly the keccak computation instructions are examples "in that direction"; but there's probably a family of functions that shouldn't require a new instruction set but can be implemented efficiently outside the EVM, in a physical-CPU-efficient manner, based on a calling ABI. I.e., a stdlib. It doesn't make much sense to me that, structurally speaking, either miners or EVM users would want expensive functions like this to run in the most inefficient context possible, i.e. inside the EVM. I imagine keccak is not run there for precisely the same reasons. |
@aathan "precompiled contracts" are effectively accomplishing what you are saying. They were opcodes initially, but moved out to reduce the surface area of the EVM, and keccak256 remained as an opcode because it is "supposed to be used more widely". There is a struggle around these, and I personally am against making the EVM more complicated and more CISC-like, as opposed to introducing useful primitives. |
Having a constant-time dispatcher as a precompile is a nice idea, but I'm afraid it is almost impossible to get it into the consensus layer - this could take months or even years of discussions. |
But using |
BTW I developed a salt miner tool in Rust: https://github.com/1inch/solidity-dispatcher-miner It brute-forces almost 0.5 MSalts/s on a single core of my 8-core MacBook M1 Max - works nicely for up to 25-30 selectors. Estimated execution time is as follows: |
This issue has been marked as stale due to inactivity for the last 90 days. |
Hi everyone! This issue has been automatically closed due to inactivity. |
Please reopen |
Thanks for discussing this topic. Please consider that not all functions need to be that gas efficient. I would rather have a dispatcher with the ability to prioritize - e.g. ERC20.transfer() should be dispatched with the lowest gas overhead, while supportsInterface can have a higher gas overhead. In other words, if prioritization would help with the mentioned bucketing, I could foresee a priority flag alongside the public / external function definition, e.g.:
|
Today I spent some time on prototyping and managed to achieve an O(1)-complexity dispatcher. Check out a few of my messages here: vyperlang/vyper#2386 (comment)
And find code prototype here: https://gist.github.com/k06a/737fd1f389bb6f2e66f0ef76ea73729b
Abstract
I believe the dispatcher implementation was initially O(N), but now it may be tree-like, O(log N). I assume having an O(1) dispatcher when a contract has 5+ methods is a cool achievement.
Motivation
Contracts with tens of methods can have a really slow dispatcher for most of their methods. Computing a keccak256 and 2 jumps is a relatively cheap way to dispatch.
Specification
We just need to brute-force magic values at compile time so that all selectors are mapped onto 0..N-1 values. The complexity of such a brute force should be N^N/N!. If the number of selectors is really huge we could, for example, find the first function for splitting into a few groups of 8, and then use a different magic for each group. Having 1 magic number you could split 32 selectors into 4 groups of 8 and then find a magic for each group. Finding a magic for a group of 8 would take 8^8/8! = 416 iterations on average. Seems possible to do this at compile time.
Pseudocode:
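A minimal JavaScript sketch of the compile-time search (not the original pseudocode; the gist linked above contains the actual prototype), using a simple xor-and-mod hash in place of keccak256:
function findMagic(selectors, maxTries = 10000000) {
    const n = selectors.length;
    for (let magic = 1; magic <= maxTries; magic++) {
        const slots = new Set();
        for (const s of selectors) {
            slots.add(((s ^ magic) >>> 0) % n);
        }
        if (slots.size === n) return magic;   // every selector got its own slot in 0..N-1
    }
    return null;   // expected number of tries is roughly N^N / N!
}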
Backwards Compatibility
The resulting dispatcher code should be fully compatible for external calls with any other versions of the dispatcher.