[Proposal] Reward allocation design according to Stake ratio #80
Labels
C: proposal
Classification: Proposal for specification, algorithm, architecture, or communication
Milestone
TL;DR
Reward allocation algorithm in Random Sampling without Replacement (non-duplicate elections) with a compensatory model of opportunity loss.
n
Voters fromN
Candidates without duplication, the problem is that the Voter who is selected in thek
-th selection loses the opportunity to earn the rewards for the restn-k
times selection.α
to be the base reward for winning of one selection,s[I]
to be the amount of Candidatei
’s stake,S
to be the total amount of stake for the set of Candidates,σ[r]
be the total stake of the Candidates who were selected before ther
-th selection. Then thei
’s expected reward inr
-th selection isαe[i,r]
=α
s[I]
/ (S
−σ[r]
).n-k
selection to the reward for Candidate who was elected in thek
-th selection, in addition to the base reward for winning.k
-th selected Candidate in the selection of n Voters eventually acquires the next reward:This means that the reward up to the
k
-th selection are obtained by random simulations and the rewards that fail to participate in selections after thek
-th selection are obtained in terms of probabilistic expectations.Problem Setting
What we want to do: Creates a randomly selected subset from a set. In this case, the elements of the set have weights that represent the probability of being selected. Candidates who are selected will be rewarded. we want this reward to be proportional to their weight.
On this page, I'm considering a way to select exactly a given number n of elements (Voters) from a set (Candidates) with a probability corresponding to their weight (Stakes). And we also want the reward of each Candidate to be exactly proportional to its weight.
The purpose of this page is to lead a formula that solves the problem of fairness in 2.
The problem in 2 is that the total Stake in Candidate set becomes smaller as the Voter selection progress, so the probability of being elected by a Candidate who is unlikely to be elected, i.e., a Candidate with lower Stake, becomes a higher probability than the Stake it holding. In fact, as shown in Fig 1, in case we repeat the election that elects 25 Voters from a set of 100 Candidates with inverse bias in Stake, if all 25 Voters acquire the same reward in one election, then the Candidate with the higher Stake will acquire a lower rate of the overall reward distribution.
Fig 1. The ratio of Stake that each Candidate holding and the reward that each one earned in non-duplicate case. In the service requirements, the two should be aligned.
Perspectives and Formulas
I assumed that the underlying cause of the reduction of reward for Candidates who are more likely to be selected is the loss of opportunity due to their inability to participate in selections after selected in the round. In other words, adding the number of elections it has already been elected to and couldn't participate in, and the expected amount of its income, to its reward, would seem to be a fair distribution.
More precisely, the increase in the probability of winning is due to a decrease in the total Stake in the Candidate set as the result of the elected Voter drops out of the Candidate. Therefore, the chance of being elected is getting higher in later elections than in the first.
Let α be the base reward and si be the Candidate i's Stake holdings, The i's expectation reward for a single selection can be expressed as follows:
where S is the total Stake holdings of the candidates participating in that election.
Expected reward in duplicate case
(You can skip this section because this is an explanation as comparative.) I notate S=S=∑isi if all of Candidates participate in the all n selection (i.e., if there is a possibility of multiple winning for a selection) because the S is a fixed value for all selection. Thus, the expected reward for n selections from a Candidate set can be simply added up and expressed as follows:
Due to the nature of the multinomial distribution, repeating n=1 100 times, repeating n=25 4 times, and doing n=100 once all yield the same expected reward.
Although this case can work simple and fair like Fig.2, it doesn't always result in n Voters being selected n times due to duplicate elections. Thus, the Byzantine assumption may be weakened by a smaller number of Voters than assumed. And due to the possibility of consecutive duplicate selections, it's not possible to guarantee the end time even if the selection is repeated until a certain number of Voters are selected.
Fig 2. Reward distribution ratio in case of the duplicate winning scheme.
Expected reward in non-duplicate case
By excluding the selected Voters from the Candidates, exactly n Voters can be selected in n elections (if N≥n).
However, as the number of Candidates decreases, the total amount of Stake in Candidate set decreases, and the probability of being selected increases in later elections. This would result that each total reward amount will not proportional to the number of Stake holdings. The trend is that Candidates with more Stake holdings are selected earlier and those with fewer holdings are less likely to be selected so that Candidates with more Stake holdings are accounted for less compensation. This can be seen from the graph in Fig. 1.
There are two reasons why such a difference occurs.
Introduce the concept of reward expectation
Assume a scratch lottery to win $1M for a particular person. If 100,000 people participate in this lottery, the expected reward per person is $1,000,000÷100,000 = $10 (eliminating the gambling nature as a pastime, it's worth buying the scratch lottery if you can buy it for $9, but it isn't if it's $11).
This also means that if you have the scratch ticket but don't open, you suffer the $10 opportunity loss. So if you couldn't participate in the lottery for some unavoidable reason such that the numbers weren't written of print error, your opportunity loss is $10. However, in a very rough way, it could say that if you get the expected reward of $10 as compensatory of the opportunity loss, the distribution of worth with other participants will be fair.
The point that participating in the lottery itself is worth $10, regardless of whether you win and get $1M or miss out and get $0.
Opportunity loss compensation model
The following example Fig 3 represents an election that selects four Voters, i,j,k,l, out of a set of N Candidate set.
Fig 3. An election to select four Voters from a set of N Candidates.
This model is based on the hypothesis that we can make it fair by compensating for lost opportunity rewards as expected reward that would have ben probabilistically obtained i it had participated din the selection.
In summary, a Candidate i who is selected for the k-th selection should be rewarded with the following:
where Sˆr is the total stake of the Candidates selected from r=1 to k−1.
Result
Here, I show the reward distribution by the simulation. The amount of Stake held by each Candidate and the reward for them are expressed as an overall ratio.
Fig 4.1. Stake distribution with inverse bias.
Fig 4.2. Stake distribution with linear bias.
Fig 4.3. Stake distribution with a flat.
Fig 4.4. A case where Stake is largely ×100 more biased toward one Candidate.
Fig 4.x shows the result of simulation in which only the base reward for winning the election and the opportunity loss is compensated by the expected reward in addition to the base reward in an election in which 25 Voters are selected from 100 Candidates. For 10000 trials, The base reward-only allocation (red points) results in lower distributions for Candidates with high Stake ratios, and higher distributions with relatively low Stake ratios. It can be seen that the allocation compensated by the expected reward behaves according to the Stake ratio.
Note that in the case where Stakes are flat in Fig 4.3, this shows a fair distribution even without correcting. In other words, the unfairness of bias is an inherent problem of weighted random sampling.
Fig 5.1. The number of Candidates is very small and close to the number of Voters.
Fig 5 shows the case where the number of candidates is small and the number of candidates is close.
Consideration
This simulation uses floating-point arithmetic. When the unit of reward is an integer, it's necessary to ensure that the results are not biased by rounding.
The total amount of reward α×n+δ paid in a single election will vary from election to election. However, if S1 is very large with respect to si, then δ will be a small value.
References
The text was updated successfully, but these errors were encountered: