Duplicates labeled with partials (partial-25/50/75) decrease the duplicates' weight, but also reduce the primary finding's weight in the overall award calculation #145
I completely agree, and I think that Code4rena's systems should be designed to encourage the discovery of unique findings and to motivate wardens to produce well-constructed reports. When a warden submits the only valid High or Medium vulnerability (with "valid" in this context meaning a 100%-rated issue) along with a high-quality report, I believe that, from Code4rena's perspective, such findings should be exceptionally rewarded. Moreover, wardens should be given additional incentives to craft exceptionally thorough reports and to evaluate the full scope of the vulnerability comprehensively.
I totally agree with the author. In my opinion, these kinds of issues should perhaps stay grouped for the report but be split for rewards. I have also just encountered a case where there was a very simple bug due to non-EIP-712 compliance, which was recognized by ~90 people who provided very low-quality reports of it (Medium at most, more like Low severity). Three others and I were able to leverage that bug to steal funds, making it a High severity. Now all ~90 of those reports were counted as 25% partial duplicates, making our High pretty much worthless in the process.

This functionality incentivizes people, as soon as they find a simple bug, to write a low-quality report and leave it at that, since someone else will spend the significant time needed to escalate the bug to High severity anyway, and they will be counted as duplicates of that High. To get back to the original idea, my recommendation would be to leave it as one issue for the report but split it into two for rewarding. This can easily be done when there are duplicate submissions where some are only Med/QA and some are High. Only the Highs would get bundled together and deduplicated (resulting in higher rewards for people who put in significant time to leverage a bug), and all the Meds would get bundled into a Med, not eating up the rewards of the Highs. This would also remove the meta of writing low-effort submissions for every small bug without ever describing a valid attack path.
This idea only works for this specific case. Now imagine 100 people put in the effort to report this as High and one person is lazy and only reports the most obvious impact as Med. He would get a solo Med for putting in less work. I think the way it works now does make sense, even if it indeed doesn't reward putting in effort on simple bugs. If all people find the same root cause of an issue, then each of these reports would have made the sponsor aware of its existence. And that's why the 100% finding only gets the credit of a finding with 3 duplicates: because 3 people found the root of the problem, even if 2 didn't manage to see the full impact. However, I do see dontonka's point. It could also make sense to redistribute the reduction for partial findings to all duplicates that scored 100%.
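For reference, here is a minimal Python sketch of the current calculation as described across this thread; the severity weights (10 for High, 3 for Medium), the 0.9^(n-1) decrementer, the 25/50/75% partial scaling, and the 30% selected-for-report bonus are all cited in later comments, so treat this as an illustration rather than the actual awardcalc script:

```python
# Hedged sketch of the *current* behavior discussed in this thread.
# Assumptions: severity weight 10 (High) / 3 (Medium); the pie is diluted
# by 0.9^(n-1) over n duplicates and split equally; partials keep only
# 25/50/75% of a full slice, and the slashed portion is NOT redistributed;
# the submission selected for the official report gets a 30% bonus.

def current_shares(severity, credits, report_index=0):
    """credits: per-finding credit (1.0 full, 0.25/0.5/0.75 partial)."""
    n = len(credits)
    pie = severity * 0.9 ** (n - 1)   # dilute the pot by the duplicate count
    full_slice = pie / n              # equal split among all duplicates
    shares = [full_slice * c for c in credits]
    shares[report_index] *= 1.3       # selected-for-report bonus
    return shares

# High with two partial-25 dups: the primary's slice is the same as if the
# partials were full dups, and the slashed 75% portions are simply lost.
print(current_shares(10, [1.0, 0.25, 0.25]))  # ≈ [3.51, 0.675, 0.675], sum ≈ 4.86
```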
Agree that this doesn't make a lot of sense; intuitively, a High with only partial-credit duplicates should be worth something between a solo High and a High with full-credit duplicates. I was playing around with the reward formula and came up with a small change that I think could work:
Instead of counting the number of findings, we take the sum of the total credit awarded to all findings to calculate the pie. This means that a high with one duplicate is worth the same as one with two duplicates with 50% credit. That strikes me as reasonable. And instead of dividing by the number of findings, we multiply by the finding's credit and divide by the total credit. This gives us the share of the credit for a given submission. This is what the scoring would look like for the example above:
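As a concrete stand-in for the scoring table, here is a minimal sketch of this proposal, assuming the 10/3 severity weights and the 1.3 report bonus used elsewhere in this thread carry over unchanged:

```python
# Hedged sketch of the proposal above: compute the pie from the *sum of
# credits* rather than the finding count, then give each submission
# pie * credit / total_credit.

def proposed_shares(severity, credits, report_index=0):
    total_credit = sum(credits)                 # 1.5 for one full + two partial-25
    pie = severity * 0.9 ** (total_credit - 1)  # dilute by total credit, not count
    shares = [pie * c / total_credit for c in credits]
    shares[report_index] *= 1.3                 # assuming the report bonus is unchanged
    return shares

# One full finding plus two partial-25 dups behaves like 1.5 "findings":
print(proposed_shares(10, [1.0, 0.25, 0.25]))  # ≈ [8.222, 1.581, 1.581]
```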
The pie and first slice end up being quite large, but that seems reasonable considering there are only 1.5 "findings". The effect is less extreme if the other 2 findings had 50% credit:
In this case, the first finding gets the same number of shares as if there were only one other duplicate, and the other two split the shares that would have gone to that duplicate. I couldn't think of any bad behavior this would incentivize, but I may be missing something; it would be great to hear what others think.
@Minh-Trng I can see what you're saying but at the same time, the sponsor may have decided the finding is a non-issue, or the findings could have been rejected if it wasn't for the high-quality report that correctly identified the highest impact. So putting in that extra effort is important, and should be incentivized.
The suggestion effectively would allow for exploitation:
- Partial submissions would steal total rewards from other findings.
- That's ultimately more gameable than what's available now.

The feedback of disappointment in having a partial award reduce the rewards of your finding is legitimate, but the suggested fix offers more downside than the current situation.
Could you walk me through that? I wouldn't think it does since combined partial credit is treated the same as full credit. I just tried out an example:
And with partial findings:
Shares to first + partial credit = 6.14. I was thinking you were right, but in fact this is also currently exploitable with a normal duplicate, and I think it's just an edge case when
Shares to first + fake dupe = 6.21. Otherwise, any number of partial-credit findings amounts to their total combined weight (4 duplicates with 25% credit are paid the same as one normal duplicate), so it's not any more (un-)profitable than submitting duplicates.
tldr: We should reward unique information presented to a sponsor. If we all agree (which I think we do) that a solo unique High has the most sponsor value, then it stands to reason that more duplicates, if they are in fact 100% duplicates, degrade the value of that report. However, partial findings are typically marked as such because they didn't add the same value (or contain the same unique information) as a true (100%) duplicate, and therefore the original High should be worth more to the sponsor because it represents unique information not present in other reports. I understand concerns about the attack vector it opens up, but I do think there is probably a solution in the middle that might make this fairer.
So my … Keep in mind that today the current formula already … So to put this into a requirement, this would be the additional point:
This additional requirement would make sense to me, as it reflects reality for the sponsor.

EDIT: OK, I did the math to include the additional requirement, and it seems to work perfectly (I modified my Google sheet from my original post). The finding's pie would be calculated as follows, severity being 10 for High and 3 for Medium:

The warden's pie would be as follows if selected for report:

Which gives the following formulas in my current sheet for High-191. Finding's pie:

Warden's pie (mine, selected for report):

So this would reflect the 3 requirements indicated. And let's try to … Finding's pie:
Warden's pie (for the warden selected for report):
Are those making sense? I think so; this represents exactly how it should be. So the formula in the doc would change to the following:
I agree this should be changed; I brought this up about a year ago and, if I'm not mistaken, the response to my help request was that this is the intended design. To put it simply, I think the rationale behind C4's formula (when no partials are involved) is to first dilute the pot of the bug by 0.9^(n-1) and then split that pot between all wardens who found the bug. I agree that the fix should be the formula suggested above by @dontonka; there are 2 points here:
Interesting. What was the resolution of the ticket exactly, @0xA5DF?
I can't find the ticket in my email for some reason, but as I said, the help team said this is not a bug. I should've opened an org issue back then but I failed to do so. PS
Yes, this is really annoying; so many wardens didn't receive their proper share of awards. But hey, things are not perfect. We need to learn from mistakes, but first recognize our mistakes/errors, fix them, and improve as an organisation.
Yup, that's exactly what I meant! Then the slices just need to be scaled by the partial credit to get the final number of shares for a given finding. @0xA5DF note that there's also a partial-75 label now.
I don't think there's any wrongdoing on C4's end here; an org issue is the right way to change this kind of stuff.
True, I need to update the script 🙂
With all respect, I have to … I do agree that the right way to change this is by involving the community through this post, which is in process.
Since this conversation is now happening in two places, I'm going to bring it all here. dontonka:
Why would we ask you to bring this discussion here to discuss it if we wanted to hide something? There are simply tradeoffs. At the point of implementation of the award calc for partials, there was the choice of how the awardcalc was written. The title of this issue is exactly how it was written intentionally—to decrease the value of the identified bug based on others spotting the root cause, without compensating the duplicate equally based on not having fully rationalized the severity/impact. I get that there are folks who don't like how it was written and that's a completely fair and reasonable viewpoint. And my viewpoint may be a dated one that is good to revisit which is why the team asked you to open this issue and why we thanked you for doing so. But "pie" and "slice" are anachronisms anyway which remain based on the way the award calc script was originally written. There are other areas which equally do not fit within the analogy of pies and slices: what does a "30% bonus" mean in the pie and slice context, for example? The awardcalc script used to be open source. It was completely refactored a year or so ago and was intended to be made open. It will be again. dontonka:
In fact multiple members of staff looked deeply into this and engaged with you on the matter before moving ahead. I took it seriously when you raised it with me, pulled in multiple staff members, and this is where the analysis came to, along with the conclusion that if it makes sense to revisit the tradeoff, we will do so.
I'm not sure how you can argue that you should receive restitution without arguing that everyone else impacted by this tradeoff should as well. However, if there were sufficient consensus among @code-423n4/judges that this is a bug and not an intentional tradeoff, then I think characterizing it as a vulnerability is perfectly reasonable, and in that case the original identifier (@0xA5DF) should be awarded.
Well, that would be a duplicate finding; mine would actually be the primary, and his a dup (probably … Anyway, thanks for replying. I've said my part; I have nothing more to add.
That would be ideal, but unfortunately …
I don't feel like you've been rude at all. I expected you to bring your position here—it's what was asked and what you've been thanked for multiple times—and of course it's rational to defend your opinion.
Personally I don't understand the … I challenge any …
I did provide reasoning above for why the current solution is not illogical, which has been reiterated by sockdrawer. Even though I do agree that your suggestion is better, you shouldn't speak from a position of absolute truth about subjective matters. I also find it a bit audacious to argue for retrospective reimbursement just because you made a suggestion for improving the current solution.
Recognize that before this implementation of awardcalc, what we now call partials would get full credit if a judge assessed them as a functional duplicate! You can likely find many cases from 18 months ago where a warden was awarded a full High or Med as a duplicate without rationalizing the severity or describing the impact. So this current implementation has to be understood as a transition away from that and toward something that did not overly award partials. The assumption always was that partial duplicates were indeed duplicates per the decrementer (otherwise they presumably wouldn't be called duplicates at all?). I think it's very productive to have a conversation about
I read it, but it's not clear, as you were mainly replying to another warden. My proposal is not to dedup and separate into different issues, which is what Sherlock does, btw, I believe; I much prefer the … In software, the product owner writes requirement A, which is given to the engineer, who implements the code to reflect the requirement; afterward, a demo is done to prove to the product owner that the feature really does A, and then the feature is deemed accepted and rolled out to production.
Definitely, let's not focus on that thought; that's another topic, I'm sorry. The root problem is that I don't feel I'm suggesting an improvement; I'm fixing something broken, a bug.
Yes, agreed, this is where the suggestion comes in (see point 2). So there are 2 items in play here:
Yes, that's clear; there was a before and after, and it's fine, it's an evolution. Now, once we evolved with partials, it seems like it was understood one way and behaved a different way for an edge case (the primary warden being negatively impacted as the pie is reduced). The part that you might not understand, @sockdrawermoney, is that right now partials are only doing their job for the dup with the partial, penalizing them; but for the primary (those at 100%), their slice remains the same as before the partial world was introduced! So for them, partials don't bring any added value, which is
Yes, and it does it well for the partial dup. While it's true that they compensate the primary more than the partial dup, where is the …
Well, that reference to software engineering really doesn't help your argument. You have the actual product owner here with you, telling you that the implementation does what it was supposed to do (and even how the requirement came to be) and that it is in fact not a bug.
all explained clearly by sockdrawer above
The only thing I see is the following, which is not clear to me, unfortunately. That doesn't explain why there is a reduction in the finding's pie, at least I don't see it, nor what the tradeoffs and choices were. That is what I'm asking:
This answers the question regarding the initial requirement very well.
That's just you designing different requirements according to what you would like the behavior to be. It was not part of what was originally needed; can you please stop pretending as if it were? The whole concept of the … Most people (including me) already agree that your suggestion would be an improvement over what is currently done; isn't that enough? For some reason you seem very stubbornly focused on declaring it a bug, even when the product owner (who knows best) tells you it isn't.
Alright, I'm definitely stubborn, sorry. That will be my … The way I understand this is the following. Let's break it down. Requirement: … What I understand from this is that partials are duplicates like any other, so they should act as full decrementers. That's not what they are doing! They act as full decrementers … The current calculation achieves otherwise … the remaining part of the requirement … I don't have anything more to add; I've argued and spent enough time explaining my position on this issue as a whole. I wish all the participants of this sensitive discussion a wonderful weekend!
They indeed decrease the finding's weight, but at the end of the day the warden with full credit gets the same share as if the partial dupes were dupes with full credit. It indeed makes more sense for the funds to go back to the bug's pool, but it's not as if the current state doesn't make sense at all. I don't think you can argue it's so unreasonable that we should make retroactive changes.
I'd like to see the math for a unique finding in the same example. What would be the Pie and Slice if the finding were unique?
For a unique …
You see how the current behavior is kind of frustrating, having a High with 2 … But the …
I totally agree with the concerns @dontonka raised here, and I totally understand how it feels to lose a chunk of the rewards you should get after sleepless nights of work as a result of this bug, because I myself was affected by this same exact issue. I strongly believe this is a bug. I have yet to see anybody provide a reasonable justification for treating this as anything but a bug that needs to be fixed. If I report a High bug that has two 25-partial duplicates, and @dontonka reports another High with two duplicates which have no partials, why in the world should @dontonka get the same reward share as me? That makes absolutely zero sense.
What he is raising is a legit issue that needs to be fixed. That's a fact. His being frustrated about this is (1) his right, and (2) it doesn't change the fact that this needs to be fixed. I'm not sure what you're arguing about here. I'm also pretty sure that if you were directly affected by this, you wouldn't have said something like this.
The proposed solution will make sure that this is the outcome as well. If a High-severity bug is reported with 2 non-partial duplicates, the finding pot would be 8.1 and each finding would get 2.7 shares; that's 8.1 shares total (2.7 × 3), excluding the best-report bonus. So worse reports will surely not get as much money as the primary report, and the primary report gets the treatment it deserves. Issue fixed.
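A quick check of those numbers, using the same 0.9^(n-1) dilution sketched earlier in the thread:

```python
# Sanity check: a High (weight 10) with two non-partial duplicates,
# i.e. three findings total, under the 0.9^(n-1) decrementer.
severity, n = 10, 3
pie = severity * 0.9 ** (n - 1)   # 10 * 0.81 = 8.1
per_finding = pie / n             # 8.1 / 3 = 2.7 shares each
assert abs(pie - 8.1) < 1e-9 and abs(per_finding - 2.7) < 1e-9
```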
Thanks, that means a lot @H4X0R1NG. Chainlight in …
Since you are new to this convo, I think you may not have caught up on all the context necessary to understand
That reaffirms what I just said about missing context; the line you quoted is not even remotely used as an argument against the proposed solution.
I am sure I would have, because it's based on reason, not personal bias. If you look further up, you can see that I have already confirmed multiple times (again, context...) that I believe a change would be an improvement. So if anything, I should argue for it, no? In fact, there is a very good chance that I have been affected by this before.
OK, I think … Let's all grab a 🍿 and check what the C4 DAO will do with all this. Have a wonderful day ❤️!
"Redistrubuting their lost share to others findings[...]is not that much the point" In reality, this seems to be a central aspect for you. Given that the partial findings already undergo an award reduction through the partial label attribution system (25%, 50%, 75%), based on the existing formula, theoretical redistributing of the reward further inflates the value for others. This results in an even more pronounced disparity between a standard satisfactory finding and a partial finding. In essence, a partial finding's value is diminished twice through:
This potential double penalty means that the requirements of the documentation would no longer be met, specifically the guideline that states, "25%, 50%, or 75% of the shares of a satisfactory submission in the duplicate set." It's important to clarify a possible misconception that a partial finding is nearly inconsequential and of minimal quality.
In the case of a solo finding, the slice is equivalent to the pie, as it's "shared" between just one finding.
To which we add the bonus for being selected for report (30%): 10 × 1.3 = 13
This issue has been addressed and explained in detail here: #145 (comment) |
This isn't actually the case. The proposed solution scales the shares of every finding, partial or not, such that the "pie" remains proportional to that of findings with no partial-credit duplicates. The size of the pie for a finding with two … Here's again how the shares are currently distributed in that scenario:
And here's what it would look like with the proposed approach:
The ratio remains the same and partial duplicates even stand to gain since they get more than twice the number of total shares. To me, this seems an overall more sensible approach and a natural extension of the current reward calculation. |
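To make the comparison concrete, here is the same scenario run through both sketch functions from the earlier comments (`current_shares` and `proposed_shares`), for a High with one full-credit primary and two partial-25 duplicates:

```python
# Compare the current and proposed sketches on the recurring example:
# High (10), credits [1.0, 0.25, 0.25], primary selected for report.
for name, fn in [("current", current_shares), ("proposed", proposed_shares)]:
    shares = fn(10, [1.0, 0.25, 0.25])
    print(f"{name:8s} shares={[round(s, 3) for s in shares]} "
          f"partial/primary={shares[1] / shares[0]:.4f}")
# current  shares=[3.51, 0.675, 0.675]  partial/primary=0.1923
# proposed shares=[8.222, 1.581, 1.581] partial/primary=0.1923
# The partial-to-primary ratio is unchanged; only the total distributed grows.
```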
I believe a better approach than the current implementation is desirable. However, I actually do not think this is a desirable outcome:
Nor do I think a single issue identified by multiple people, but only argued by one person for sufficient impact, is the same as a true solo finding. A desirable outcome, in my view, is:
Let's be clear on one item here; it might not be well received, but I'll say it anyway. The most impacted by the award system are the
Agreed. That is not the …
Let's … @sockdrawermoney
I'm not a native English speaker, so if you could please explain your thoughts … is the most
This is the …
I just listened to the first part of the C4 office hours with @bytes032 and @GalloDaSballo, and I would like to respond to their question. Btw, thanks @bytes032 for aligning with me that this needs to be addressed ASAP.

Scenario TODAY

The issue, again: this reduces the value of the finding (doesn't make a …

That's a nice …

FUTURE (🙏)

That represents the proper value of the finding: (0.008 * 24) + 0.848 = 1.04

That's simple math. I'm not sure, @GalloDaSballo, how you think someone could game this; there is nothing to game 😄, you guys seem to …

The only way this can be gamed already exists today: spamming duplicates of a finding, which decreases its value. A partial duplicate (even at partial-25) already reduces the pie as if it were a full duplicate, and there is a reason for that, as discussed earlier (it's still a duplicate and acts as a …

But tell me, who in their right mind would create partials or even duplicates on purpose? The motivation for backstage wardens to dedup invalid duplicates already exists today; pushing for some duplicates to become partials instead of full duplicates would be driven by the same motivation, nothing new here.
@Simon-Busch please feel free to validate the numbers I provided up there, so the information is not assessed only by me.
@sockdrawermoney Gotcha. It's true that partial duplicates reducing the pie by less than a total duplicate (a byproduct of the proposed solution) doesn't make as much sense intuitively as the pie being distributed in its entirety (what we're trying to address). Maybe the following is closer to what we're looking for:
Where … This ensures that all duplicates, partial or not, decrease the pie by the same amount, and distributes all the shares in the pie among duplicates in the appropriate ratio. This is what the same example would look like again:
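A minimal sketch of the behavior described (full decrement per duplicate, whole pie distributed in proportion to credit), under the same severity-weight and report-bonus assumptions as before:

```python
# Hedged sketch of the revised proposal: every duplicate, partial or not,
# decrements the pie as a full finding, and the *whole* pie is then split
# among duplicates in proportion to their credit.

def revised_shares(severity, credits, report_index=0):
    n = len(credits)                 # partials count fully toward the decrement
    pie = severity * 0.9 ** (n - 1)  # same pie as a group with no partials
    total_credit = sum(credits)
    shares = [pie * c / total_credit for c in credits]
    shares[report_index] *= 1.3      # assuming the report bonus is unchanged
    return shares

# Same example (High, one full + two partial-25 dups): the pie stays at 8.1,
# nothing is lost, and the primary takes the lion's share.
print(revised_shares(10, [1.0, 0.25, 0.25]))  # ≈ [7.02, 1.35, 1.35]
```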
I can see what you're saying. Even with this new formula, the … This seems undesirable if the goal of the … If we shift from an intention of imposing penalties to one of more appropriately distributing the rewards for a given finding among all its duplicates, this starts to look more logical. They're still getting significantly less than if they were full-credit duplicates, and the main issue gets the lion's share. All this does is redistribute the "slashed shares" among all duplicates according to their partial weighting. Allowing some of these to flow back to the slashed dups seems fairer than distributing them only among full-weight findings, and also fairer than the present situation, where they are simply lost.
As discussed here: this change may allow stealing other findings' rewards and funneling them to the main report. I'd like to see the math for a finding with [0, N] partials, as a way to see whether that's the case or whether the formula prevents it.
@GalloDaSballo it would be helpful if you could provide a specific example of the scenario you have in mind. The suggestion is to take the pie for a given number of duplicates and distribute it among all findings in proportion to their partial weighting (25%–100%). Here's an example for a High-severity finding with 3 other duplicates and [0, 3] 25% partials, again using the formula
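The sweep is easy to reproduce with the `revised_shares` sketch from above; the numbers below are mine, computed under the same assumptions:

```python
# High (10) with 3 other full duplicates plus k = 0..3 partial-25s.
for k in range(4):
    credits = [1.0] * 4 + [0.25] * k
    shares = revised_shares(10, credits)
    pie = 10 * 0.9 ** (len(credits) - 1)
    print(f"k={k}: pie={pie:.3f} primary={shares[0]:.3f} full dup={shares[1]:.3f}")
# k=0: pie=7.290 primary=2.369 full dup=1.823
# k=1: pie=6.561 primary=2.007 full dup=1.544
# k=2: pie=5.905 primary=1.706 full dup=1.312
# k=3: pie=5.314 primary=1.454 full dup=1.119
# Every added partial shrinks the pie exactly like a full dup, so partials
# cannot be used to funnel extra rewards to the main report.
```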
Here you go; @0xEVom is proving this with facts, ty. So it ends up as follows. So duplicates (even partial) continue acting as
Hello, anyone here? Knock knock.
Just some personal advice: I don't think this bitter tone will do you much of a service in this space.
@dontonka I'm sorry to hear you ran out of popcorn before the previews were over. I'm awaiting a solution that sees clear consensus support from a few wardens AND at least one judge (ideally several) before it's considered a candidate for implementation. We're moving in the right direction and I respect the sense of urgency to improve something we all agree can and should be improved, but I'm not in a rush to force a decision here. |
Oh my man, I thought Wendy's had accepted your application 😄. Yes indeed, I ran out; I was too hungry, but I just went to the counter to grab another 🍿 so I can finish the entire movie. I'm still
Yes, that makes total sense. And since only 2 judges have been involved in this matter, let's not involve more, as they seem pretty busy already. So let's do something simple: I'm going to call out ALL the wardens that posted a comment here (@Minh-Trng, @0xEVom, @0xA5DF, @H4X0R1NG, @aliX40, @J4X-98) and the 2 judges (@0xean and @GalloDaSballo), and the action I'm requesting is simple: go over my post above (four posts above this one) and put an ❤️ if you agree, or a 👎 if you disagree, with the new formula, which addresses all the concerns raised in this discussion. If we get a majority of wardens and both judges on the ❤️ side, we are good to go; otherwise, back to square one, and the judges/wardens in disagreement need to explain their reasoning.
@GalloDaSballo good morning. I can see you are in disagreement with the proposal; can you walk us through your counter-argument, please? I thought the fact that you ❤️'d the @0xEVom post indicated the contrary, but I'm happy to understand your perspective further.
Closing this ticket: while there was hope, this is clearly not going anywhere. The judges involved in this discussion don't care to follow up; C4 staff don't care much either, as they don't really intervene to push their judges to give an active opinion and follow up so this can be concluded; the CEO hasn't followed up on the question I asked 3 weeks ago; and finally, not even the wardens participating here took the effort to give a ❤️ or 👎 on the post with the final formula, which means they also don't care much in the end. BTW, we are all very busy, but that doesn't mean you don't have the time to spend a few minutes in a window of 2-3 weeks.
@dontonka No need to close the issue. As much as there is consensus agreement that an improvement is needed, there's not been a recommendation made yet without concerns voiced. If others feel this is worth addressing sooner, it'll emerge. I think people are quite busy with a few audits though :) |
Hello everyone, After thorough analysis, we have compared the existing formula with a new formula to handle partials, guided by @0xEVom's suggestion. In conclusion, we are confident that this revised formula is viable and addresses @GalloDaSballo's concerns effectively. We deeply appreciate everyone's contribution and the diverse viewpoints shared during this discussion; your input has been really valuable. We plan to release an updated version of our algorithm in the upcoming weeks. See the analysis below:

Case study with dummy data

Scenarios
Formulas

Using the current formula

Using the new formula

What changed?
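A hedged one-function summary of the difference between the two formulas, as understood from this thread (an illustration, not the actual awardcalc code):

```python
# Before: partials scale their own equal-split slice; the slashed part is lost.
# After: the same pie is distributed in full, proportionally to credit.
def shares(severity, credits, new_formula, report_index=0):
    n = len(credits)
    pie = severity * 0.9 ** (n - 1)
    if new_formula:
        out = [pie * c / sum(credits) for c in credits]  # whole pie distributed
    else:
        out = [(pie / n) * c for c in credits]           # slashed part lost
    out[report_index] *= 1.3
    return out
```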
Previous formula breakdown

test_warden_data_scenario_1
test_warden_data_scenario_2
test_warden_data_scenario_3
test_warden_data_scenario_4
test_warden_data_scenario_5
test_warden_data_scenario_6
test_warden_data_scenario_7
test_warden_data_scenario_8
New formula breakdown

test_warden_data_scenario_1
test_warden_data_scenario_2
test_warden_data_scenario_3
test_warden_data_scenario_4
test_warden_data_scenario_5
test_warden_data_scenario_6
test_warden_data_scenario_7
test_warden_data_scenario_8
Case Study: ZetaChain

Previous formula breakdown
New formula breakdown
Variation

Here's the combined table with "Award actual", "Award new", and "Variation" columns included, based on the provided data for each handle.
Conclusion

The new formula works: the total variation sums to 0, which means the awards have been reallocated as expected.
Happy to see a full example which we can all reason through.
@Simon-Busch Thank you for doing this. That was a long movie, boys and girls 🍿, but I'm happy to see it come to fruition. Now, let's address the … Was the previous formula behavior …
@Simon-Busch I just noticed that the C4 docs describe the old awarding method for partial-credit duplicates. Has this change been reverted, or is the documentation yet to be updated?
@0xEVom (and @dontonka) The docs have not yet been updated with the new awardcalc math; we'll add it as soon as we can. Rest assured the change is being applied to the calculation; we just haven't managed to update the docs with the math yet.
Recently, after performing well in the ZetaChain contest, I was disappointed with my awards. I had made the calculation kind of quickly at the end of PostQA and was hoping to get around 10k, but got only 7k. That got me thinking: how is this possible? So I did the math again, accurately this time, with the following spreadsheet --> zetachain calculation sheet, and confirmed that there was a problem, or at least a misunderstanding on my part that needed to be clarified, as my math was following the documentation.

After a long discussion with C4 staff in a private thread, we identified the source of the misunderstanding. C4 updated their documentation very recently as follows: awards documentation.

Unfortunately, C4 staff and I didn't come to an agreement/understanding, hence why I'm creating this issue so that it can be discussed with the whole C4 community.

Here is what doesn't make sense to me

Short version

One image is worth 1000 words, so here is what the proper value should be. This is good, as it's exactly the case I had in ZetaChain: I had a `High` with 2 dups which were at `partial-25`. What this means at a high level is that this is "almost" a unique High, as the 2 other duplicates are classified `partial-25`, so the judge agrees that while they identify the same issue, they find `very little`, which is why they account for only 25% of a full duplicate. So they should be penalized (and they are at the moment, as expected), BUT the problem is that the primary is not getting those rewards back!! Instead, the `finding's pie is reduced`, so it's like saying this `High` is `worth less` than another High with the same number of duplicates (without partials); in fact, this High almost becomes a sort of Boosted Unique Medium, with its pie at 4.86 (a unique Med pie is 3.9). Does everyone understand how this doesn't make any sense, or is it just me?

The `finding's pie` CANNOT decrease; it's a High, and it is not worth less than another High with the same number of duplicates. By having the pie reduced, the rewards that the primary should have had are simply diluted among all the other wardens.

Long version

Naaa, the short version is enough and crystal clear.
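For completeness, the 4.86 and 3.9 figures above can be reproduced with the current-formula sketch from earlier in the thread (same assumptions: High weight 10, Medium weight 3, 30% report bonus):

```python
# High (10) with two partial-25 dups under the current formula:
pie = 10 * 0.9 ** 2                          # 8.1 for three duplicates
slices = [pie / 3 * 1.3,                     # primary, selected for report
          pie / 3 * 0.25, pie / 3 * 0.25]    # the two partial-25 dups
print(round(sum(slices), 2))                 # 4.86 actually distributed
print(round(3 * 1.3, 2))                     # 3.9, the pie of a unique Medium
```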