Improve flips quality by introducing a "flips quality score" #96
Replies: 7 comments 12 replies
-
Would be useful to have some data on the economic technicalities. For example, will the …
-
We can use the present report method without needing to go to another step. We would only need to explain each button's functionality and the purpose of not clicking them. Option 1: Coherent and imaginative below average = not reported / not approved (gray at present) 1/3 to report …
-
I updated the proposal by adding the concept of a "50% reports conversion rule", which would convert "reports" into "below average" votes for valid flips gathering 50% or more of the reports from the qualification committee.
-
I actually agree with this idea, but why don't people with one bad flip get any reward at all? :( They should at least get paid something, even if it's only 1 idna. What do you guys think?
-
I added two sections to the rationale. The first section tests how rewards are distributed between honest and colluding members in various adversarial scenarios. The second section tests how accurate the flips grading remains in these same adversarial scenarios. |
-
Solid work, thanks. I like that you keep it simple for regular users by making the grading session optional and the flip quality choice binary. I support this proposal and believe it would benefit the network greatly.
-
I came up with a similar experiment from the ground up using a publicly available LLM service and some primitive tools, with a prospective model in mind. It successfully exports attributes suitable for the task, and I'm sure nurturing the model would improve the score.
-
Abstract
Introduce an optional flip grading session during the long validation session. Flips graded above average pay six times more iDNAs than flips graded below average. Accurate grading of flips pays the same amount of iDNAs as accurate reports. Identities are assigned a "flips quality score" that is calculated based on the average grade received by their flips over the last six validated ceremonies. The number of flips an identity can submit depends on its "flips quality score".

9 reasons to support this proposal:
How to support this proposal:
This is a community-driven proposal and it needs your active support for consideration as a formal IIP. The core team works hard to make Idena better every day. Undoubtedly, we can count on the core team to design and implement proposals and innovations that make Idena more competitive and attractive. But having a smart and talented core team doesn't mean that network participants should passively stay on the sidelines. If Idena aspires to become a digital democracy, its participants must become active contributors, which includes voicing concerns and proposing solutions for the core team to assess. If you agree with this proposal, please consider supporting it. Below are a few ways you can actively support this proposal:
Whether or not you agree with this proposal, your active participation will prove to all that Idena is a vibrant digital democracy.
Below are a few quotes to remind us that a democracy can only be a reality if its constituents take an active role in its governance:
Motivation
Motivation 1: Improve flips quality to strengthen AI-resistance
Idena network security relies on the ability of the network to generate flips that are difficult for AI to solve. To be AI-resistant, flip stories need to be unique and unpredictable. One key mechanism that aims to prevent flip stories from being repeated is the "two keywords" rule, which requires flip creators to integrate two randomly generated keywords into the story. Flips that don't contain the two required keywords are more likely to be reported. Despite the two keywords mechanism and additional rules, a large number of flips created are similar and predictable while still being valid as per the flip creation rules. Because of their low quality, these flips are more easily solvable by AI, which creates the potential for network security to be jeopardized. Also, many low-quality flips can be solved by a person without reading and understanding the entire story conveyed in the flip. This makes it easier for one person to validate multiple identities. Valid flips of low quality can be clustered in four main categories:
All the flips above are technically valid. Rather than adding new rules to the long list of existing ones, we propose to introduce a flip grading system that incentivizes identities to create high-quality flip stories. This system will penalize the submission of low-quality flips such as the four types (and more) described above. It will also increase the quality of flips over time and provide greater confidence that the Idena network is truly AI-resistant.
Motivation 2: Force flip farms to produce high-quality flips or have them face significant economic penalties
Flip farms are the main actors responsible for the mass production of low-quality flips. The goal of these actors isn't particularly to lower the security of the network. Rather, it is a consequence of them acting as rational economic actors within the boundaries of what they are allowed to do. There are at least three main drivers motivating flip farms to mass-produce low-quality flips:

With this proposal, flip farms will have to choose between creating higher-quality flips, thus diminishing motivational factors 1 and 2, or keep creating low-quality flips, which will remove motivational factor 3. Indeed, this proposal introduces harsh penalties on flip rewards for actors that create low-quality flips (more about this in the subsequent sections). It also introduces a "flips quality score" that will further reduce the number of flips that can be created by actors who produce low-quality flips.
This dilemma won't be a minor consideration for flip farms since flip rewards represented 46.7% of the total validation rewards distribution for the top 10 pools in the last five epochs.
Also, the share of flip rewards for the top 10 pools is significantly higher than the share of flip rewards for the entire network, which indicates how important of an economic factor flip rewards are to flip farms.
For these reasons, we estimate that continuing to mass-produce low-quality flips under the proposal could be a significant threat to the economic model of the flip farms.
Conducted Research
First, we need to define what is considered a high-quality flip. In the context of Idena, a high-quality flip is one that conveys a meaningful story to humans while being hard for AI to comprehend. The meaningfulness of a flip stems from the coherence in the choice and order of the images. However, coherent stories that are often repeated (such as in meme flips) aren't difficult for AI to solve. For a flip to be truly AI-resistant, the flip narrative needs to be unexpected. This attribute can be reached through human creativity and imagination. Story coherence is a prerequisite for an imaginative story; however, not all coherent stories are imaginative. As such, a high-quality flip is a flip that narrates a story that is both coherent and imaginative.

We put this grading framework to the test on a series of 14 flips (Appendix A) of various supposed qualities to see what the grading distribution would look like. For each of these 14 flips, we graded the coherence and imagination level by responding to a series of questions aiming to assess these two attributes.
Grading of the 14 sampled flips by the degree of coherence and imagination of their stories:
The grading framework mentioned above is useful for precisely ranking each flip, but it can be too complex for participants to use during the ceremony (too many questions to answer for each flip). However, this framework is useful to validate the relevance of the two criteria (coherent and imaginative stories). We will call this framework the "comprehensive grading framework". To facilitate the grading by participants during the ceremony, we propose the following framework, called the "simple grading framework":
The simple grading framework categorizes the flips into four main buckets. Below is how the 14 flips would be categorized following the simple grading framework.
Drawing a horizontal axis (from red to green) representing the increase in the quality of the flips, we can easily determine for each flip whether its quality is below or above the average of the 14 flips.
Based on this research, we propose to reduce the grading options to only two by having participants answer the following question for each flip available for the grading session:
How coherent and imaginative is the story narrated in this flip?
The average is determined relative to all the flips present during the grading session. We think it's important to use comparative grading since each participant will have a subjective appreciation of how coherent and imaginative each story is. It also forces participants to make a distinction between lower- and higher-quality flips regardless of their absolute quality. While the comprehensive and simple grading frameworks are useful to determine a precise flip grade, we recognize that in practice most participants may not have the time or make the effort to rigorously assess each flip criterion by criterion. As such, we want the grading question to be as simple and intuitive as possible by limiting the options to two. The concepts of "coherent" and "imaginative" stories are easily interpretable by any human, and the comparative nature of the ranking doesn't require participants to use either the comprehensive or the simple framework (although the latter is quite fast) to come up with accurate results. These frameworks can be shared as ranking guidelines. See Appendix B for an example of how the question and grading guidelines could be presented during the grading session.
This grading system has the advantage of being simple and intuitive for participants while allowing them to accurately rank flips by their relative quality.
Specification
Grading session
We propose to add a "grading session" following the "report session". Each participant will be allowed to grade the flips that they haven't reported, including the ones that they haven't approved. Participants will not be allowed to grade flips that they have reported. For each available flip, the participant answers the question "How coherent and imaginative is the story narrated in this flip?" by selecting one of two options: "Coherent and imaginative below average" or "Coherent and imaginative above average". Similarly to "report credits", each participant has a limited number of "below average" and "above average" credits. We propose that the number of "below average" and "above average" credits each be a third of the number of flips available during the long session. Participants can skip the grading session altogether or grade as many flips as they wish within the limit of the available credits.

Flips grading consensus
Only flips that don't get reported during the report session get a grade assigned to them. The grade of each flip is determined as such:

There is no minimum committee size required for a flip to be assigned a grade. All flips that haven't been reported have a grade assigned to them.
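As a rough sketch of how a grade could be derived from the committee's votes, the function below applies a simple majority rule over "above average" and "below average" votes. The exact thresholds and the tie-handling are assumptions for illustration, not part of the proposal text; only the three grade buckets ("high-quality", "medium-quality", "low-quality") come from the proposal.

```python
def grade_flip(above_votes: int, below_votes: int) -> str:
    """Assign a quality grade to a non-reported flip from committee votes.

    Illustrative majority rule (an assumption): more "above average" votes
    yields high-quality, more "below average" votes yields low-quality,
    and ties (including no votes at all) fall into the middle bucket.
    """
    if above_votes > below_votes:
        return "high-quality"
    if below_votes > above_votes:
        return "low-quality"
    return "medium-quality"
```

Because no minimum committee size is required, the tie branch also covers flips that received no grading votes at all, which keeps every non-reported flip gradable.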
Flip grading flow chart:
Also, we think there is a particular edge case that could produce odd flip grading results. This edge case occurs when a flip gathers 50% or more of the reports from the qualification committee but does not end up being reported, either because the committee is too small (fewer than three members) or because it needs one more report. In this scenario, it is possible that the flip would end up being graded "high-quality", as the qualification committee members who reported the flip wouldn't be part of the grading committee (for the flip they reported). For the scenario in which a flip has 50% or more reports but is still valid, we propose to convert each report into a "Coherent and imaginative below average" vote. We will reference this rule as the "50% reports conversion rule". The identities who reported the flip would receive flip grading rewards for this flip if it is graded "low-quality" or "medium-quality". Below are a few examples to illustrate how the "50% reports conversion rule" would affect the final flip quality grade:
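A minimal sketch of the "50% reports conversion rule" described above: for a valid (non-reported) flip whose reports reach at least half of the qualification committee, each report is added to the flip's "below average" vote count before the grade is computed. The function name and signature are hypothetical.

```python
def apply_reports_conversion(reports: int, committee_size: int,
                             below_votes: int) -> int:
    """Return the adjusted "below average" vote count for a valid flip.

    50% reports conversion rule: if the flip gathered 50% or more of the
    reports from the qualification committee yet was not reported, each
    report converts into one "Coherent and imaginative below average" vote.
    """
    if committee_size > 0 and reports * 2 >= committee_size:
        return below_votes + reports
    return below_votes
```

For example, a flip with 2 reports out of a 4-member committee and 1 existing "below average" vote ends up with 3 "below average" votes, while a flip with only 1 report out of 4 keeps its original count.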
Reports/grading rewards
Participants must be honest when grading flips. As such, we propose to incentivize honest committee members each time their grades align with the "grading consensus". To achieve this, we propose to turn the "reports rewards" fund into a "reports/grading rewards" fund so as not to create additional coin emission. A flip correctly graded pays the same amount as a flip correctly reported. Flips graded medium pay each committee member half of a correct grade or report. This new rewards distribution also creates more equal access to the reports rewards fund for all participants, as each flip pays a reward, as opposed to only reported flips. How much of the reports rewards fund a participant can access no longer depends on the number of reported flips but rather on the total number of flips present in his/her long session, which tends to be more similar from one participant to another.

Reports/grading payment ratio table:
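The payout rules above can be sketched per member and per flip as follows. The proposal only fixes the ratios (a correct grade pays the same as a correct report; a medium-graded flip pays everyone half); treating a misaligned grade as paying nothing is an assumption, and the function name is hypothetical.

```python
def grading_reward(unit_reward: float, member_grade: str,
                   final_grade: str) -> float:
    """Reward paid to one committee member for one graded flip.

    unit_reward is the payout for a correct report. A flip graded
    "medium-quality" pays every grading member half a unit; a grade that
    matches the consensus pays a full unit; a misaligned grade pays
    nothing (assumption).
    """
    if final_grade == "medium-quality":
        return unit_reward / 2
    if member_grade == final_grade:
        return unit_reward
    return 0.0
```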
Flips rewards
Currently, flip rewards are paid equally for each valid flip regardless of flip quality. We propose to differentiate the reward amount based on the flip grade.

Flips quality score
After validation, an "epoch flip quality score" is calculated for each valid identity. Each flip has a "flip score" associated with it. The "epoch flip quality score" is calculated for each epoch by taking the average of all "flip scores".

In addition to the "epoch flip quality score", an "identity flip quality score" is computed by taking the average of the last six "epoch flip quality scores". For identities that have fewer than six epochs of flip quality data, we take the average of all existing "epoch flip quality scores". The "identity flip quality score" is associated with an identity and is used to determine the flip allowance of this identity for the next epoch.
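The two averages above can be sketched directly; the numeric value of an individual "flip score" (e.g. how a grade maps to a number) is not specified in this text, so the functions below just assume scores are already numeric.

```python
def epoch_flip_quality_score(flip_scores: list[float]) -> float:
    """Epoch score: average of the identity's flip scores for that epoch."""
    return sum(flip_scores) / len(flip_scores)


def identity_flip_quality_score(epoch_scores: list[float]) -> float:
    """Identity score: average of up to the last six epoch scores.

    Identities with fewer than six epochs of data are averaged over
    whatever epochs exist.
    """
    recent = epoch_scores[-6:]
    return sum(recent) / len(recent)
```

Note the sliding window: once an identity has more than six validated epochs, older epoch scores simply drop out of the average, so a past run of low-quality flips stops weighing on the identity after six ceremonies.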
Flip allowances
At the beginning of a new epoch, each identity is provided with a flip allowance, which corresponds to the maximum number of flips the identity is allowed to create. The allowance is determined based on the "identity flip quality score".

Currently, there is a disincentive to create more than 3 flips, as creating more flips increases the chance of having one reported, which means losing 100% of the validation rewards. With the flips quality score, we wouldn't want to discourage high-quality flip makers from creating as many flips as they are allowed to. In addition, the newly proposed system would give low-quality flip makers better odds of not having their validation rewards slashed. For these reasons, we propose to update how reported flips penalize validation rewards so that, no matter their flip allowance, all identities carry an equal risk of validation rewards slashing.
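A sketch of the score-to-allowance mapping could look like the function below. The thresholds and allowance sizes are placeholders invented for illustration; the proposal's actual allowance table is defined elsewhere and is not part of this text.

```python
def flip_allowance(identity_score: float) -> int:
    """Map an "identity flip quality score" to the maximum number of flips
    the identity may create next epoch.

    The 0.8 / 0.5 thresholds and the 5 / 3 / 1 allowances are illustrative
    assumptions, not values from the proposal.
    """
    if identity_score >= 0.8:
        return 5
    if identity_score >= 0.5:
        return 3
    return 1
```

The key property, whatever the concrete numbers, is monotonicity: a higher identity score never yields a smaller allowance, so consistently high-quality flip makers earn the right to submit more flips.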
Rationale
1) CREATES AN ECONOMIC INCENTIVE TO PRODUCE HIGH-QUALITY FLIPS
To better visualize the impact of this flip grading and scoring mechanism, we modeled the difference in flips rewards distribution with and without the proposed design. To this effect, we set the four following identity profiles:
Assumptions of the model:
Impact on flips rewards distribution:
These charts show that the proposed flips scoring system does what it is intended to do, as it distributes a higher share of the flips rewards over time to identities who produce higher-quality flips. Identities who don't improve the quality of their flips are severely penalized by the proposed design.
Impact on flips quality:
These charts show that even in a scenario in which the flips quality rate remains constant, the proposed system generates a higher share of high-quality flips over time through the flip allowances mechanism. That said, the main factor that will drive flip quality higher remains the incentive for low-quality flip makers to correct their behavior.
Evolution of flips quality score and flip allowances:
2) DISTRIBUTES MOST OF THE REPORTS/GRADING REWARDS TO HONEST PARTICIPANTS
The grading system wouldn't be reliable if participants didn't have an economic incentive to report and grade honestly. We wanted to verify, for several adversarial scenarios, to what extent the reports/grading rewards are distributed to honest committee members as opposed to colluding committee members. Below are the different parameters we used to build the different adversarial scenarios:
Committee size: how many committee members are participating in the reports/grading sessions. We tested committee sizes of 4 and 6 members. We assumed the committee size to be fixed for all flips present in a given validation session.
Share of honest members: the share of committee members who report and grade flips honestly. Honest members seek to maximize their report/grading rewards, assuming that the committee majority will be honest. We tested shares of honest members of 3/4, 2/3 and 1/2. We assumed the share of honest members to be fixed for all flips present in a given validation session.
Share of colluding members: the share of committee members who collude by reporting and grading flips in a manner that goes against the report and grading rules/guidelines. In these scenarios, we assumed that colluding members aren't seeking to maximize their report/grading rewards. Instead, the main goal of colluding members is to favor flips of lower quality in an attempt to create a "low flips quality culture" among the network participants. We tested shares of colluding members of 1/4, 1/3 and 1/2. We assumed the share of colluding members to be fixed for all flips present in a given validation session.
Number of flips in the long session: we chose to keep this parameter fixed at 12 flips per long session. Each committee member has 4 report credits, 4 below average votes and 4 above average votes.
Overall flips quality in the long session: long sessions can have a mix of different flips quality levels. We defined three flip quality levels:
Based on these flip quality levels, we tested three different mixes of "overall flips quality" for a long session containing 12 flips:
We could also have tested "very high" and "high" overall flips quality levels but decided not to, because we wanted to keep scenarios with a strong adversarial factor.
Report/grading strategy followed by colluding members: these are the rules the colluding members follow to decide which flips to report and how to grade flips. We assumed that all colluding members follow the same strategy in a given scenario. We defined two colluding strategies:
Colluding strategy 1:
1. Don't report any flip
2. Approve all reportable and low-quality flips
3. Grade high-quality flips as below average until running out of below average votes
4. Grade low-quality flips as above average until running out of above average votes
5. Use remaining above average votes to grade reportable flips as above average

Colluding strategy 2:
1. Report high-quality flips until running out of report credits
2. Approve all reportable and low-quality flips
3. Grade non-reported high-quality flips as below average until running out of below average votes
4. Grade low-quality flips as above average until running out of above average votes
5. Use remaining above average votes to grade reportable flips as above average
In all tested scenarios, colluding members do not report and grade flips in the same order. For instance, the first colluding member will spend his/her report credits by going through the flip slots in sequential order, starting with flip slot 1. The second colluding member will do the same starting with flip slot 2 (one flip slot increment from the preceding colluding member), and so on for the following colluding members. The same incremental logic applies to the use of below average and above average votes.
Report/grading strategy followed by honest members: these are the rules the honest members follow to decide which flips to report and how to grade flips. We assumed that all honest members follow the same strategy in a given scenario. We defined two honest strategies:
Honest strategy 1:
1. Report reportable flips until running out of report credits
2. Approve all high-quality and low-quality flips
3. Grade non-reported reportable flips as below average until running out of below average votes
4. Use remaining below average votes to grade low-quality flips as below average
5. Grade high-quality flips as above average until running out of above average votes
6. Use remaining above average votes to grade low-quality flips as above average

Honest strategy 2:
1. Report reportable flips until running out of report credits
2. Approve all high-quality, low-quality and non-reported reportable flips
3. Grade non-reported reportable flips as below average until running out of below average votes
4. Use remaining below average votes to grade low-quality flips as below average
5. Grade high-quality flips as above average until running out of above average votes
6. Use remaining above average votes to grade low-quality flips as above average until running out of above average votes
7. Use remaining above average votes to grade non-reported reportable flips as above average
In all tested scenarios, honest members do not report and grade flips in the same order. For instance, the first honest member will spend his/her report credits by going through the flip slots in sequential order, starting with flip slot 1. The second honest member will do the same starting with flip slot 2 (one flip slot increment from the preceding honest member), and so on for the following honest members. The same incremental logic applies to the use of below average and above average votes.
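The staggered ordering used for both member types amounts to a simple round-robin offset over the flip slots. A sketch (function name hypothetical, slots 1-based as in the text):

```python
def vote_order(member_index: int, num_slots: int) -> list[int]:
    """Order in which a committee member walks the flip slots.

    Member i (0-based) starts at slot i+1 and wraps around, so each member
    is offset by one slot from the preceding member, as described for the
    report credits and for the below/above average votes.
    """
    return [(member_index + k) % num_slots + 1 for k in range(num_slots)]
```

With 4 slots, member 0 visits slots 1, 2, 3, 4 and member 1 visits 2, 3, 4, 1, which spreads each group's limited credits across the session rather than concentrating them on the first few flips.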
After testing many possible scenarios, it appears that only two parameters have a significant influence on how report/grading rewards are distributed between colluding and honest members. These two parameters are:
Below is a comparison of how much of the available report/grading reward each member type would earn in the different tested scenarios:
Below is a heat map summary of the different scenarios tested:
Key insights:
3) DELIVERS ACCURATE GRADING RESULTS AS LONG AS LESS THAN 51% COLLUDE
We also studied how closely the grade assigned to flips matches their true quality in several adversarial scenarios. For the definition of the parameters used to build the different scenarios, please refer to the previous section. One new notion is introduced for this section:

After testing many possible scenarios, it appears that only two parameters have a significant influence on flip grading accuracy. These two parameters are:
Below is a comparison of how accurate the grading is for each flip quality level in the different tested scenarios:
Below is a heat map summary of the different scenarios tested:
Key insights:
Conclusions:
We are highly confident that this proposed system would increase flips quality over time, as we've shown that:

As long as the network maintains a generalized level of collusion below 51%, these effects would remain in place.
Appendix A
Appendix B
Example of presentation for the grading question:

Example of presentation for the grading instructions: