Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CIP-2551? | Ed25519 Elliptic Curve Group Primitives in Plutus Core #308

Closed
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
271 changes: 271 additions & 0 deletions CIP-2551/CIP-2551.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,271 @@
---
CIP: ?
Title: Ed25519 elliptic curve group primitives
Authors: Morgan Thomas <[email protected]>
Discussions-To: <[email protected]>
Status: Draft
Type: Standards Track
Created: 2022-07-27
* License: Apache-2.0
---

## Preamble

RFC 822 style headers containing metadata about the CIP, including the CIP
number, a short descriptive title (limited to a maximum of 44 characters),
the names, and optionally the contact info for each author, etc.

## Simple Summary

This CIP proposes that Plutus be extended with native code
implementations of certain mathematical operations which are useful for
cryptography. These mathematical operations are used in the implementation
of the Ed25519 public/private key signing protocol. Cardano already
supports a native code implementation of verifying signatures created by
the Ed25519 signing protocol, but it does not enable Plutus to scripts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the Ed25519 signing protocol, but it does not enable Plutus to scripts
the Ed25519 signing protocol, but it does not enable Plutus scripts to

perform the underlying mathematical operations directly. Being able to do
this would be useful for implementing other cryptographic protocols based
on the Ed25519 mathematical operations. Applications include verifying
zero-knowledge proofs, such as in the context of implementing zk-rollups.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the only application that these build-ins would be used for? If yes, would it be preferable to have the ristretto prime order group or would it be fine to work with the group defined over Edwards25519 which has cofactor 8?


## Abstract

Extend Plutus with primitives for performing calculations over the
Ed25519 curve and the corresponding finite field. These operations are
to include the basic group and field operations, as well as multiscalar
multiplication.

## Motivation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would include a paragraph explaining why using SNARKs over pairings is not an option for this use case. My understanding is that KZG10 requires a universal setup, and that is not acceptable. I think a few lines explaining why that is the case would be a nice addition.


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need a motivation of why we use a curve with cofactor 8 (such as the edwards curve) instead of one with cofactor 1 which is usually safer. This is for efficiency, as edwards curves are much more efficient. Nonetheless, I think it is necessary to explain the danger of having a group with cofactor 8. This makes me wonder if we also need built-in functions to check if the points are part of the prime order group or not. What do you think?

CIP-0381 proposes inclusion of the Bls12-381 curve in Plutus. Bls12-381
is a pairing-friendly curve. Not all applications of elliptic curves
require a pairing-friendly curve. Choosing a non-pairing-friendly
curve where a pairing-friendly curve is not required can lead to better
usage of computing power. The Ed25519 curve allows for more efficient
operations as compared to the Bls12-381 curve. Pairing-friendly curves
require the use of more bits in order to achieve security (255 bits
for Ed25519 vs 381 bits for Bls12-381). Therefore it would be helpful,
for those applications which do not require a pairing-friendly curve,
to include a non-pairing-friendly curve in Plutus.

An example application of the Ed25519 curve is in verifying
zk-SNARKs on-chain. Halo 2, for example, can be used with the Ed25519
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many EC operations are needed to implement Halo proof verification? This was a concern already with the Bls primitives: the primitives themselves are already expensive, and if we need many of them to actually perform Halo verification, then we may still not have enough resources to do so. It would be sad to implement these primitives and then find out we still couldn't do the thing we actually wanted to do.

I'd also be interested in knowing how long it takes to verify a typical Halo 2 proof on typical consumer hardware. If it's in the order of milliseconds then we are in trouble.

curve as the underlying curve. In its standard incarnation, Halo
2 uses the Pasta curves (Pallas and Vesta). This is a 2-cycle of
non-pairing-friendly curves. The Pasta curves are specialized to a
particular recursive proof strategy, as described in [Section 4.6 of the
Halo 2 book](https://zcash.github.io/halo2/background/recursion.html/).
For non-recursive proofs, the Pallas and Vesta curves are not required,
and the Ed25519 curve suffices.

Halo 2 over the Ed25519 curve offers short proofs and relatively low
computational complexity of proof verification. These are the
characteristics we are looking for in a zk-SNARK theory optimized for
on-chain proof verification.

Recursive proofs are very important in some applications of zk-SNARKs:
zk-rollups, in particular. Recursive proofs can be homogeneous (verifying
a zk-SNARK produced by theory A in a zk-SNARK produced by theory A), or
heterogeneous (verifying a zk-SNARK produced by theory A in a zk-SNARK
produced by theory B).
Comment on lines +68 to +70
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this terminology of "zk-SNARK produced by theory A". Could you elaborate that a bit more?


Halo 2's recursive Pasta curves proof strategy is heterogeneous: it
uses proofs over the Pallas curve to verify proofs over the Vesta curve,
and proofs over the Vesta curve to verify proofs over the Pallas curve.

Orbis Labs has been researching ways of building recursive proofs
off-chain and verifying them on-chain, for the purposes of zk-rollups. One
way to do this is to use FRI-based zk-SNARKs for homogeneous recursive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a link to FRI commitment schemes here, or at least some context of what they are, would be helpful.

proofs. A downside of this is that proof sizes are expected to be
substantially larger than the Cardano transaction size limit when using
FRI, necessitating multiple transactions to verify a proof. Another
downside of this is that for recursive FRI proofs, it makes sense to use
a circuit-friendly hash (e.g., Poseidon or Rescue), but Cardano does not
currently support these natively. Cardano-friendly hashes are those which
are implemented as Plutus primitives: SHA2-256, SHA3-256, and Blake2b-256.

The issue is that currently there is no hash function which efficient
both in-circuit and on-chain. A relatively simple workaround for this
Comment on lines +87 to +88
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But do we need a hash function which is efficient both in-circuit and on-chain? Why would we need plutus scripts to understand the snark friendly hashes? Wouldn't it suffice to have the plutus scripts verify the proofs without requiring any snark friendly hash computation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I understood. Here you are saying that Cardano would need to understand the snark friendly hashes in order to verify the proof, as FRI based proofs require hash computation during the verify procedure. Is that right?

issue with homogeneous FRI recursion is to use heterogeneous recursive
proofs, where an on-chain proof is generated using FRI with over a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
proofs, where an on-chain proof is generated using FRI with over a
proofs, where an on-chain proof is generated using FRI over a

Cardano-friendly hash function, and it recursively verifies a proof
generated using FRI over a circuit-friendly hash function. With this
change of hash function heterogeneous recursive proving pattern, we can
have efficient hashes both in-circuit and on-chain.

More generally, we can decouple the choice of zk-SNARK theory for on-chain
proof verification from the choice of zk-SNARK theory for off-chain
recursive proof building, as long as we can reasonably efficiently
convert one form of zk-SNARK to the other using recursive proving.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably my lack of expertise in SNARKs is limiting me here (so maybe @davidnevadoc can comment here). Could we build a FRI based SNARK that uses cardano friendly hashes for the proof generation/verification, but at the same time use SNARK friendly hashes for the circuit of the computation we are trying to generate a proof for without requiring any type of recursion?


For efficient off-chain proving, it makes sense to use a FRI-based
zk-SNARK theory, since this leads to relatively fast proving times,
at the cost of larger proof sizes, compared to Halo 2 for example. This
also has the advantage that recursive proofs are relatively simpler to
implement, because they do not require an accumulation scheme over a
2-cycle of curves. FRI allows for homogeneous recursive proving.

For efficient on-chain proof verification, it would make sense to use
an elliptic curve based zk-SNARK theory, such as Halo 2, since this
leads to relatively small proof sizes, potentially allowing for a proof
to verified in a single Cardano transaction. However, realizing this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
to verified in a single Cardano transaction. However, realizing this
to be verified in a single Cardano transaction. However, realizing this

possibility requires primitives for a suitable choice of elliptic curve,
one which is amenable to efficient proof verification.

For realizing Halo 2 proof verification on-chain, we would want primitives
for Pasta curves. However, Pasta curves are not the most ideal choice of
curve for proofs that will not be recursively input to other proofs. When
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hard to parse this sentence.

this is not required, the Ed25519 curve is a better choice because using
it reduces proof size and verification costs. When using FRI for off-chain
recursive proof composition, this requirement is not present. Instead, we
would have a zk-SNARK over an elliptic curve which verifies a FRI-based
zk-SNARK, and that zk-SNARK over the elliptic curve would be the final
proof in the tree of recursive proofs, the one which is posted and
verified on-chain.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this recursion be something like

        ┌─────┐
        │Halo2│
        └──┬──┘
           │
         ┌─┴─┐
         │FRI│
  ┌──────┴┬─┬┴──────┐
  │       │ │       │
┌─┴─┐ ┌───┤ ├───┐ ┌─┴─┐
│FRI│ │FRI│ │FRI│ │FRI│
└───┘ └───┘ └───┘ └───┘

or instead something like

        ┌─────┐
        │Halo2│
  ┌─────┴─┬─┬─┴─────┐
  │       │ │       │
┌─┴─┐ ┌───┤ ├───┐ ┌─┴─┐
│FRI│ │FRI│ │FRI│ │FRI│
└───┘ └───┘ └───┘ └───┘


On-chain Halo 2 proof verification would open the door to on-chain
verification of more or less all zk-SNARKs, and not just those
mentioned. Halo 2 is a flexible enough theory to encode computable
functions and thus the verification algorithms of arbitrary zk-SNARK
theories. Therefore this CIP should be seen as a step towards efficient
on-chain verification of zk-SNARKs in general.

That describes our particular motivation for wanting the Ed25519 curve in
Plutus at Orbis Labs. Furthermore, we believe that the Ed25519 curve is a
broadly useful cryptographic primitive, when one needs a cryptographically
secure elliptic curve but one does not need a pairing-friendly curve or
a 2-cycle of curves.


## Specification

Add to Plutus the following types:

* `Ed25519GElement`: the type of elements of the Ed25519 curve.
* `Ed25519FElement`: the type of elements of the finite field of order
2^255 - 19.

Add the following functions:

* Basic group operations:
* `Ed25519_G_add :: Ed25519GElement -> Ed25519GElement -> Ed25519GElement`
* `Ed25519_G_neg :: Ed25519GElement -> Ed25519GElement`
* Basic field operations:
* `Ed25519_F_add :: Ed25519FElement -> Ed25519FElement -> Ed25519FElement`
* `Ed25519_F_mul :: Ed25519FElement -> Ed25519FElement -> Ed25519FElement`
* `Ed25519_F_neg :: Ed25519FElement -> Ed25519FElement`
* `Ed25519_F_recip :: Ed25519FElement -> Maybe Ed25519FElement`
Comment on lines +153 to +157
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these build in functions be replaced with mod operations using normal Integers? Also, what is recip?

Copy link
Collaborator

@perturbing perturbing Dec 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My take on the recip function, it's the reciprocal (the multiplicative inverse) of the field.

* Vector space operations:
* `Ed25519_scale :: Ed25519FElement -> Ed25519GElement -> Ed25519GElement`
* `Ed25519_scale_and_add :: [Ed25519FElement] -> [Ed25519GElement] -> Ed25519GElement`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is usually called multiscalar_multiplication (I think you even use that terminology at the beginning of the document)

* Conversions:
* `Ed25519_G_serialise :: Ed25519GElement -> ByteString`
* `Ed25519_G_deserialise :: ByteString -> Maybe Ed25519GElement`
* `Ed25519_F_serialise :: Ed25519FElement -> ByteString`
* `Ed25519_F_deserialise :: ByteString -> Maybe Ed25519FElement`
* `Ed25519_F_to_integer :: Ed25519FElement -> Integer`
* `Ed25519_F_from_integer :: Integer -> Ed25519FElement`
* Equality comparisons:
* `eq :: Ed25519GElement -> Ed25519GElement -> bool`
* `eq :: Ed25519FElement -> Ed25519FElement -> bool`
* Identities:
* `Ed25519_G_zero :: Ed25519GElement`
* `Ed25519_F_zero :: Ed25519FElement`
* `Ed25519_F_one :: Ed25519FElement`

This makes for two new types, 16 new functions, and three new constants.

This CIP writes the Ed25519 elliptic curve group in additive notation.
The vector space operations consider the Ed25519 curve as a vector
space over the scalar field `Ed25519FElement`. This implies that there
is a scalar multiplication operation, which is `Ed25519_scale`. This
also implies that we can take linear combinations of vectors of curve
elements, which is what `Ed25519_scale_and_add` does.

The Ed25519 curve is the elliptic curve defined by the equation:

```
-x^2 + y^2 = 1 - (121665/121666) * x^2 * y^2
```

This is a twisted Edwards curve.

Field elements are serialized in big endian notation, taking up 32
bytes each. This leaves one unused bit, which should be set to zero.

Group elements can be serialized in compressed form (just the
x-coordinate), or in uncompressed form (both the x-coordinate and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compressed form is a the x coordinate and a bit defining the y coordinate.

the y-coordinate). A group element encoding consists of 65 bytes in
compressed form, or 129 bytes in uncompressed form. The first byte should
have all bits except the three least significant bits set to zero. Let
b0 denote the least significant bit, b1 the second least significant,
and b2 the third least significant.

* b0 is 1 if and only if the group element is in compressed form.
* b1 is 1 if and only if the element is the point at infinity.
(The rest of the encoding after b1 should be zero in case b1 1 is 1.)
* b2 is 1 if and only if the element is in compressed form and not the
point at infinity and the point is "positive," meaning that its y
coordinate is lexicographically greater than the y coordinate of the
other point on the curve with the same x coordinate.
Comment on lines +204 to +210
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now see where the "extra" byte comes in from above, but it still should be 33 in compressed form and 65 in uncompressed form. However, I don't think this is the most efficient representation. We can use a representation that requires only 32 and 64 bytes. Also, due to the fact that we need to cost functions, I would stick to a single serialised representation (possibly the compressed one). Otherwise it would be hard to cost the function as the cost would differ for compressed and uncompressed group elements.

Comment on lines +204 to +210
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now see where the "extra" byte comes in from above, but it still should be 33 in compressed form and 65 in uncompressed form. However, I don't think this is the most efficient representation. We can use a representation that requires only 32 and 64 bytes. Also, due to the fact that we need to cost functions, I would stick to a single serialised representation (possibly the compressed one). Otherwise it would be hard to cost the function as the cost would differ for compressed and uncompressed group elements.


A compressed group element encoding consists of the one byte
containing the three flags, followed by field element encoding of the x
coordinate. An uncompressed group element encoding consists of the one
byte containing the three flags, followed by two field element encodings,
first the x coordinate and then the y coordinate.

## Rationale

The inclusion of `Ed25519_scale_and_add` is to allow for an efficient
implementation of taking linear combinations of group elements. This
is important for efficient proof verification for Halo 2's inner
product argument. For example, this operation can be implemented using
[Pippenger's algorithm](https://jbootle.github.io/Misc/pippenger.pdf).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or Strauss algorithm would also be an option, depending on the size of the vector size.

Likely, an efficient implementation of this operation would be useful
for other applications of this curve, too.


## Backward Compatibility

There are no expected backward compatibility issues.

## Test Cases

Property test the group, field, and vector space axioms. Also property
test that `Ed25519_scale_and_add` produces the same result as zipping
the lists with `Ed25519_scale` and then folding them with `Ed25518_G_add`.

Provide golden tests for serialization and deserialization. These are unit
tests which provide explicit inputs to serialization and deserialization
and explicit expectations as to the results. Also provide round trip
property tests for serialization and deserialization.

Provide round trip property tests for `Ed25519_F_to_integer` and
`Ed25519_F_from_integer`. The round trip properties for these functions
are that

```
Ed25519_F_from_integer . Ed25519_F_to_integer = id,
```

and that for all integers `x`,

```
Ed25519_F_to_integer (Ed25519_F_from_integer x) ≡ x mod 2^255 - 19.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So from integer computes the modular reduction? If that is the case, it should be mentioned.

```

Provide property tests which check that the equality comparisons are
equivalence relations (relexive, transitive, and symmetric).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
equivalence relations (relexive, transitive, and symmetric).
equivalence relations (reflexive, transitive, and symmetric).


## Implementations

Author is not aware of a Haskell binding to a fast and battle tested
implementation of the Ed25519 curve. The first step to building an
implementation would be to find or build such a binding. After that,
the next steps would be to make Plutus depend on this binding and add
the specified primitives to Plutus.

## Copyright

This CIP is licensed under Apache-2.0 by Orbis Labs (2022).