WIP: Wallet rules for (re)play protection #60

Closed
wants to merge 29 commits into from


phyro
Member

@phyro phyro commented Aug 10, 2020

-> Bob_output
```

The key idea here is to observe that `T2` can't be replayed by anyone else. Only we can create the input needed for the transaction to be replayed `O2`. The attacker could attempt to replay the previous transaction `T` to create the `O2` input, but this would only be possible if they managed to replay transaction `T` which can't be replayed because it has an `anchor` output that has not been spent and hence can't be valid. This means that our transaction `T2` is protected against other parties replaying it.
Contributor

The part "but this would only be possible if they managed to replay transaction T which" is redundant and can be replaced by "but T"


Most common transaction types are:
1. A 1-2 `Regular` tx - 1 input, 2 outputs
2. A 2-2 `PayJoin` tx - 2 inputs (one from each party), 2 outputs
Contributor

Payjoins are characterized by having inputs from both parties, not necessarily one each.
E.g. it's common that you need to combine multiple inputs to have a large enough amount to pay.
You might also want to spend additional unprotected outputs.


Let's define a clear separation of outputs that are protected from those that are not. We define a protected output generator `GenP` that allows us to either `create` a protected output or `check` whether an output is `Protected`.

An output is labeled as `Protected` *only if* it was created in a transaction where we contributed an input that was labeled as `Protected`. An exception to the rule are outputs that are created as a part of the _Bootstrap_ transaction discussed in the next section.
Contributor

@tromp tromp Aug 12, 2020

You shouldn't use an emphasized *only if* when you're going to state an exception right away. Instead you can say something like:
An output is labeled as Protected if it was created in a transaction where we either contributed a protected input, or contributed an anchor output.

There are a few choices for how to label outputs as `Protected` while still being able to identify them across different devices:
1. Call to `create` uses a new derivation path `P` that is used only for creating `Protected` outputs. Similarly `check` uses the same `P` to check whether an output is protected.
2. Call to `create` creates a specific `r` value for `Protected` outputs e.g. they should start with `N` zeros or be divisible by some number `M`
3. Call to `create` uses the ~30 bytes of additional output information that are available in the Bulletproofs to convey whether the output is protected. Perhaps we could define a specific structure for these bytes, e.g. `<scheme_version:1 byte><meta_data:4 bytes><data:25 bytes>`, where `meta_data` would also tell whether the `data` that follows is encrypted or not. The data could be encrypted using the `seed` key and could thus hold information on output labels which would only be available to the owner of the output. We could even include the starting bytes of the anchor that protects it if we wanted to. If we went down this path, we would need to think about the possible drawbacks.
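
As a rough illustration of option 3, the `<scheme_version:1 byte><meta_data:4 bytes><data:25 bytes>` layout could be packed and parsed as sketched below. The function names, the `ENCRYPTED_FLAG` bit and its position in the first metadata byte are assumptions made for illustration only; none of them come from the RFC or from the Bulletproofs library, and encryption of `data` is left out entirely.

```python
SCHEME_VERSION_LEN = 1
META_LEN = 4
DATA_LEN = 25
TOTAL_LEN = SCHEME_VERSION_LEN + META_LEN + DATA_LEN  # 30 bytes in total

# Hypothetical: bit 0 of the first metadata byte says whether `data` is encrypted.
ENCRYPTED_FLAG = 0x01


def pack_label(scheme_version: int, meta: bytes, data: bytes) -> bytes:
    """Pack the three fields into a fixed 30-byte message."""
    if not 0 <= scheme_version <= 0xFF:
        raise ValueError("scheme_version must fit in one byte")
    if len(meta) != META_LEN:
        raise ValueError("meta must be exactly 4 bytes")
    if len(data) > DATA_LEN:
        raise ValueError("data must be at most 25 bytes")
    # Right-pad data with zero bytes so the message length is always the same.
    return bytes([scheme_version]) + meta + data.ljust(DATA_LEN, b"\x00")


def unpack_label(message: bytes):
    """Split a 30-byte message into (scheme_version, meta, data, encrypted)."""
    if len(message) != TOTAL_LEN:
        raise ValueError("message must be exactly 30 bytes")
    scheme_version = message[0]
    meta = message[1:1 + META_LEN]
    data = message[1 + META_LEN:]
    encrypted = bool(meta[0] & ENCRYPTED_FLAG)
    return scheme_version, meta, data, encrypted
```

A fixed-length message keeps every output's rewindable payload the same size, so the label itself does not leak through the payload length.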
Contributor

@tromp tromp Aug 12, 2020

How does that square with the 4 bytes: 0|type|switch_commitment_scheme|derivation_depth
from mimblewimble/grin-wallet#105 ?
We only need a bit or two from these bytes...

Member Author

@phyro phyro Aug 13, 2020

I never dove into bulletproofs so I'm not sure yet. Out of all 3 options, I prefer option 3. (labeling outputs via bulletproofs bytes) if it is possible because it means the label information is directly on the output itself instead of being some external property like a derivation path of the output. My current understanding from the linked issue is that we might have a "use in the future" byte available which is the first one called 0 in the 0|type|switch_commitment_scheme|derivation_depth. I'll poke some people a bit as I'm not sure I'd be able to grasp this completely in a short time frame.

### Simple bootstrapping of protected outputs

We start off with all of our outputs marked as `Unprotected`. To create a `Protected` output out of nothing we create the following outputs:
1. an `anchor` output that has a form `0*H + r*G` - generated from key derivation path `A`
Contributor

@tromp tromp Aug 12, 2020

We should perhaps consider the anchor itself also protected. Conceptually, it's simpler if protected means it cannot be recreated. Or equivalently, if it's an output from a chain of transactions starting at an anchoring (bootstrap) tx. Perhaps it makes sense to say that a protected output is spendable if and only if it has nonzero value.

Member Author

This is a good point. I do however wonder whether this would close the opportunities to use 0 value outputs somewhere else. Zero value outputs are the only way to bootstrap yourself with outputs without an explicit receive by simply attaching them to a passing by transaction - they're the only type of outputs that can be created on either side input/output that don't require to be matched in their v value. My current belief is that decoys can be useful if they have a good spending pattern (similar to how people spend - whatever that looks like). Perhaps a decoy could have a 0 value input and output? I'm not sure whether what I'm saying actually holds value in practice though. But I agree that it would be nice to have all the vertices from the anchor graph be of the same type...

Contributor

@tromp tromp Aug 29, 2020

We can adopt a convention that 0-valued outputs are anchors when protected and decoys when unprotected.
This does require some care though. When we need to send value v (including fees) in a payjoin, we would need to use a protected input of value strictly greater than v, since we want our output to be protected and spendable.
Alternatively, we can say that 0-valued outputs are neither protected nor unprotected. We can reserve those terms for positive valued outputs, and classify 0-valued outputs as either anchor or decoy. Then we still need only a single flag bit to distinguish between output classes, as value is always known separately.

#### Replay protection with utilization of Protected outputs

In order to protect ourselves from replay attacks, we need to follow a simple rule:
**When we are sending money to someone, we _MUST_ always include a `Protected` input as a part of the transaction.** This is to prevent someone from replaying the transaction, which would move money from our outputs. Self-spends are an exception to this rule.
Contributor

Self-spends of unprotected outputs are still problematic. If recreated, they don't behave as other unprotected outputs.

Both protected and unprotected outputs come in two varieties. Signed-for, and not-signed-for. Note that I don't use spent and not-spent. If it's signed-for, it's still subject to play attacks, and should be treated with care.

Member Author

I'm not sure what you mean by "they don't behave as other unprotected outputs"? Let's say I receive an unprotected output O1 from a 1-2 tx. My current understanding is that there are two ways to do a self spend which includes O1:

  1. with another protected output - this makes the new outputs protected and hence safe
  2. without a protected output e.g. using O1 as an input and creating O2 and O3 outputs that are not protected
    From my understanding, in case 2, O2 and O3 have the same properties with regard to replay attacks as O1. It's no different from receiving a 1-3 tx with O2 and O3 being unprotected. I think I didn't make the scenario I described in this comment clear in the document and assumed it was obvious (which it isn't), right? Or did I miss something else?

Regarding your second point, I agree, signed and unsigned is the correct way to think about it to protect from the play attacks - I put the play attacks towards the end of the document.

1. We can make regular 1-2 transaction if our input is labeled as `Protected`
2. We can receive money through a 1-2 transaction

The only thing we need to be aware is that our output that is created in a 1-2 receive transaction will be labeled as unprotected and will hence need to be spent in a transaction that will include one of our `Protected` outputs as an input. A more general rule is that outputs created in a transaction to which we did not contribute a `Protected` input are unprotected and need to be spent along side some of our `Protected` inputs. In theory, it should be impossible to create replay any transaction if this rule was followed.
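
The labeling and spending rules described above can be sketched as two small checks. This is only an illustration: the `Output` type and the function names are hypothetical, commitments are plain strings, and the _Bootstrap_ exception (outputs created together with an anchor) is deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Output:
    commitment: str   # placeholder identifier, not a real Pedersen commitment
    protected: bool


def label_new_outputs(our_inputs: List[Output], new_commitments: List[str]) -> List[Output]:
    """Our new outputs are Protected only if we contributed a Protected input."""
    contributed_protected = any(i.protected for i in our_inputs)
    return [Output(c, contributed_protected) for c in new_commitments]


def may_send(our_inputs: List[Output]) -> bool:
    """Sending rule: a send must include at least one Protected input."""
    return any(i.protected for i in our_inputs)
```

For example, a 1-2 receive contributes no inputs, so `label_new_outputs([], ["O1"])` yields an unprotected `O1`, and `may_send` rejects spending it on its own.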
Contributor

grammo in "to create replay"

#### Protection with wallet history

A wallet can remember which outputs were spent. This way, if a spent output reappears in the wallet, a user is given a choice to either accept it or refresh it through a self-spend transaction.

Contributor

Perhaps a previously-spent-utxo should only show up if a user asks to see them.
We don't want to bother non-expert users with them.
Obviously they would never contribute to the wallet balance.

Member Author

I'm thinking that if we label outputs in the bulletproof (both protected and unprotected) then we have 3 states.

  1. old outputs that don't have labeling in bulletproofs
  2. protected outputs
  3. unprotected outputs

Suppose we create an anchor output and we start labeling all outputs we create by either giving them protected or unprotected label in the bulletproof bytes.

If we have an anchor created and we receive an output that is old, then it must have come from a replay of one of the old transactions, correct? An exception to this would be if someone perfectly timed a wallet restore in which case we wouldn't know which outputs were recently received, but it might work for recognizing past replay attacks and we might be able to recognize and ignore these (or as you said, ask the user to show them). Might even have a metric for these events sent to a public monitoring system to observe the number of such cases.

Contributor

To repeat an observation elsewhere, old outputs that don't have a labeling in bulletproofs should automatically be classified as unprotected. This can be achieved with either a new derivation path or a 1 flag bit instead of a current 0 flag bit.

Contributor

We can use a single flag bit to distinguish between 4 types of output: Decoys are 0-valued 0-flagged, anchors are 0-valued 1-flagged, unprotected are positive-valued 0-flagged, and protected are positive-valued 1-flagged.
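
A minimal sketch of this four-way classification, assuming the value and the flag bit have already been recovered from the output (the enum and function names are illustrative only):

```python
from enum import Enum


class OutputClass(Enum):
    DECOY = "decoy"              # 0-valued, flag 0
    ANCHOR = "anchor"            # 0-valued, flag 1
    UNPROTECTED = "unprotected"  # positive-valued, flag 0
    PROTECTED = "protected"      # positive-valued, flag 1


def classify(value: int, flag: int) -> OutputClass:
    """Map (value, flag bit) to one of the four output classes."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if flag not in (0, 1):
        raise ValueError("flag must be a single bit")
    if value == 0:
        return OutputClass.ANCHOR if flag else OutputClass.DECOY
    return OutputClass.PROTECTED if flag else OutputClass.UNPROTECTED
```

Because the value is always known separately, one bit is enough to separate all four classes, and an output without any flag set falls into the 0-flagged (decoy or unprotected) side by default.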

Member Author

decoy could actually be protected (depends a lot on how we create decoys e.g. if we add an input along with it). Would we make sure that a decoy output is never protected or would we label it as unprotected regardless?

Contributor

The wallet could in principle mark specific decoys as protected, and use them like other protected outputs. But without using additional flag bits, this knowledge would not survive a restore.
After restore, it would appear the same as any other decoy, and must be presumed unprotected.

I think that a wallet that always considers them as unprotected is slightly simpler.
The question is how a wallet would create them in the first place? That will probably determine how best to spend them. For spending them as actual decoys, being protected may not matter.

Member Author

Indeed. Decoys seem to be a whole new topic to discuss separately from this RFC though


### Receive-only wallets

Any kind of automated receiving should default to 1-2 transactions, and thus create _unprotected_ outputs, to avoid a utxo spoofing attack that would reveal our inputs. Always performing 1-2 receive transactions can be achieved by setting the configuration `RECEIVE_PROTECTED_PROB = 0.0`, which is explained in the next section.
Contributor

It's true that a receiver is more at risk of utxo spoofing; a receiver has an inherent incentive to see a tx confirm. The receiver cannot be spoofed if they are the one to finalize a tx.
I remember thinking a while back that only sender finalization makes sense, but now I cannot recall why that should be.
Anyway, with the possibility of receiver finalization, they should absolutely do payjoins to receive. This puts the utxo spoofing risk on the sender, but the sender has already determined that the receiver is worthy of being paid, so they probably perceive a much smaller utxo spoofing risk.

Member Author

Yeah, when I wrote this, I had in mind the current transaction building scheme. I agree about the payjoins in the scenario where the receiver finalizes the tx 👍 One thing I'm still a bit concerned about is the automated payjoins (I described the reasoning here #59 (comment))

// If PayJoin transactions are not wanted due to privacy concerns, you can set this value to 0.0 in which case
// receive transactions will never contribute an input.
// Default: RECEIVE_PROTECTED_PROB = 0.5
RECEIVE_PROTECTED_PROB: 0.5
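
One way a wallet might interpret this setting when receiving is sketched below. The function name and the use of Python's `random` module are assumptions for illustration; a real wallet would make its own choice of randomness source.

```python
import random


def should_contribute_input(receive_protected_prob: float, rng: random.Random) -> bool:
    """Decide whether this receive should contribute a Protected input (PayJoin)."""
    if not 0.0 <= receive_protected_prob <= 1.0:
        raise ValueError("RECEIVE_PROTECTED_PROB must be between 0.0 and 1.0")
    # rng.random() returns a float in [0.0, 1.0), so a setting of 0.0 never
    # contributes an input and a setting of 1.0 always does.
    return rng.random() < receive_protected_prob
```

With `RECEIVE_PROTECTED_PROB = 0.0` every receive is a plain 1-2 transaction that creates an unprotected output; with `1.0` every receive is a PayJoin.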
Contributor

I don't see why 0.5 is better than 1.0 as default.

Member Author

I was thinking about this and it's not yet clear to me what the consequences are. I personally prefer payjoins, but if you default to 1.0, then if you see a 2-2 transaction you can with high confidence assume that the inputs belong to different entities (not really sure what you can do with that data). If you have a 0.5, then it's much less likely that was the case because the probability of it being 2 inputs from the sender gets higher. I've not done analysis on this so I can't say whether it makes sense or not

Contributor

having to assume the inputs belong to different parties seems to be the worst case for a privacy attacker?!

Member Author

@phyro phyro Aug 14, 2020

Huh, that's interesting. From the information theory perspective, you are giving the attacker more information when you default to 1.0, but it seems like it's more likely to be less useful information, so maybe you're right and higher values should be preferred 👍 (I'm still not sure 1.0 is optimal for privacy, but it definitely seems better than 0.5)

Contributor

@lehnberg lehnberg left a comment

Nice start @phyro 👍 I left some comments, I've yet to dig deep in the later sections, but will try to keep myself up to date on your progress.

# Summary
[summary]: #summary

A few months ago, a new class of attacks was found on Grin that we call (re)play attacks. A Mimblewimble transaction that already happened can be replayed if the exact conditions are recreated. In order to be able to replay a transaction the following conditions must be true:
Contributor

I'd avoid referencing time like "a few months ago". Instead write the RFC like it's going to be there for years to come and that it should always read as up to date.

1. The inputs must be in the utxo set
2. The outputs of the transaction must not be in the utxo set (Grin does not allow duplicate outputs in the utxo set)
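
The two conditions above can be expressed as a small check. This is a simplified sketch in which commitments are plain strings and the `Tx` type is hypothetical; kernel and signature validity are taken for granted, since a replayed transaction reuses the original ones.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Tx:
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def can_replay(tx: Tx, utxo_set: Set[str]) -> bool:
    """A past transaction is replayable only if both conditions hold."""
    inputs_present = tx.inputs <= utxo_set            # 1. all inputs are in the utxo set
    outputs_absent = tx.outputs.isdisjoint(utxo_set)  # 2. no output is in the utxo set
    return inputs_present and outputs_absent
```

If one of the transaction's outputs is never spent, condition 2 can never become true again, which is the intuition behind the `anchor` output discussed in this thread.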

This means that if Alice sent some coins to Bob and the outputs have been spent, anyone that saw their original transaction could replay the transaction if the same inputs existed in the utxo set. But why would the same inputs exist on the chain in the first place? While this seems harmless at first, it can with some creativity and careful coordination be used to take someone else's coins without their permission. The attack is not easy to pull off, but is doable in some scenarios. Similarly, a play attack comes from the same reasoning, but with the difference that a transaction never made it to the chain for some reason. One such reason could be that an input was spent before the transaction was broadcast, which would make the transaction invalid at that time. In this document, we propose new wallet behaviours that make the wallet robust in the face of (re)play attacks so the end users can't be victims of these attacks.
Contributor

Currently, the summary focuses on the attacks that were discovered, and then briefly mentions what the purpose of the document is (to propose new wallet behaviours that make it robust in the face of these attacks).

Instead, I'd refocus the summary and make it much more high level and to the point:

  • Given certain conditions, UTXOs that belong to a user can be moved without their explicit permissions.
  • This document outlines the fixes that prevent this behaviour from occurring, specifically x, y, and z.

# Motivation
[motivation]: #motivation

The goal of this RFC is to propose new wallet rules that, when strictly followed, protect the user from all known malicious (re)play attacks. This is done by changing the transaction building process while keeping it configurable enough to still allow the user to take control of their privacy. The solution mostly supports all the transaction building flows that were used before while encouraging default use of PayJoin transactions with some probability. Contributing inputs to a 'receive' transaction can be turned off in the wallet configuration settings while still keeping the protection against replay attacks. The wallet also allows choosing different wallet configurations for different use cases, e.g. a withdrawal from an exchange never does a PayJoin transaction.
Contributor

It's in this Motivation section where I'd outline Play / Replay attacks in their full detail and extent. This is the reason why the document exists. Don't outline any actual solution step or mitigation in this section, instead describe the problem that you are trying to solve in the detail that feels appropriate.



# Community-level explanation
Contributor

Try to write the community level explanation even more high level, and without any diagrams, or naming conventions (i.e. Transaction T etc as this is technical). Instead try to explain what's being implemented on the wallet with words, from an end user's perspective.

For example:

  • We can stop the attacks by making it impossible to recreate the transactions that created the outputs.
  • We do so by creating an output that is never being spent.
  • Outputs that are created at the same time as this one, cannot be replayed.

Member Author

Thanks for the input! Do you think it would be a good idea to move the diagrams to the beginning of the Reference-level explanation or just throw them out? I include them because I find it easier to see what's going on, but it could be that I'm just used to them because I've done quite a few of these. Do they help you guys understand the idea, or not really?

Contributor

Diagrams in general are helpful, I think! Could make sense to move them to reference level 👍



#### Possible protected output generators implementations
Contributor

These are very helpful as outlined. Once you've decided to go in a specific direction, it would make sense to then move the other approaches to the "Rationale & Alternatives" section and explain why you opted not to use them.

@tromp
Contributor

tromp commented Aug 13, 2020

One thing I want to mention is that payjoins allow for the possibility of the receiver paying their share of the fees. It's not clear that we would actually want to make use of that though.
Would people be able to get used to a 1 grin payment resulting in a balance of e.g. 0.991 Grin ?


The difference between Play and Replay attacks is that in a Play attack, the transaction never lands on the chain. This allows some funny scenarios where Alice wants to pay Bob, but after constructing the transaction and broadcasting it, the transaction does not land on the chain for some reason. Let's say that Alice and Bob create a new transaction and Alice uses different outputs. This second transaction goes through and both Alice and Bob are happy. However, if Bob saw the first transaction and it becomes valid at some point, he can 'play' it, i.e. broadcast it to the chain, after the second transaction was already done. This way, Bob receives his payment twice. This means that a Play attack targets the sender by tricking them into signing a transaction multiple times. Transactions that don't make it to the chain should _always_ be cancelled by the user before sending a new transaction.

To protect against this, a user should have an option to cancel a transaction which would label the sender's inputs as `Must use` in the next transaction. The reason why the sender would want to reuse the inputs is to make the previous transaction invalid. So now, only 1 of the two transactions that have been signed can make it to the chain. If the transaction failed to get on the blockchain again and Bob 'gave up', it should be cancelled and an immediate self-spend should be done to prevent giving Bob the possibility of playing the transaction after.
Contributor

This double-spending cancel option must be used with care. We should not reuse an input of the canceled tx as an input for a receiving payjoin, as it exposes the sender to the possibility of unexpected failure (when Bob has the canceled tx confirm after all).
Similarly, such failure is possible if we reuse an input for a spending transaction. So in that case we must watch the chain until the new spend confirms. If it fails due to the original tx having been broadcast/confirmed, then we must be prepared to replace the now bad input by new ones, adjust the offset, and rebroadcast the send.
Clearly this is something only an expert user should be allowed to do.
For casual users, cancel should always lead to an immediate self-spend.

Member Author

Ugh nasty. Will correct to immediate self-spend. Perhaps we might also want to think of a resend option that would reuse the same inputs to recreate the transaction with the same receiver, but this can be thought about later.

@yeastplume
Member

Location of bulletproof message storage is here:

https://github.com/mimblewimble/secp256k1-zkp/blob/master/include/secp256k1_bulletproofs.h

 *          message: optional 20 bytes of message that can be recovered by rewinding with the correct nonce
 ...
    size_t extra_commit_len,
    const unsigned char* message
) SECP256K1

https://github.com/mimblewimble/secp256k1-zkp/blob/master/src/modules/bulletproofs/rangeproof_impl.h

    /* Encrypt value into alpha, so it will be recoverable from -mu by someone who knows `nonce` */
    if (n_commits == 1) {
        secp256k1_scalar vals;
        secp256k1_scalar_set_u64(&vals, value[0]);
        if (message != NULL) {
            /* Combine value with 20 bytes of optional message */
            secp256k1_scalar_get_b32(vals_bytes, &vals);
            for (i=0; i<20; i++) {
                vals_bytes[i+4] = message[i];
            }
            secp256k1_scalar_set_b32(&vals, vals_bytes, &overflow);
        }
        secp256k1_scalar_negate(&vals, &vals); /* Negate so it'll be positive in -mu */
        secp256k1_scalar_add(&alpha, &alpha, &vals);
    }

It's been a good long while since I've looked at this so need to confirm further, but it looks to me as if 24 bytes out of a possible 32 in alpha are currently being used. So 6 left that could be accessed relatively easily.
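
As a reading aid, the byte manipulation in the snippet above can be mimicked as follows. This sketch assumes that `secp256k1_scalar_get_b32` writes the scalar as 32 big-endian bytes; the Python function name is hypothetical and nothing here calls the real library, so the resulting layout should be confirmed against the library before relying on any byte count.

```python
def pack_alpha_payload(value: int, message: bytes) -> bytes:
    """Mimic how the snippet combines a u64 value with a 20-byte message."""
    if not 0 <= value < 2**64:
        raise ValueError("value must fit in 64 bits")
    if len(message) != 20:
        raise ValueError("message must be exactly 20 bytes")
    # A 64-bit value encoded as a 32-byte big-endian number lands in bytes 24..31.
    buf = bytearray(value.to_bytes(32, "big"))
    # Mirrors `vals_bytes[i+4] = message[i]` for i in 0..19: bytes 4..23.
    buf[4:24] = message
    return bytes(buf)
```

Under that assumption the value occupies bytes 24..31 and the message bytes 4..23, which would leave bytes 0..3 untouched by this code path.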

Andrew did mention there may have been a way to get more data into another var (rho, perhaps?) but it would have required a major reworking.. again these are conversations from years ago. Jasper might remember more.

@tromp
Contributor

tromp commented Aug 14, 2020

Let's keep in mind that a possible adoption of BP+ in the future will demand a minimization of
the amount of rewindable data...

@tromp
Contributor

tromp commented Aug 14, 2020

If we want an ultra-concise version of this RFC, then it would look something like:

We propose that wallets follow the following 2 rules to fully protect users from (re)play attacks.

  1. Spend safely
  2. Cancel safely

An anchor is a 0-valued output (created by this wallet) that is never spent.
A safe (from this wallet's viewpoint) transaction is a tx that either creates an anchor, or that spends an output from an earlier safe transaction.
Unsafe receives are allowed, but safe receives (that are necessarily payjoins) are preferred
as long as the receiver gets to finalize, which minimizes the risk of utxo spoofing.

A safe cancel requires an immediate self-spend of an input of the tx to be canceled.
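
The recursive definition of a safe transaction can be sketched over a wallet's own history. The `WalletTx` type and its fields are hypothetical, and the history is assumed to be ordered oldest first and to list only this wallet's own inputs and outputs.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class WalletTx:
    tx_id: str
    spends: FrozenSet[str]        # our outputs consumed by this tx
    creates: FrozenSet[str]       # our outputs created by this tx
    creates_anchor: bool = False  # True for the bootstrap (anchoring) tx


def safe_tx_ids(history: List[WalletTx]) -> Set[str]:
    """Return the ids of transactions that are safe from this wallet's viewpoint."""
    safe_outputs: Set[str] = set()
    safe: Set[str] = set()
    for tx in history:
        # Safe if it creates an anchor or spends an output of an earlier safe tx.
        if tx.creates_anchor or (tx.spends & safe_outputs):
            safe.add(tx.tx_id)
            safe_outputs |= tx.creates
    return safe
```

An unsafe receive stays out of the safe set until its output is spent together with an output from a safe transaction, at which point the spending transaction and its outputs become safe.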

@phyro phyro force-pushed the replay-wallet-rules branch from f794071 to 33de4a9 Compare August 14, 2020 21:47
@phyro
Member Author

phyro commented Aug 14, 2020

Thanks all for the feedback, I pushed changes we discussed for most of the comments

@phyro phyro force-pushed the replay-wallet-rules branch from d7a3d73 to 7fef77e Compare August 15, 2020 14:58
@lehnberg lehnberg added the wallet dev Related to wallet dev team label Aug 17, 2020
@DavidBurkett
Contributor

> It's been a good long while since I've looked at this so need to confirm further, but it looks to me as if 24 bytes out of a possible 32 in alpha are currently being used. So 6 left that could be accessed relatively easily.

It would require a consensus change, but we can get an unlimited amount of data using the method I described for non-interactive transactions. You just use a known rewind key (could just be 0), and the rewound data would be the hash of an additional data structure. That additional data then must be included with the bulletproof, and anyone verifying the BP must also rewind it & verify the additional data's hash. I don't actually know what's involved with a bulletproof rewind to know how much of a performance hit that is, but there are other options for including extra data if rewind is too slow.

@DavidBurkett
Contributor

DavidBurkett commented Aug 23, 2020

Actually, I just realized Andrew already provided us a way to commit to as much data as we want:
https://github.com/mimblewimble/secp256k1-zkp/blob/master/src/modules/bulletproofs/main_impl.h#L192

We can just include extra_commit, which can be a blob with whatever data we want, encrypted with any key we want. When validating the BP, you also validate that the additional encrypted blob stored with it is correct.

Of course, this is still a consensus change. I'm not sure what it is you're trying to accomplish (I'm assuming you want to avoid hardforks), but I figured it might be useful info anyway.

Edit: I just realized @yeastplume may have been the one to add `extra_data`, back when we used it for switch commitments. Here's an old issue about it: mimblewimble/grin#734
Is that field still usable?

# Drawbacks
[drawbacks]: #drawbacks

It requires an `anchor` output that is never spent, which increases the chain size by 700 bytes per wallet. These outputs might be easier to identify because they never move. How easy or hard it would be to identify them is unclear, because each wallet is expected to have only one such output, and many wallets will be lost, so many outputs will never move.
Copy link
Contributor

If Alice receives from Bob on two separate occasions, and both times the transactions trace back to a single output that doesn't move, then it's easy for Alice to see all transactions Bob was involved with. I really don't see how this approach can avoid a catastrophic loss of privacy that would make it even worse than bitcoin for output linkability.

Copy link
Member Author

I've not done any analysis yet so I can't give better estimates - I've had less time and I mostly spent it trying to figure out other things first. I find it unlikely that they will trace back to a single output that doesn't move. Given that people will be transacting with different parties, the outputs should trace back to multiple outputs that don't move (some of which will be anchors) - it becomes especially hard if you have payjoins. I'd like to add that if you don't do payjoins then since you know that all the inputs belong to a single party, you can find all the transactions that created these inputs and know this party was involved in all of them - there is no plausible deniability for input side without payjoins.

@tromp
Copy link
Contributor

tromp commented Aug 23, 2020

> it's easy for Alice to see all transactions Bob was involved with.

Suppose Alice knows that Bob's anchoring tx is A,
and there is a forest of 100 payjoin txs spending outputs of A, including the two spends to Alice.
How does Alice determine which of these 100 txs Bob was involved with?

@DavidBurkett
Copy link
Contributor

@tromp That's not how privacy works. You have to plan for worst case scenarios, not best case. And even with best case scenarios, you're still leaking more metadata than before.

A user should not be required to analyze transaction graphs before spending to know whether they'll have any privacy.

@phyro phyro force-pushed the replay-wallet-rules branch from e62778b to 4b39e29 Compare September 27, 2020 19:28
@antiochp
Copy link
Member

antiochp commented Sep 29, 2020

Say we have a chain of outputs, for the purposes of protecting subsequent outputs with an earlier anchor output.
And say we have an anchor Oa as a sibling of O1.

O1 -> O2 -> O3 -> ... -> On
Oa   

As long as Oa (anchor) is unspent then O1 and subsequent outputs are protected.


If we associate a key derivation path closely with this chain of outputs, the wallet can identify both the anchor output and the subsequent protected outputs.

In the case above, we have the following unspent outputs -

  • Oa
  • On (all previous outputs along the chain are, by definition, spent)

The anchor output is therefore the earliest unspent output in this derivation path.
Any other unspent output(s) on the chain (subsequent to the anchor output, on same derivation path) are protected.

If On is spent, producing On+1, then we know On+1 is also protected (assuming Oa is still unspent).

From the derivation path perspective, if the wallet is producing "protected" outputs by following the anchor output rules, then we can say that for any derivation path, all subsequent keys are protected if there exists an unspent output earlier on the derivation path.

Say we use m/1'/0'/0' (for demonstration) as the derivation path of a particular chain of protected outputs.

Oa -> m/1'/0'/0'/0
O1 -> m/1'/0'/0'/1
O2 -> m/1'/0'/0'/2
O3 -> m/1'/0'/0'/3
etc.

The output associated with m/1'/0'/0'/0 protects all outputs along that derivation path, as long as it remains unspent.
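
The rule above can be sketched as a small check over one derivation path. This is an illustrative Python sketch only; `protected_indices` and the dict layout are assumptions made for the example, not part of any wallet API:

```python
from typing import Dict, List


def protected_indices(unspent: Dict[int, bool]) -> List[int]:
    """Return child indices whose outputs are protected.

    `unspent` maps a child index on one derivation path to True if the
    output at that index is currently unspent. An index counts as protected
    when some strictly earlier index still holds an unspent output, which
    then acts as the anchor.
    """
    protected = []
    anchor_seen = False
    for index in sorted(unspent):
        if anchor_seen:
            protected.append(index)
        if unspent[index]:
            anchor_seen = True
    return protected
```

With `Oa` at index 0 still unspent, every later index is reported as protected; once index 0 is spent and no earlier unspent output remains, nothing on the path is.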

The interesting thing here is Oa does not need to be identified as an anchor in any explicit way - it acts as an anchor simply by being unspent (and earlier in the derivation path).
The only way it can remain unspent is if a pair of outputs are created such that one sibling can remain unspent while the other participates in the chain of protected outputs.

Here Oa and O1 are siblings, with Oa "anchor" and O1 "protected".

An anchor can be introduced at any time, and at any point along the derivation path. It does not need to be the first output on this derivation path. An anchor protects all subsequent outputs on the derivation path.


So we could have a scenario that looks like this -

O1 -> O2 -> O3 -> O4 -> O5 -> ... -> On
Oa                Oa'

In this scenario anchor Oa protects all outputs on this derivation path from O1 onwards. And the later anchor Oa' protects outputs from O4 onwards.

We could then remove, by spending, the original anchor Oa.

What would happen in this case? Output O4 and subsequent outputs would still be protected by Oa'. But earlier outputs, O1, O2, O3 would no longer be protected.

In this new state, the anchor output Oa' provides two functions -

  1. It protects subsequent outputs
  2. It identifies the point along the derivation path where protection starts (marking all earlier keys as unprotected).

If we were to ever receive funds on an invalidated, unprotected key (identified by key derivation path) then we handle this carefully.
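
The anchor-replacement scenario above can be sketched by tracking where protection starts on the path. This is a minimal Python sketch; the index layout (Oa=0, O1=1, O2=2, O3=3, Oa'=4, O4=5, O5=6) and the function names are assumptions chosen for the example:

```python
from typing import Dict, Optional


def protection_start(unspent: Dict[int, bool]) -> Optional[int]:
    """Index of the earliest unspent output on the path, or None if none exist."""
    live = [index for index, is_unspent in unspent.items() if is_unspent]
    return min(live) if live else None


def is_protected(index: int, unspent: Dict[int, bool]) -> bool:
    """A key is protected only if it comes strictly after the earliest anchor."""
    start = protection_start(unspent)
    return start is not None and index > start


# Assumed state: Oa (0) and Oa' (4) unspent, O5 (6) is the unspent tip.
path = {0: True, 1: False, 2: False, 3: False, 4: True, 5: False, 6: True}
before = protection_start(path)  # protection starts at Oa
path[0] = False                  # spend the original anchor Oa
after = protection_start(path)   # protection now starts at Oa'
```

After `Oa` is spent, indices 1 to 3 fall before the new boundary and are treated as unprotected, while index 5 onward remains covered by `Oa'`.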


So maybe one way of thinking about this, if we use derivation paths is -

  • anchors are simply unspent outputs earlier in the derivation path
  • anchors can be created at any time (pair of sibling outputs in a tx)
  • multiple anchors can exist on a single derivation path
  • if we are willing to invalidate earlier keys then we can replace anchors, "spending" old anchors
  • anchors split the derivation path into "protected" (later) and "unprotected" (earlier) keys

Multiple anchors would appear to be good for privacy (they are simply currently unspent outputs).
Ability to spend old anchors would appear to be good for privacy (if we are willing to unprotect/invalidate earlier outputs).


* All of the above ignores merging/combining of derivation paths for simplicity.

@tromp
Copy link
Contributor

tromp commented Sep 29, 2020

Since this RFC is about preventing the recreation of spent outputs, I'm not too keen on methods that undo such prevention, begging the question of how to "handle this carefully."

@antiochp
Copy link
Member

antiochp commented Sep 29, 2020

> Since this RFC is about preventing the recreation of spent outputs, I'm not too keen on methods that undo such prevention, begging the question of how to "handle this carefully."

My understanding is this applies to protected/unprotected bits in the rangeproofs also.
These are only actually protected if the following are true -

  • the anchor remains unspent
  • all intermediate txs follow the necessary conventions

If either of these are accidentally or otherwise broken then the protected flag no longer reflects reality.


I'm not suggesting we should remove old anchors - just that it appears possible to do so in a relatively structured way.
But there needs to be a temporal ordering between outputs to do this and the use of derivation paths could provide this.

@antiochp
Copy link
Member

> I'm not too keen on methods that undo such prevention, begging the question of how to "handle this carefully."

I believe these "no longer protected" outputs would be no different to an output that had never been protected by an anchor - so they could be handled in a similar way.
The scheme above would allow the wallet to easily identify these.

@tromp
Copy link
Contributor

tromp commented Sep 29, 2020

> If either of these are accidentally or otherwise broken then the protected flag no longer reflects reality.

Correct; wallets must strictly enforce the safe spending rules.

@tromp
Copy link
Contributor

tromp commented Sep 29, 2020

> I believe these "no longer protected" outputs would be no different to an output that had never been protected by an anchor

They would be crucially different, in that they are already spent, and this spend is now replayable.
The reason we can create new unprotected outputs, with traditional 1-2 receives, is that we spend them safely, i.e. together with a protected output.
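
The safe-spending rule mentioned here (always spend unprotected outputs together with a protected one) could be enforced at input selection time. The following Python sketch is only an illustration under assumed names (`Output`, `select_inputs`); it is not the wallet's actual coin selection:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Output:
    label: str
    amount: int
    protected: bool


def select_inputs(utxos: List[Output], amount: int) -> List[Output]:
    """Pick inputs for a spend, always including at least one protected output."""
    protected = [o for o in utxos if o.protected]
    if not protected:
        raise ValueError("no protected output available; spend would be replayable")
    # Start with the largest protected output, then top up with unprotected
    # outputs first (largest first), falling back to other protected outputs.
    chosen = [max(protected, key=lambda o: o.amount)]
    rest = sorted(
        (o for o in utxos if o not in chosen),
        key=lambda o: (o.protected, -o.amount),
    )
    for o in rest:
        if sum(c.amount for c in chosen) >= amount:
            break
        chosen.append(o)
    if sum(c.amount for c in chosen) < amount:
        raise ValueError("insufficient funds")
    return chosen
```

Refusing to build a transaction when no protected output is available is one possible way to make the rule strict; a real wallet might instead fall back to creating a new anchor first.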

@phyro
Copy link
Member Author

phyro commented Sep 29, 2020

> > I believe these "no longer protected" outputs would be no different to an output that had never been protected by an anchor
>
> They would be crucially different, in that they are already spent, and this spend is now replayable.
> The reason we can create new unprotected outputs, with traditional 1-2 receives, is that we spend them safely, i.e. together with a protected output.

Note that he compared to those that never had an anchor which I believe is correct then. It's like moving a time window as to when you started protecting your txs and ignoring all the "old" txs that were unprotected, but the window is carefully managed.

@tromp
Copy link
Contributor

tromp commented Sep 29, 2020

> Note that he compared to those that never had an anchor which I believe is correct then.

Yes, they have the same status as those.
But it feels strange to create such outputs when the goal of this RFC is to get rid of them:-(

@DavidBurkett
Copy link
Contributor

Is the plan to support payjoins even with tor transactions? If it is, how do we plan on dealing with the privacy issues I've mentioned with synchronous txs & payjoins? If a user does not manually approve incoming payments, then an attacker can make a bunch of sends to a wallet in order to query the entire list of the wallet's spendable outputs (since each send presumably gets a different payjoin input). They can then cancel those txs, allowing them to learn nearly everything about a wallet for free.
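
To make the snooping concern concrete, here is a toy Python simulation. It assumes a receiver that auto-accepts every incoming send and rotates through its spendable outputs when contributing a payjoin input; `NaivePayjoinReceiver` and `snoop` are hypothetical names, and no real wallet behaviour is implied:

```python
from itertools import cycle
from typing import Iterable, Set


class NaivePayjoinReceiver:
    """Toy receiver that reveals one spendable output per incoming send."""

    def __init__(self, utxos: Iterable[str]) -> None:
        # Assumption: each new send gets the next output in rotation.
        self._next_input = cycle(list(utxos))

    def respond_to_send(self) -> str:
        """Return the output this receiver would contribute as a payjoin input."""
        return next(self._next_input)


def snoop(receiver: NaivePayjoinReceiver, attempts: int) -> Set[str]:
    """Attacker starts `attempts` sends, records each revealed input, then cancels."""
    return {receiver.respond_to_send() for _ in range(attempts)}
```

Under these assumptions the attacker needs only as many cancelled sends as the wallet has spendable outputs, and pays nothing because no transaction is ever finalized.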

@tromp
Copy link
Contributor

tromp commented Sep 29, 2020

Personally, I think the following payjoin scenarios make sense:

  • a user tells their wallet to listen on TOR for a payment from a specific address, optionally with an expected amount.
  • a user tells their wallet to listen on TOR for any donation, but the interactions use an extra round so that the wallet always gets to finalize, and prevent snooping.

@DavidBurkett
Copy link
Contributor

>   • the wallet always gets to finalize

Is this a manual process? If it's not, then a similar problem persists. Instead of it being free to snoop on a wallet, it switches to being cheap to snoop (just send very small amounts).

Each of those scenarios requires huge discussions about UX which seem to be a prerequisite for getting this (already very large) RFC approved. This is starting to feel like the take-it or leave-it kind of comprehensive legislation routinely passed by the US Congress, where we're all forced to accept a bunch of stuff we don't like in order to get a few things that we do. I think it was important and valuable to get this RFC fleshed out so we can see what it would take to fully prevent replay attacks using a wallet-only solution, but I think it's probably time to break this down into more digestible chunks, since this is really just a collection of disparate ideas & techniques.

My personal opinion is that it'd be more productive to first focus on what it will take to get payjoins implemented, and the various security, privacy, and UX challenges that come from those. And even that may be too ambitious (or maybe not, I'm undecided). If we're talking about introducing an extra round for some payjoin transactions, then that alone feels like an RFC. Similarly, listeners that only accept payments from specific addresses and/or for certain amounts also seems like it could potentially be its own RFC. I'm afraid if we don't break this up, and just try to approve it as one big pill we all have to just swallow and deal with, then we might miss some of the negative implications of one of the various ideas being proposed here.

@tromp
Copy link
Contributor

tromp commented Sep 29, 2020

The scenarios I mentioned are indeed outside of the scope of this RFC.

For this RFC, I think payjoin receives should be limited to the invoice workflow;
senders initiating a payment to our wallet would result in non-payjoin receives.

@DavidBurkett
Copy link
Contributor

So since invoices are almost never used, this means the solution to replay attacks is mostly just the addition of anchors, which have been shown to be horrible for privacy when not combined with payjoins (and I still claim even with payjoins, but I digress)?

@tromp
Copy link
Contributor

tromp commented Sep 29, 2020

Invoices will be used more in future. By what measure are anchors "horrible" for privacy?

@DavidBurkett
Copy link
Contributor

I think I've covered that quite well. Outside of payjoins, if I send you your very first grins, and you years later send me some grins, I can see every single spend you made in the interim.

@tromp
Copy link
Contributor

tromp commented Sep 29, 2020 via email

@DavidBurkett
Copy link
Contributor

>   1. No, you can't.

When not using payjoins, and when relying on a single anchor, you absolutely can. I've described how here:
#60 (comment)

>   1. If I send you your first grins now, and later you send me some grins from your change-output-chain, I see the same.

No, you see a fraction of my spends, but not necessarily all of them. I've described the difference here:
#60 (comment)

@tromp
Copy link
Contributor

tromp commented Sep 30, 2020

> I've described how here:

That doesn't describe how to identify the set of all my intermediate spends, which is generally impossible to do, for two reasons:

  1. existence of multiple protected outputs in a wallet
  2. even with a single protected output throughout my spending history, there are generally multiple paths from my first receive to my last spend to you.

@antiochp
Copy link
Member

antiochp commented Sep 30, 2020

> I think I've covered that quite well. Outside of payjoins, if I send you your very first grins, and you years later send me some grins, I can see every single spend you made in the interim.

You can "see" every single spend, but you can only reliably identify these if there is only a single path through the transaction graph from A to B. If multiple paths exist between A and B then this path is obscured.

A': Alice anchor output.
A: Alice output.
B: Bob receives funds from Alice.

Bob knows the anchor A' belonging to Alice and would like to identify all outputs belonging to Alice.


A' and B linked with a single path through a single transaction.
Bob can trivially identify A and the path AB.

A'
  \/
  /\
A    B

A' and B linked with a single path through intermediate transactions.
Bob can identify all intermediate outputs belonging to Alice and the path AAB.

A'
  \/
  /\
A   A
      \/
      /\
         B

A' and B linked with multiple paths through intermediate transactions.
Bob cannot differentiate between paths AAAAB or ACCAB.

       \/
       /\
A'   C    C       
  \/        \/ 
  /\        /\  
A    A    A    A 
       \/        \/ 
       /\        /\
                    B          

I think this requires Alice to interact at least twice with Charlie to introduce an alternative path.
This interaction does not need to be direct and can be indirect through intermediaries (diagram gets kind of complex though).

This presumably breaks down if Bob knows Charlie's anchor and has already determined all of Charlie's protected outputs.
Bob can presumably also analyze multiple paths and determine which likely belongs to Charlie and which to Alice.
But (at least intuitively) this complexity grows quickly with multiple parties and multiple intersecting paths involved.
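
The single-path versus multiple-path distinction can be checked mechanically by counting paths in the transaction graph. The Python sketch below is an illustration only: transactions are modelled as `(inputs, outputs)` pairs with made-up output labels, the starting point is whichever output Bob already associates with Alice, and `build_edges` and `count_paths` are hypothetical helpers:

```python
from functools import lru_cache
from typing import Dict, List, Tuple

Tx = Tuple[List[str], List[str]]  # (input labels, output labels)


def build_edges(txs: List[Tx]) -> Dict[str, List[str]]:
    """Link every spent input to every output of the transaction spending it."""
    edges: Dict[str, List[str]] = {}
    for inputs, outputs in txs:
        for spent in inputs:
            edges.setdefault(spent, []).extend(outputs)
    return edges


def count_paths(txs: List[Tx], source: str, target: str) -> int:
    """Count distinct paths from `source` to `target` in the (acyclic) tx graph."""
    edges = build_edges(txs)

    @lru_cache(maxsize=None)
    def walk(node: str) -> int:
        if node == target:
            return 1
        return sum(walk(nxt) for nxt in edges.get(node, []))

    return walk(source)


# Single path: Alice's outputs A1 -> A2 -> B.
single = [(["A1"], ["A2", "X1"]), (["A2"], ["B", "A3"])]
# Two paths: Alice pays Charlie (C1), then a payjoin mixes A2 and C1.
multi = [
    (["A1"], ["A2", "C1"]),
    (["A2", "C1"], ["A3", "C2"]),
    (["A3"], ["B", "A4"]),
]
```

A count of 1 corresponds to the case where Bob can identify Alice's intermediate outputs; any higher count means at least one alternative route exists, although it says nothing about how easily Bob could rule the alternatives out with extra knowledge.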

@pkariz
Copy link

pkariz commented Sep 30, 2020

> > I've described how here:
>
> That doesn't describe how to identify the set of all my intermediate spends, which is generally impossible to do, for two reasons:
>
>   1. existence of multiple protected outputs in a wallet
>   2. even with a single protected output throughout my spending history, there are generally multiple paths from my first receive to my last spend to you.

If output B is a descendant of output A and both are on the same chain, then a node that stores all the transactions it sees can bruteforce all the possible ways to get from A to B, and one will match: you check the tx where B was made, and the previous transaction is one of the txs which created one of the inputs in this transaction. If all txs are 2-2 and the chain distance between A and B is 5 then you have 2^5 possible solutions to bruteforce which is very small. The anchor is A in this case, but the same problem exists with payjoins unless you somehow converge to n different chains (which would mean that you are sometimes forced to not do a payjoin, or to create multiple outputs) to lower the probability of being on the same chain.
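
The size of that search space can be sketched directly. This Python snippet only illustrates the counting argument, assuming every transaction has the same number of inputs and every input leads back to a distinct parent transaction; `candidate_ancestries` and `search_space` are hypothetical names:

```python
from itertools import product
from typing import Iterator, Tuple


def candidate_ancestries(inputs_per_tx: int, distance: int) -> Iterator[Tuple[int, ...]]:
    """Yield every choice of 'which input to follow' for each backward step."""
    return product(range(inputs_per_tx), repeat=distance)


def search_space(inputs_per_tx: int, distance: int) -> int:
    """Upper bound on backward walks an observer would need to try."""
    return inputs_per_tx ** distance
```

For 2-2 transactions and a distance of 5 this gives 2^5 = 32 candidate walks, each of which can be checked against the stored transactions.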

@antiochp
Copy link
Member

antiochp commented Sep 30, 2020

> If all txs are 2-2 and the chain distance between A and B is 5 then you have 2^5 possible solutions to bruteforce which is very small.

What are you "bruteforcing" here though? There may exist multiple valid paths. All you know is Alice's outputs exist on one of these but you don't know which.

Edit: I think I misunderstood what you were saying. There may be multiple paths though, there is no guarantee you can identify a single path.

@DavidBurkett
Copy link
Contributor

> 2. even with a single protected output throughout my spending history, there are generally multiple paths from my first receive to my last spend to you.

No, there is generally a single path if we only have a single protected output. If we have only 1 protected output throughout spending history, then here's what the history looked like:

  1. Alice receives A1 (unprotected) from Bob.
  2. Alice self-spends to create A' (anchor) and A2 (protected)
  3. Alice receives N - 2 (unprotected) outputs from various parties. Let's call those A3-AN (unprotected)
  4. Now any spend she makes MUST include A2 as an input, in addition to any number of unprotected outputs (A3-AN). This spend creates a change output AC which is protected.
  5. Alice sends coins (B1) to Bob. She must include AC, as well as any number of other unprotected outputs (A3-AN).

Now if Bob knows A' is Alice's anchor, it's trivial to deduce that A2 is Alice's protected output. With very rare exception, there will be a single straight path from A2 to B1 which trivially identifies every single output Alice has spent up to the creation of B1.

The rare exception I mentioned is if you sent coins to someone else who ends up sending coins back to you directly (or indirectly through various other parties). The longer time goes on, the greater the likelihood of this happening, but each time it does, it only adds 1 possible path to explore. It does not grow exponentially.

Now the case where you create multiple protected outputs instead of only having a single protected output is a bit better, but you're still leaking more data than you were before. The requirement to continually tie every single spend back to a common ancestor is an undeniable loss of privacy. I really don't know what else can be said on the matter. It's quite clear that it's a privacy loss, and whether it's catastrophic like I claim, or only a partial loss of privacy, this doesn't seem like the right decision to make for a privacy coin.

@tromp
Copy link
Contributor

tromp commented Sep 30, 2020

> No, there is generally a single path

That is the special case, not the general one.

> The rare exception

It's not.

> The longer time goes on, the greater the likelihood of this happening, but each time it does, it only adds 1 possible path to explore.

It can add many paths each time.

> Now the case where you create multiple protected outputs instead of only having a single protected output is a bit better, but you're still leaking more data than you were before. The requirement to continually tie every single spend back to a common ancestor is an undeniable loss of privacy.

And the option to continually mix with other people's streams when doing payjoin receives is an undeniable gain in privacy.

> It's quite clear that it's a privacy loss

It's quite unclear whether it's an overall gain or loss with payjoins. I think it all ends up as noise when we get some non-trivial amount of aggregation, which in any case is the only real fix for the inherent linkability exposed by the transaction graph.

@phyro phyro force-pushed the replay-wallet-rules branch from b6349a9 to bbfb300 Compare October 7, 2020 08:08
@phyro
Copy link
Member Author

phyro commented Oct 12, 2020

Closing in favor of smaller RFCs

@phyro phyro closed this Oct 12, 2020
Labels
wallet dev Related to wallet dev team
9 participants