Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add eth_signTypedData as a standard for machine-verifiable and human-readable typed data signing with Ethereum keys #712

Merged
merged 58 commits into from
Jun 9, 2018

Conversation

LogvinovLeon
Copy link
Contributor

@LogvinovLeon LogvinovLeon commented Sep 12, 2017

@LogvinovLeon LogvinovLeon force-pushed the master branch 2 times, most recently from fc7bde1 to 3fb8fb4 Compare September 12, 2017 23:00
@danfinlay
Copy link
Contributor

danfinlay commented Sep 12, 2017

Previously, the signing story in the Ethereum space has been plagued by incompatibilities, insecurities, and indecipherability. This proposal moves the signing API a big step towards usability, usefulness, and security, so as far as I can tell, it's just a fantastic feature that I'd be excited to see what developers can do with.

Unless there are major issues caught in this discussion, I'm very open to implementing an experimental version of this for MetaMask as soon as we have tests to help implementers ensure compatibility. We actually already have community members lining up to implement this feature, it's very needed for things like state channels.

@alex-miller-0
Copy link

I would use this. I've never liked the idea of users signing non-ASCII gibberish.

@SilentCicero
Copy link

I like this. I am all for this one.

@andytudhope
Copy link

Also a big fan of this EIP, would be very useful to me 👍

@jamesyoung
Copy link

Yes, very useful. I would use this too.

@iam-peekay
Copy link

Incredibly useful.

@mickys
Copy link

mickys commented Sep 13, 2017

Rather useful when you want to show a proper Signing Request for an off-chain transaction.
That could be a Login request for example, or signing a message that everyone can validate the sender.

@MicahZoltu
Copy link
Contributor

Why personal_* namespace instead of eth_* namespace? I have never really understood the difference, but I kind of always assumed that things in eth_* namespace were more central to the operation of Ethereum while personal_* were more ancillary. To me this feels like something that will eventually be core to Ethereum (state channels) so it feels like it should go in eth_*.


Why is the hashing done in a nested fashion? It seems like just a lot of wasted effort vs hashing everything in one shot like:

string message = 'Hi, Alice!';
unit value = 42;
const hash = keccak256('string message', 'uint value', message, value);
address recoveredSignerAddress = ecrecover(hash, v, r, s);

Since hashes are one-way, there isn't really any value in decomposing them into sub-hashes since you can't unpack the data. A contract wanting to validate the signature would look something like:

contract Foo {
  function apple(string message, uint256 value, message, value, v, r, s) {
    bytes32 hash = keccak256('string message', 'uint256 value', message, value);
    address recoveredAddress = ecrecover(hash, v, r, s);
  }
}

Perhaps the idea is to be able to pre-hash the type signature so contracts don't have to re-hash that part each time? If so, then I recommend having the signature hash as the first parameter, then the parameter values as the 2-n parameters:

contract Foo {
  function apple(string message, uint256 value, message, value, v, r, s) {
    bytes32 hash = keccak256(signatureHash, message, value);
    address recoveredAddress = ecrecover(hash, v, r, s);
  }
}

This avoids hash nesting in most cases in the contract, yet still provides the value-add of pre-hashing the signature.


The EIP should include very clear wording on how the types should be encoded. For example, I think we should only use explicit types like uint256 and not uint. Linking to the Solidity docs is too vague and subject to change. Perhaps there is an EIP that defines the set of valid types? The same argument against linking to Solidity GitHub, and the link you provided doesn't actually give a list of valid type names.


This EIP should explicitly state how strings are encoded. My vote is strongly for UTF-8 as the defacto standard of the internet.


Discussion

I wonder if there is value in supporting nested objects? e.g., arrays, structs, etc. While a flat item list is OK, it does put some upper-bound limits on how useful this signing method can be. If we supported arrays and nested objects I think it opens up most of the doors we would want in the foreseeable future.

@danfinlay
Copy link
Contributor

danfinlay commented Sep 13, 2017

Why personal_* namespace instead of eth_* namespace?

I believe the personal_ prefix was an attempt to separate key-signing/account management operations from state-reading operations. Ideally, an RPC node only provides eth_* endpoints, and the wallet/signer handles the personal_* space. Since it involves signing with keys, the current pattern suggests this belongs in the personal space.

This EIP should explicitly state how strings are encoded.

I'm also in favor of UTF-8 for string encoding.

value in supporting nested objects? e.g., arrays, structs, etc.

I'm in favor of initially launching with whatever is the minimal set of supported types that we're interested in (likely current solidity primitives to start), with more complex types (like nested arrays & structs) being good subjects for later, expansion proposals.

@CJentzsch
Copy link

Very useful!

@tomusdrw
Copy link

Great proposal!

In Parity we don't expose personal_ APIs for web-dapps by default. IMHO personal_ should expect a password for private key to be provided and have their equivalents with no password in eth_ namespace: eth_sendTransaction(transactionRequest) -> personal_sendTransaction(transactionRequest, password)

IMHO the dapps should use eth_signTypedData(typedData) and personal_signTypedData(typedData, password) should be exposed only on more secure transports than HTTP (since it allows for handling scenarios where user is not required to confirm each transaction, like some automated services).

@LogvinovLeon
Copy link
Contributor Author

@MicahZoltu

Personal vs Eth

Dan Finlay already helped me with this one ;)

Hashing

Yes, we want the smart-contract to burn as little gas as possible while remaining secure. I like your second approach with keccak256(signatureHash, message, value). I'll update the EIP.

Types

Kinda agree but would like to get more arguments for this one. As far as I know every type alias maps to exactly one real type and it's well-defined. uint -> uint256. Then why would we disallow uint? Smart contract might also have arguments of those type aliases. Can you point any attack vectors if we allow those type aliases?

I was trying to link to more persistent places, but failed to find any. Solidity doesn't even have the BNF grammar defined. Here is the list of solidity types defined by IntelliJ Solidity Plugin, but it's not official. If someone proves me wrong and finds a better source - happy to include it.

UTF-8

Agree. Will update

Nesting

I'll need to read more into how solidity alligns structs in memory.
There are two approaches. Allowing nested lists (like RLP), or allowing Structs also.
I think - this can be implemented as a non-breaking extension to the existing EIP, so even if we don't agree on this extension now (we should still try), we can postpone it and still have this one.

@tomusdrw I'm talking about Parity Signer. The way Parity Signer communicates with the backing node is more of a Parity implementation detail. As an untrusted DApp I don't/shouldn't have access to user's password, so I'll ask Parity Signer to show user the request and request the password. There is a little bit of confusion here, cause the same set of RPC calls is used to communicate with node and with the signer.

@chriseth
Copy link
Contributor

I think having the ability to sign structured data is a very good step forward!

I'm wondering why you don't just use ABI encoding for the data and the json-ABI-specification for the schema? Note that the latest version also supports arbitrarily nested structures.

This would have the benefit that it is clear which types are allowed, we already have encoding and decoding functions and some of them are even accessible from within smart contracts.

@fabioberger
Copy link

fabioberger commented Sep 13, 2017

Although enabling arbitrarily nested data structures rather then simply an array of typed elements would give developers greater expressive power, it does come at a cost of making it much harder for the Signer UI's to anticipate and properly display the data to the user. Instead of a simple list of name => value pairs, they would need to handle the edge case of displaying a triple-nested object.

In addition, any nested data structure can be flattened to a simple array by using more verbose names for the elements, so it's not that we are depriving a developer from including any value they want hashed.

@chriseth
Copy link
Contributor

@fabioberger there are lots of use-cases where you have an array of pairs. A simple example is creating a multisig wallet and initializing it with owners and their personal daily allowance. Flattening this structure will make it much harder for users to check which allowance corresponds to which owner.

@danfinlay
Copy link
Contributor

IMHO personal_ should expect a password for private key to be provided and have their equivalents with no password in eth_ namespace

I'm not a fan of methods that require a password. Since the web3 API is exposed to untrusted websites, for one to have a password means for a user to be entering their password in an untrusted UI. I think it's much better for the signer to take responsibility for requiring password for personal_* methods, even when the password is not an argument to the function.

should be exposed only on more secure transports than HTTP

That sounds fine to me, but is an implementation detail for the signers. For example, personal_sign on MetaMask is never broadcast over the network, it's just a method name. Same would be true for this.

why you don't just use ABI encoding for the data and the json-ABI-specification for the schema?

The strongest reason I'm aware of to not just use ABI encoding is that since there is no smart contract to enforce rules, enforcing parameter names adds a layer of signed intention to the interaction.

@Arachnid
Copy link
Contributor

I'm not a fan of methods that require a password. Since the web3 API is exposed to untrusted websites, for one to have a password means for a user to be entering their password in an untrusted UI. I think it's much better for the signer to take responsibility for requiring password for personal_* methods, even when the password is not an argument to the function.

This - but also, the account may not even have a password; for instance if it's a hardware wallet.

@MicahZoltu
Copy link
Contributor

The strongest reason I'm aware of to not just use ABI encoding is that since there is no smart contract to enforce rules, enforcing parameter names adds a layer of signed intention to the interaction.

@FlySwatter I don't believe @chriseth was arguing for using ABI encoding only, but rather ABI encoding for the parameter data and then JSON-ABI-Specification for the schemea and supplying them both (as this EIP recommends).

My argument against ABI encoding is that ABI encoding is way harder than the encoding proposed here and generally human unreadable. I personally 😡 at ABI encoding and RLP encoding because they are so opaque to humans, hard to code against, and there generally aren't a lot of libraries available that help with this for many languages. Their advantages, in theory, is data size though I would be curious as to how much is actually being saved and whether that is meaningful in this context.

@ameensol
Copy link

Hi all. If no one has any objections my team is going to try and have this spec tested and shipped as part of MetaMask by next Friday. This is a critical dependency for the state channel auction system we've developed for our upcoming ICO. Specifically, it will prevent attacks where malicious client JS attempts to replace the hash the user is about to sign, which in the worst case could be a real Ethereum TX that sends the attacker all the user's ether/tokens.

Based on the above discussion, our implementation targets (for v1) are:

  1. personal_signTypedData over eth_*
  2. No nesting
  3. No JSON-ABI

@ukstv author of Machinomy will lead development and testing.

He has also offered this schema for a Machinomy payment update as an example for this EIP.

@ameensol
Copy link

Going to take silence as meaning there are no objections.

@sekisanchi
Copy link

sekisanchi commented Sep 16, 2017

Apart from main discussion point, I'm concerned about text coding of long bytes characters won't produce any drawbacks on relevant future implementations. I'd like to see the UTF8 implementation will not, by quick experiment, and hope getting attentions from those who concerned similar in double bytes countries to validate the implementation options.

@Arachnid
Copy link
Contributor

While I appreciate the goal here, I think some assumptions need re-examining; namely that a function name and set of named parameters constitute a user interface that an average user will be able to understand.

I think that evidence shows that this isn't a good user experience, and that the vast majority of users will not understand what they're being shown any more than they understand a long hex string. Just look at how often users ignore things like Android permission dialogues - and those are explained in plain english.

My argument against ABI encoding is that ABI encoding is way harder than the encoding proposed here and generally human unreadable.

I believe the suggestion is to send the ABI encoded data alongside the ABI. This seems like a simple solution to me, and compliant software will already need an ABI en/decoder to be able to function anyway.

@danfinlay
Copy link
Contributor

I agree with @sekisanchi that some double-byte text samples should be included in the test suite for any implementation.

@Arachnid I'm a little unclear, are you saying providing the (method name, param names, and type data) is or is not a good enough UX? Your first paragraph suggests it's not (and I'm all for constant improvement) but then you end By saying ABI data would be sufficient.

Of the two binary encoding options suggested here, I think they both give a similar end-user experience (method name, params names, type data), so we should consider other reasons for choosing an encoding scheme.

I personally could go either way. I agree with Micah that RLP is a bit cumbersome, but also I agree with Nick that every client will have this decoder anyways, so who cares?

I wouldn't mind hearing more points for and against the encoding types, although ultimately I don't think it's a deal breaker either way. For example, price of on-chain decoding I think would be an important metric.

@Arachnid
Copy link
Contributor

I'm a little unclear, are you saying providing the (method name, param names, and type data) is or is not a good enough UX? Your first paragraph suggests it's not (and I'm all for constant improvement) but then you end By saying ABI data would be sufficient.

I'm saying that providing that information is not a good enough UX. There's no way an average user will be able to decipher this data; for many API calls even a developer can't understand the implications without also reading the source code.

I'm also saying that if someone wanted to implement this, JSON ABI + ABI-encoded payload seems like a sensible way to do it, and the dependencies for parsing that are already required by apps.

@jannikluhn
Copy link

This EIP is a big step into the right direction. However, one flaw I see is that it adds unnecessary workload on the smart contract: It has to process and potentially store parameter names and types in human readable form. This increases gas costs and makes the code ugly. Making sure that the user signs only messages she wants to is not job of the blockchain, I don't think contracts should have to do anything with this.

Here's how I think this should work in an ideal world: Allowed message types are defined in the contract's metadata, maybe as part of the ABI or in a separate section. Nodes have the ability to import the metadata of contracts they trust. Calls to the sign method reference one message type and specify the values to sign. The UI can then show argument values, types, names and even additional documentation (which is also part of the metadata). As the metadata is not provided by the potentially malicious dapp, argument names and types don't have to be part of the signed data, only the actual values have to be signed.

Unfortunately, nodes don't know about contract metadata as of today to my knowledge (despite them containing useful information to display for normal transactions as well), so this is approach would take longer to implement. But in my opinion it would be cleaner and also more secure, due to the additional information that can be presented to the user.

Note that I'm not necessarily advocating for this, just wanted to put the idea out there. As I said, this EIP may be good enough for the time being.

@MicahZoltu
Copy link
Contributor

It has to process and potentially store parameter names and types in human readable form

The way it is designed, only the hash of the human readable method/parameter names needs to be stored on-chain. This is enough information to validate the data presented to the user was correct.

@wighawag
Copy link
Contributor

wighawag commented Oct 16, 2018

This is not true, dApps have supported Ledger and hardware wallets long before Metamask added support.

Fair enough, I still think this would be a shame to remove the benefit for users that can rely on a trusted bridge just because some cannot. We should look forward to the future where origin checks are a basic requirement for web3 browser. That's the whole point of making it part of the standard.

Also it is worth pointing that it is the dapps themselves who choose to act as bridge. As such if they want to continue they can simply opt for using message with no origin specified.

Since they provide the bridge it also means that the origin is implicitly check by the way

Which means Ledger would have an implementation which sees origin and completely ignores it, trusting something else to do the verification. This is where origin and optional breaks down

That something else would be the user and while this is basically falling back to the current EIP712, I would not say that it break down the concept entirely. Users would simply start to move to use more trustworthy setup where they can rely on their web3 browser to do the checks for them. The same way as users started to move to use hardware wallet for the extra security it provides.

So the Bob dApp just needs to wait until a user is using a "non-bridge" signer and then is free to spoof the origin.

You would have to agree though that it makes Bob's job a lot harder :) especially since the users would start to understand the benefit of using a web3 browser that can check the origin.

My issue with Optional additions to a standard is that they can lead to edge cases where the developers and users assume they are safe but in some circumstances they are not.

That's a fair point but in my opinion that should not prevent us from adding mandatory origin checks (for those that can obviously) to the web3 browser so we can start to live a safer and more usable future.

Note also my use of the terms "web3 browser" vs "web3 signer". Web3 browser are basically the one that should be capable of checking origin.

@ryanio
Copy link
Contributor

ryanio commented Oct 27, 2018

Hey all, we are looking to implement this EIP into Mist but without an origin or url in the domainSeparator, we feel like the most basic security premise of ensuring you are signing the message for the right dApp is compromised.

Another option is to have a ethereum-lists-like mapping for dApp Name to Contract Address to verify against. Or, a value in the contract itself that specifies the origin or url to verify.

@rmeissner
Copy link
Contributor

@ryanio I don't see why a browser would require the origin. There are many cases where an origin is unnecessary or even makes the verification way harder. One of these cases would be our contract (Gnosis Safe) where we don't really care from where the signature is. Also there are multiple signers in our case, so each signer could have a different origin. This introduces quite some unnecessary complexity.

Also another note on the origin. Currently it feels like that the assumption is made that this is always a website (or a specific url). I think this is also an assumption that will not hold up, as native mobile dapps will start to emerge and at some point probably overtake html5 dapps.

I do agree with @wighawag that supporting an origin provides the possibility to add more security. I also think that if a origin is provided in the signature request then it MUST be validated and if it is not as expected the message MUST NOT be signed (same for chainID actually, where it is SHOULD NOT right now)

For a more readable representation of the sign request (as I think reading the smart contract is just not feasible) I think it would be possible to integrate #719 into this EIP as part of the domain. Again here you could say the description MUST be displayed if provided.

@rmeissner
Copy link
Contributor

@ryanio I also think that making the user aware which dapp requested the signature/transaction is independent from this EIP. E.g. MetaMask is currently not displaying which url triggered the pop-up and just displays it.

@ryanio
Copy link
Contributor

ryanio commented Oct 27, 2018

I also think that making the user aware which dapp requested the signature/transaction is independent from this EIP

Then what is the purpose of domainSeparator?

From the EIP:

The domain separator is designed to include bits of DApp unique information such as the name of the DApp, the intended validator contract address, the expected DApp domain name, etc. The user and user-agent can use this information to mitigate phishing attacks, where a malicious DApp tries to trick the user into signing a message for another DApp.

We wouldn't stop the user from continuing to sign the data, but we would want to warn them.

@rmeissner
Copy link
Contributor

This is the purpose of the domain, but not all information are required (to be exact no information are required, you can use an empty domain). And I also see that this is useful, but I wouldn't make it required.

And my second comment was more, that this EIP enforces this on a contract level. But a lot of phishing attacks can already be prevented by just improving the existing UI/UX. If MetaMask displays me for every transaction which tab/website requested this transaction, I would feel a lot more confident that I don't accidentally approve a transaction from a different site.

@wighawag
Copy link
Contributor

wighawag commented Nov 2, 2018

Hi,

The newest version of the code at https://github.com/wighawag/eip712-origin is now using "_originHash" in the application specific data.

This works but I avoided mentioning this solution before as it mix application specific data with protocol data. It will also require to define a set of reserved word for adding more specific scheme in the future (see below).

The only benefit is that it does not require existing wallet to change the existing signing code.( Is that really an issue since it is in both case backward compatible ?)

Then web3 browser would simply have to check the origin's validity if it is included in the application data. In that case though, we need a naming convention to avoid collision with application data.

Actually we need to provide a convention for new fields too. As @naure mentioned we can simply set a reserved prefix like "_" and no application should use such prefix else they will run the risk of inadvertently enabling a future scheme.

Of course if we separate application data from protocol data as I originally envisioned, we do not need such convention since the protocol data is not mixed with the application data.

@wighawag
Copy link
Contributor

wighawag commented Nov 2, 2018

@rmeissner :

Also another note on the origin. Currently it feels like that the assumption is made that this is always a website (or a specific url). I think this is also an assumption that will not hold up, as native mobile dapps will start to emerge and at some point probably overtake html5 dapps.

The origin concept should still be relevant to native apps. Origin is a general concept that is not necessarily tied to the web or url.

I am not sure what you mean by native dapps. We can separate them in 2 :

You either have a native wallet app that get signature request from other native apps, in which case the operating system could provide the origin (whatever the os can provide to identify the app that requested a signature) to the wallet.

Or you have a built-in wallet in your application (not a good idea in my opinion) in which case the origin is the app itself. Since this require users trusting the app itself, the origin check is implicit.

@recmo
Copy link
Contributor

recmo commented Dec 18, 2018

@wighawag

The newest version of the code at https://github.com/wighawag/eip712-origin is now using "_originHash" in the application specific data.

I'm not entirely sure what the goal is here. In general, per-signature data can be added by creating an envelope:

struct Envelope {
    bytes32 originHash;
    Mail contents;
}

This is cleaner that underscore prefixes and reserved words.

Standardizing on some per-signature data from the user-agent is interesting and worth considering, perhaps as a follow-up EIP.

The example can also be done using the existing domain separator:

    function approve(bytes origin) {
       approvedOrigins[msg.sender][hash(EIP712Domain({
            name: "Ether Mail",
            version: '1',
            chainId: 1,
            origin: origin,
            // verifyingContract: this
            verifyingContract: 0xCcCCccccCCCCcCCCCCCcCcCccCcCCCcCcccccccC
        }))] = true;
   }

     function verify(bytes32 domain, MailWithOrigin mail, uint8 v, bytes32 r, bytes32 s) internal view returns (bool) {
        require(approvedOrigins[mail.from.wallet][domain], "origin not approved by mail sender");
        bytes32 digest = keccak256(abi.encodePacked(
            "\x19\x01",
            domain,
            hash(mail)
        ));
        return ecrecover(digest, v, r, s) == mail.from.wallet;
}

PS: In your example contract you have & 2^256-1. In Solidity^ means xor and not exponentiation, which is **. Also, bitmasking to 256 bits is implied by most operations because it is the word size.

You either have a native wallet app that get signature request from other native apps, in which case the operating system could provide the origin (whatever the os can provide to identify the app that requested a signature) to the wallet.

This is a great generalization of the origin concept. But I'm not sure what information an OS can provide that is not easily spoofed by the requestor.

@wighawag
Copy link
Contributor

@recmo Thanks for the feedback

This is cleaner that underscore prefixes and reserved words.

I agree and that's the way to go. It still require a reserved word for the Envelope type name though. Let's choose one and add it to the standard so we can extend the protocol this way.

The example can also be done using the existing domain separator:

This was mostly how I have done it before, except I can't find anymore the origin in the standard and if I remember correctly it was defined as a string. I think using bytes32 is a better approach as it allow to send only one bytes32 and store allowed origins in one storage slot without hashing. If we standardize the hashing for the origin, the signer can still show the string representation. That;s why I used the word "originHash".

PS: In your example contract you have & 2^256-1. In Solidity^ means xor and not exponentiation, which is **. Also, bitmasking to 256 bits is implied by most operations because it is the word size.

Oups, it is not the first time I use ^ instead of ** :)

This is a great generalization of the origin concept. But I'm not sure what information an OS can provide that is not easily spoofed by the requestor.

This is an OS implementation details and definitely achievable. If the OS is in control of the app<-> app messaging, it can ensure authentic origins. I am not knowledgeable on the current state of OS in that regard but OS can evolve.

@SilentCicero
Copy link

Just like to add a clarification to this EIP. I would like everyone to help determine if an EIP712 hash can simply be the domain separator portion:
https://ethereum-magicians.org/t/eip-712-standards-clarification-primarytype-as-domaintype/3286

This doesn't break anything specified in the standard that I know of and reduces on-chain hash requirements by a magnitude of half, which for highly gas-sensitive on-chain deployments and verification is substantial.

Can we please clarify if the primaryType can be EIP712Domain, and if so, if that allows me to dump the secondary hash, and just construct the Domain hash, and then final prefixed signing hash.

@Revolutions94
Copy link

3 license....how it schame..hahh
The problmes todays souce code same...
This is first issue..

On maker smarct contract..have 8 build from ? Revolutions lol...haha

@Revolutions94
Copy link

Foundation alex..have a same souce code.. of revolution.i have share this issue..but comunity problmes...

@maurelian
Copy link
Contributor

Why doesn't this appear here: https://eips.ethereum.org/erc?

@MicahZoltu
Copy link
Contributor

@maurelian The category is set to Interface not ERC.

Copy link

@Qdigital Qdigital left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@infa-jowoods
Copy link

infa-jowoods commented Nov 30, 2020

This EIP has value and I understand its rationale.

What does this sentence mean though: Hashing structured data is non-trivial and errors result in loss of the security properties of the system.

Hashing "structured" data is no different to hashing unstructured data. Covert to bytes, hash.

How is hashing structured data non-trivial?

Do you mean, having multiple wallets/services hash structured transaction data consistently is non-trivial?

@MicahZoltu
Copy link
Contributor

@infa-jowoods I don't know the answer to your question, but I recommend first seeing if the final EIP contains the answer to your question and if not then asking over in the discussions-to link (at the top of the EIP).

Final EIP: https://eips.ethereum.org/EIPS/eip-712

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.