tip: 104
title: Data Signing Specification
author: federico<[email protected]>
discussions-to: https://github.com/tronprotocol/tips/issues/104
status: Last Call
type: Standards Track
category: TRC
created: 2019-10-31
It not only need to sign the transaction, but also some ordinary bytestrings in practical applications. It is necessary to distinguish the two kinds of data when signing because risks may exist if an attacker tricks users to sign semantically meaningless bytes which can de decoded as a valid transaction. This TIP aims to provide explicit method to distinguish between transaction and ordinary bytestrings when signing.
This is a standard for hashing and signing of data to distinguish the transactions and ordinary bytestrings. Furthermore, to make the users be clear of what the message they has signed instead of opaque bytestrings, we also includes the typed structured data processing as opposed to just bytestrings based on EIP712.
To prevent the potential risk when user is tricked to sign forged message, it's necessary to provide the standard method to distinguish between transaction and ordinary message when signing. Since signed messages are an opaque hex string displayed to the user with little context about the items that make up the message, so we present the scheme to encode data along with its structure which allows it to be displayed to the user for verification when signing.
A signature scheme consists of hashing algorithm and a signing algorithm. The signing algorithm of choice is secp256k1
. The hashing algorithm of choice is keccak256
, this is a function from bytestrings, 𝔹⁸ⁿ, to 256-bit strings, 𝔹²⁵⁶.
There are two kinds of messages, transactions 𝕋
and bytestrings 𝔹⁸ⁿ
. These are signed using signTransaction
and signString
respectively. Originally the encoding function encode : 𝕋 ∪ 𝔹⁸ⁿ → 𝔹⁸ⁿ
was defined as follows:
encode(t : 𝕋) = Protobuf(t)
encode(b : 𝔹⁸ⁿ) = b
Note: Protobuf is an encoding method for serializing structured data.
If an attacker take b = Protobuf(t)
, it can mislead the user to sign a valid transaction as opaque bytestrings. To avoid the problem, we can modify the second encoding function:
encode(b : 𝔹⁸ⁿ) = "\x19TRON Signed Message:\n" ‖ len(b) ‖ b
wherelen(b)
is the ascii-decimal encoding of the number of bytes inb
.
This can solve the problem since Protobuf(t : 𝕋)
in our implementation never starts with \x19
. Actually, the first byte should be \x15
, which denotes the byte length of "TRON Signed Message:\n"
. In order to keep consistent with the current implementation, we will leave it as it is.
For the same reason, len(b) is set as the fixed 32.
Different Hash algorithms are used for transactions and bytestring when signing. Actually, when signing the transaction, SHA256 is used; when signing the bytestrings, Keccak256 is used.
Hash(t : 𝕋) = SHA256(encode(t : 𝕋))
Hash(b : 𝔹⁸ⁿ) = Keccak256(encode(b : 𝔹⁸ⁿ))
The signString
call assumes messages to be bytestrings. In practice we are not hashing bytestrings but the collection of all semantically different messages of all different DApps 𝕄
. Unfortunately, this set is impossible to formalize. Instead we approximate it with the set of typed named structures 𝕊
. This standard formalizes the set 𝕊
and provides a deterministic injective encoding function for it.
Just encoding structs is not enough. It is likely that two different DApps use identical structs. When this happens, a signed message intended for one DApp would also be valid for the other. The signatures are compatible. This can be intended behaviour, in which case everything is fine as long as the DApps took replay attacks into consideration. If it is not intended, there is a security problem.
The way to solve this is by introducing a domain separator, a 256-bit number. This is a value unique to each domain that is 'mixed in' the signature. It makes signatures from different domains incompatible. The domain separator is designed to include bits of DApp unique information such as the name of the DApp, the intended validator contract address, the expected DApp domain name, etc. The user and user-agent can use this information to mitigate phishing attacks, where a malicious DApp tries to trick the user into signing a message for another DApp.
This standard is only about signing messages and verifying signatures. In many practical applications, signed messages are used to authorize an action, for example an exchange of tokens. It is very important that implementers make sure the application behaves correctly when it sees the same signed message twice. For example, the repeated message should be rejected or the authorized action should be idempotent. How this is implemented is specific to the application and out of scope for this standard.
The set of signable messages is extended from transactions and bytestrings 𝕋 ∪ 𝔹⁸ⁿ
to also include structured data 𝕊
. The new set of signable messages is thus 𝕋 ∪ 𝔹⁸ⁿ ∪ 𝕊
. They are encoded to bytestrings suitable for hashing and signing as follows:
encode(transaction : 𝕋) = Protobuf(transaction)
encode(message : 𝔹⁸ⁿ) = "\x19TRON Signed Message:\n" ‖ len(message) ‖ message
wherelen(message)
is the non-zero-padded ascii-decimal encoding of the number of bytes inmessage
.encode(domainSeparator : 𝔹²⁵⁶, message : 𝕊) = "\x19\x01" ‖ domainSeparator ‖ hashStruct(message)
wheredomainSeparator
andhashStruct(message)
are defined below.
This encoding is deterministic because the individual components are. The encoding is injective because the three cases always differ in first two bytes. (Protobuf(transaction)
in our implementation does not start with \x19
.)
The 'version byte' is fixed to 0x01
, the 'version specific data' is the 32-byte domain separator domainSeparator
and the 'data to sign' is the 32-byte hashStruct(message)
.
To define the set of all structured data, we start with defining acceptable types. Here is an example:
struct Mail {
address from;
address to;
string contents;
}
Definition: A struct type has valid identifier as name and contains zero or more member variables. Member variables have a member type and a name.
Definition: A member type can be either an atomic type, a dynamic type or a reference type.
Definition: The atomic types are bytes1
to bytes32
, uint8
to uint256
, int8
to int256
, bool
and address
. These correspond to their definition in Solidity. Note that there are no aliases uint
and int
. Note that contract addresses are always plain address
. Fixed point numbers are not supported by the standard. Future versions of this standard may add new atomic types.
Definition: The dynamic types are bytes
and string
. These are like the atomic types for the purposed of type declaration, but their treatment in encoding is different.
Definition: The reference types are arrays and structs. Arrays are either fixed size or dynamic and denoted by Type[n]
or Type[]
respectively. Structs are references to other structs by their name. The standard supports recursive struct types.
Definition: The set of structured typed data 𝕊
contains all the instances of all the struct types.
The hashStruct
function is defined as
hashStruct(s : 𝕊) = keccak256(typeHash ‖ encodeData(s))
wheretypeHash = keccak256(encodeType(typeOf(s)))
Note: The typeHash
is a constant for a given struct type and does not need to be runtime computed.
The type of a struct is encoded as name ‖ "(" ‖ member₁ ‖ "," ‖ member₂ ‖ "," ‖ … ‖ memberₙ ")"
where each member is written as type ‖ " " ‖ name
. For example, the above Mail
struct is encoded as Mail(address from,address to,string contents)
.
If the struct type references other struct types (and these in turn reference even more struct types), then the set of referenced struct types is collected, sorted by name and appended to the encoding. An example encoding is Transaction(Person from,Person to,Asset tx)Person(address wallet,string name)Asset(address token,uint256 amount)
.
The encoding of a struct instance is enc(value₁) ‖ enc(value₂) ‖ … ‖ enc(valueₙ)
, i.e. the concatenation of the encoded member values in the order that they appear in the type. Each encoded member value is exactly 32-byte long.
The atomic values are encoded as follows: Boolean false
and true
are encoded as uint256
values 0
and 1
respectively. Addresses are encoded as uint160
. Integer values are sign-extended to 256-bit and encoded in big endian order. bytes1
to bytes31
are arrays with a beginning (index 0
) and an end (index length - 1
), they are zero-padded at the end to bytes32
and encoded in beginning to end order.
The dynamic values bytes
and string
are encoded as a keccak256
hash of their contents.
The array values are encoded as the keccak256
hash of the concatenated encodeData
of their contents (i.e. the encoding of SomeType[5]
is identical to that of a struct containing five members of type SomeType
).
The struct values are encoded recursively as hashStruct(value)
. This is undefined for cyclical data.
domainSeparator = hashStruct(tip104Domain)
where the type of tip104Domain
is a struct named TIP104Domain
with one or more of the below fields. Protocol designers only need to include the fields that make sense for their signing domain. Unused fields are left out of the struct type.
string name
the user readable name of signing domain, i.e. the name of the DApp or the protocol.string version
the current major version of the signing domain. Signatures from different versions are not compatible.uint256 chainId
the chain id. The user-agent should refuse signing if it does not match the currently active chain.address verifyingContract
the address of the contract that will verify the signature. The user-agent may do contract specific phishing prevention.bytes32 salt
an disambiguating salt for the protocol. This can be used as a domain separator of last resort.
Future extensions to this standard can add new fields with new user-agent behaviour constraints. User-agents are free to use the provided information to inform/warn users or refuse signing.
By adding the bytestrings with given prefix when signing, it's effective to prevent a malicious attacker from tricking the user to sign opaque bytestrings which can be decoded as a transaction. Furthermore, to make the users be clear of of what message they have signed, we also includes the typed structured data processing method, which can be more secure and convenient.
None
None