Yet another cool checksum address encoding #55
Comments
This is a very nice idea. I wonder about the |
Hmm, I'm fine either way, though I definitely see the rationale for standardizing one way or the other. |
I saw comments on the TurboEthereum guide that suggested that we were moving away from raw hex keys into ICAP keys:
"Notice the last two lines there. One is the ICAP address, the other is the raw hexadecimal address. The latter is an older representation of address that you'll sometimes see and is being phased out in favour of the shorter ICAP address which also includes a checksum to avoid problems with mistyping. All normal (aka direct) ICAP addresses begin with XE so you should be able to recognise them easily." My concern is that if there was a previous decision to start moving to ICAP, I'm not sure if this will add confusion. However, if this helps give raw hex addresses a checksum I guess that can only be beneficial, even if everyone wants to move to ICAP eventually. |
My preference is that a checksum-enabled Ethereum address is immediately recognizable as such. The proposed solution is not immediately recognizable as being distinct from a standard Ethereum address and could be confused for being a strangely-cased version of non-checksummed addresses. Although it offers superior backwards compatibility, I believe it will only cause additional confusion to the end-user. Since the change in format serves to make the address less error prone through checksums, I posit they should also be immediately recognizable through a fixed prefix or otherwise obvious identifier. One reason why I prefer ICAP over this proposed solution is that it signals to the user clearly that this is an Ethereum address and cannot be confused with a transaction/block hash. |
Just saw this proposal now. I disagree @tgerring that it will cause confusion: to a layman, it will be indistinguishable from a normal address. This approach is very easy to implement on the client side and doesn't require much. I would say this could be adopted as a great intermediary before ICAP – also would be a good alternative if ICAPs don't catch on. |
I did a rudimentary implementation in JavaScript in the web3 object:
var isAddress = function (address) {
if (!/^(0x)?[0-9a-f]{40}$/i.test(address)) {
// check if it has the basic requirements of an address
return false;
} else if (/^(0x)?[0-9a-f]{40}$/.test(address) || /^(0x)?[0-9A-F]{40}$/.test(address)) {
// If it's all lowercase or all uppercase, return true
return true;
} else {
// Otherwise check each case
address = address.replace('0x','');
// creates the case map using the binary form of the hash of the address
var caseMap = parseInt(web3.sha3('0x'+address.toLowerCase()),16).toString(2).substring(0, 40);
for (var i = 0; i < 40; i++ ) {
// the nth letter should be uppercase if the nth digit of casemap is 1
if ((caseMap[i] == '1' && address[i].toUpperCase() != address[i])|| (caseMap[i] == '0' && address[i].toLowerCase() != address[i])) {
return false;
}
}
return true;
}
};
/**
* Makes a checksum address
*
* @method toChecksumAddress
* @param {String} address the given HEX address
* @return {String}
*/
var toChecksumAddress = function (address) {
var checksumAddress = '0x';
address = address.toLowerCase().replace('0x','');
// creates the case map using the binary form of the hash of the address
var caseMap = parseInt(web3.sha3('0x'+address),16).toString(2).substring(0, 40);
for (var i = 0; i < address.length; i++ ) {
if (caseMap[i] == '1') {
checksumAddress += address[i].toUpperCase();
} else {
checksumAddress += address[i];
}
}
console.log('create: ', address, caseMap, checksumAddress)
return checksumAddress;
};
It works internally and it's almost invisible to the user. I don't really see a good reason not to implement it.
And here the results including
|
You're hashing the hex and not the binary. |
Good catch, I switched to the sha3 of the binary but the results still won't match. I'm a bit confused on what you meant by For example:
I suppose I am misunderstanding what you are using as input. PS: you can probably simplify your example by not checking for letters: you can do uppercase conversions on numbers, and although there is such a thing as lowercase digits, they are represented the same |
By "binary" I meant "just the raw bytes, not any kind of encoded representation". There's also the special chars ¹²³⁴⁵⁶⁷⁸⁹⁰ I suppose, but that's not backwards-compatible anymore. |
I initially like this quite a bit. All of the cons that I see are extreme edge cases and I think that it's pretty trivial for library authors to handle gracefully. I like the backwards compatibility, the compatibility with existing hex parsing utilities. |
I'm not sure if web3.js converts to bytes. Also, pure JavaScript only supports binary conversion up to a hard limit, any larger and I had to use the |
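The conversion limit mentioned above can be sidestepped by converting one hex digit at a time instead of parsing the whole hash as a single number. The sketch below is only an illustration: `hexToBits` is a hypothetical helper name, not part of web3.js. Going through `parseInt(...).toString(2)` on the full value also drops leading zero bits, which shifts every later bit position; the per-digit version keeps them.

```javascript
// Hypothetical helper (not a web3.js function): convert a hex string,
// with or without a 0x prefix, into a bit string of 4 bits per hex digit.
function hexToBits(hex) {
  const digits = hex.replace(/^0x/i, '');
  if (!/^[0-9a-fA-F]*$/.test(digits)) {
    throw new Error('not a hex string');
  }
  let bits = '';
  for (const ch of digits) {
    // Each digit fits easily in a Number; pad so leading zeros survive.
    bits += parseInt(ch, 16).toString(2).padStart(4, '0');
  }
  return bits;
}

console.log(hexToBits('0x0f'));                 // 00001111
console.log(parseInt('0x0f', 16).toString(2));  // 1111 (leading zeros lost)
```

With a 64-digit hash as input, the result is always a 256-character string, so indexing the nth bit is safe.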
Mathematically speaking it would be a bit ugly imo.
Yeah, I had this problem; for one of my example gambling dapps where I was using a hash-commit-reveal protocol I took an existing sha3 impl; you could do the same: https://github.com/ethereum/dapp-bin/blob/master/serpent_gamble/scripts/sha3.min.js |
I see some problems with ICAP's variable length and low checksum bitsize: "XE7338O073KYGTWWZN0F2WZ0R8PX5ZPPZS": This is a 30-character address, IBAN compatible, based on the "Direct approach" from https://github.com/ethereum/wiki/wiki/ICAP:-Inter-exchange-Client-Address-Protocol Now, if you enter such an address, and accidentally add another character somewhere, you have created a "Basic" address (incompatible, but allowed and valid in the Ethereum ICAP implementation). The problem is that naively, without knowing all properties of the checksum algorithm, there is a 1% chance this will pass validation, and consequently you are sending money into a black hole. On the topic of checksums in hex addresses: I agree that there should be some easy identification mechanism to separate it from an unchecked address. Alternatives might include: This makes it not completely backwards compatible, but incredibly easy to edit to satisfy a legacy system without any checksums. |
@simenfd "there should be some easy identification mechanism to separate it from an unchecked address." I disagree. I think the whole point of this scheme is that it's completely backwards compatible. There's no point in separating them. In my implementation, if the address is all uppercase or all lowercase then it is assumed to be an unchecksummed address. In a 40-char address, there will be on average 15 letters, and the chance of all of them being the same case is 1:16384, so I guess it's strong enough. |
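The 1:16384 figure above can be reproduced with a few lines. This is a minimal sketch that assumes exactly 15 letters whose cases are independent and equally likely; `hasUniformCase` is a hypothetical name for the "all uppercase or all lowercase" test described in the comment.

```javascript
// Hypothetical helper: true when every letter in the address shares one case,
// which the comment above treats as "not checksummed".
function hasUniformCase(address) {
  const body = address.replace(/^0x/i, '');
  return body === body.toLowerCase() || body === body.toUpperCase();
}

// Chance that 15 independently cased letters all land in the same case:
// either all upper or all lower, hence the factor of 2.
const letters = 15;
const pUniform = 2 * Math.pow(0.5, letters);
console.log(1 / pUniform); // 16384
```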
That was exactly my line of thinking as well. It's safe enough to assume that all caps or all lower addresses are not checksummed. |
The backwards compatibility is nice but IMO presents a clear danger: If the user believes that the address has a checksum she might be willing to input an address by hand. If she then happens to use an old version of transaction handling that just parses the hex ignoring the case then her funds are lost in the case of a typo. For this reason my feeling is that I prefer a scheme that would make a normal hex parser throw an error, rather than a user thinking she's protected by a checksum when in fact she is not. |
@christianlundkvist that's a good point, which can be solved with UI: show red when it fails, show yellow when it's not checksummed. |
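The red/yellow idea could look roughly like the sketch below. It is an assumption-laden illustration: `classifyAddress` is a hypothetical name, the checksum routine is passed in by the caller as `toChecksum` rather than taken from any particular library, and the "green" state for a passing checksum is added here for completeness.

```javascript
// Hypothetical UI helper. `toChecksum` is a caller-supplied function that
// returns the checksummed form of a lowercase, 0x-prefixed address.
function classifyAddress(address, toChecksum) {
  // Not a 40-digit hex address at all: show red.
  if (!/^0x[0-9a-fA-F]{40}$/.test(address)) return 'red';
  const body = address.slice(2);
  // Single-case addresses carry no checksum information: show yellow.
  if (body === body.toLowerCase() || body === body.toUpperCase()) return 'yellow';
  // Mixed case: the casing must match the checksum exactly.
  return toChecksum(address.toLowerCase()) === address ? 'green' : 'red';
}
```

Passing the checksum function in keeps the colour logic independent of which hash implementation the wallet uses.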
@christianlundkvist Exactly my point: false security might be more dangerous than no security. E.g. when I enter a bitcoin address by hand (yeah, quite rarely), I am quite confident that the system will capture an error with the 32-bit checksum that is universally implemented there; I wish I could have this confidence in Ethereum as well. For fun, I tried to make some ICAP addresses, using the functions in the go-ethereum implementation. The first two in bold are the original address and its ICAP form; the remaining are all ICAP mutation-addresses that validate but, of course, are different addresses. XE1222Q908LN1QBBU6XUQSO1OHWJIOS4603 |
@alexvandesande: My main point was that backwards compatibility allows you to use the address in a dapp that was created before this EIP. So the UI in this case wouldn't know anything about checksummed addresses and wouldn't give the user any specific warning. If the user receives an address like |
I'd like to challenge the idea that we should pay much attention to the "type it in by hand" use cases. If the ecosystem matures then we'll have good tooling around QR-code based transmission of addresses or something else that's even better UX.
The only way to avoid this situation is to have checksummed addresses be backwards incompatible. I'm of the opinion that backwards incompatibility is worse than cases where someone burns ether using an app that doesn't implement checksumming using an address that "looks" like it's checksummed. I think this situation is likely to be rare and to largely apply to using old software from before the checksum days, or poorly written software. |
In that case do you think we should not worry about checksumming at all? Are there other scenarios where checksums are used?
I feel like this would be preferred.
My view is that the moment the checksum is introduced a majority of software becomes old software, and people are notoriously slow at updating too... |
My point was that I believe the type-by-hand use case is a small corner case where the user is potentially already doing something questionable. We can still apply checksums to these, but I am of the opinion that we don't need to cater to this use case. As for the other stuff, I don't have very strong opinions on the matter. Backwards compatibility seems nice but I see the validity in the idea that a breaking change is also a way to achieve a level of security in the area since it removes ambiguity. |
I don't believe we can expect any users to realize the difference between a checksummed address and a normal one (most people don't realize this even for bank accounts when the last digit is separated like 12345-7); this is not the point of the checksum. The point of backwards compatibility is that transactions between checksum-enabled wallets are safer. If you make a typo in a non-checksum-enabled wallet you'll lose your ether, just like you do now, and it's that particular wallet developer's job to make that client more secure. Also, I don't think copying by hand is the main situation here; if we were trying to optimize that then we should be talking about pseudo-word seeds and name registries. Checksums are just extra security against accidental typos and letters that were cut out by copying the wrong digit, and are an extra assurance to the user that the address is still intact, just like the icon is. I don't really see any disadvantage in adding these, as they were very simple to implement in web3.js. I still haven't matched the initial implementation, though, probably because basic primitives in Python are very different from what JavaScript comes up with. Since a lot of implementations will be JavaScript I still think it makes more sense to use the sha of the hex, since that's how it comes to the library.
|
I don't really feel very strongly either way TBH and the design of this particular checksum scheme is actually super cool. 😊 |
Any reason we don't use good old base 58? |
Jonathan: This would break backwards compatibility. We already have a proposed standard without backwards compatibility that adopts more characters; it's called IBAN.
|
Just chiming in as a web2 dev mostly being an observer (of your work and of end users discussions): If you look at the Ethereum subreddit these days there are a ton of new adopters with no tech experience at all trying to find out how to use Ethereum. In short, I believe anything including typing addresses by hand should be expected. I remember seeing twitter pinned tweets in 2014 with images (not text) of dogecoin addresses for charities etc. A lot of adopters may barely know their way around a computer at all, and I think if you accomplish retaining them you are a raging success and have what is needed for mass adoption. |
Agree. And adding a case sensitive checksum increases security for those cases, while being invisible for implementations that don't support it
|
I am curious what the Java implementation of this is? |
You'll find the correct specification and example implementations at the file here: https://github.com/ethereum/EIPs/blob/master/EIPS/eip-55.md. The file also includes an adoption table to help track the adoption of EIP-55 checksums in the ecosystem. We're going to close this issue now. If any corrections need to be made (or to update the adoption table), please open a PR on the file. |
You should edit the example code and test vectors in the first post. It is wrong and someone who does not read the whole conversation will use the incorrect implementation. |
This EIP is now located at https://github.com/ethereum/EIPs/blob/master/EIPS/eip-55.md. Please go there for the correct specification. The text in this issue may be incorrect or outdated, and is not maintained. |
@cdetrio can you push the "official test suite" into the EIP? I believe it is this one: #55 (comment) |
Java checker of ethereum address |
Current python3 eth-utils implementation
Output is
|
Thanks |
|
EDITOR UPDATE (2017-08-24): This EIP is now located at https://eips.ethereum.org/EIPS/eip-55. Please go there for the correct specification. The text below may be incorrect or outdated, and is not maintained.
Code:
In English, convert the address to hex, but if the ith digit is a letter (i.e. it's one of abcdef) print it in uppercase if the ith bit of the hash of the address (in binary form) is 1, otherwise print it in lowercase.
Benefits:
The average address will have 60 check bits, and less than 1 in 1 million addresses will have less than 32 check bits; this is stronger performance than nearly all other check schemes. Note that the very tiny chance that a given address will have very few check bits is dwarfed by the chance in any scheme that a bad address will randomly pass a check.
UPDATE: I was actually wrong in my math above. I forgot that the check bits are per-hex-character, not per-bit (facepalm). On average there will be 15 check bits per address, and the net probability that a randomly generated address if mistyped will accidentally pass a check is 0.0247%. This is a ~50x improvement over ICAP, but not as good as a 4-byte check code.
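The corrected figures in the update can be checked with a short calculation. This is a sketch under a simple assumed model: each of the 40 hex digits is uniformly random, a digit is a letter with probability 6/16, and a mistyped letter lands in the right case half the time. The variable names are illustrative only.

```javascript
const HEX_DIGITS = 40;
const pLetter = 6 / 16; // a-f out of the 16 hex digits

// Expected number of letters, i.e. check bits, per address.
const expectedCheckBits = HEX_DIGITS * pLetter;

// A random character passes if it is a digit, or a letter in the right case.
const pCharPasses = (1 - pLetter) + pLetter * 0.5; // 13/16

// All 40 characters must pass for a bad address to slip through.
const pAddressPasses = Math.pow(pCharPasses, HEX_DIGITS);

console.log(expectedCheckBits);                         // 15
console.log((pAddressPasses * 100).toFixed(4) + '%');   // 0.0247%
```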
Examples:
0xCd2a3d9f938e13Cd947eC05ABC7fe734df8DD826 (the "cow" address)
0x9Ca0e998dF92c5351cEcbBb6Dba82Ac2266f7e0C
0xcB16D0E54450Cdd2368476E762B09D147972b637
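The rule stated in English above can be sketched independently of the hash function. In the illustration below, `applyCaseMap` is a hypothetical name and the bit string `hashBits` is assumed to be supplied by the caller from whatever hash is used; exactly what gets hashed and which bits are read are settled in the linked specification, not here.

```javascript
// Hypothetical helper: uppercase the ith hex digit when the ith bit of the
// caller-supplied bit string is '1'. Digits 0-9 are unaffected by case.
function applyCaseMap(address, hashBits) {
  const body = address.replace(/^0x/i, '').toLowerCase();
  let out = '0x';
  for (let i = 0; i < body.length; i++) {
    out += hashBits[i] === '1' ? body[i].toUpperCase() : body[i];
  }
  return out;
}

console.log(applyCaseMap('0xab12cd', '101010')); // 0xAb12Cd
```

No letter test is needed because uppercasing a decimal digit leaves it unchanged, as noted elsewhere in the thread.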