Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce concat or bytes.concat #10903

Closed
axic opened this issue Feb 5, 2021 · 11 comments
Closed

Introduce concat or bytes.concat #10903

axic opened this issue Feb 5, 2021 · 11 comments
Assignees
Labels
language design :rage4: Any changes to the language, e.g. new features

Comments

@axic
Copy link
Member

axic commented Feb 5, 2021

This came up frequently and has been recently discussed under #9829. One of the uses of abi.encodePacked is for it to perform concatenation.

Many EIPs use snippets to explain the rules and many libraries rely on abi.encodePacked. The first impression is that they actually rely on the weird "encoding" it performs, but a closer look reveals that in many cases they only rely on the concatenation feature. In many cases the concatenation is used to add a prefix prescribed by EIP-191. An example for actual data encoding is EIP-712.

While it is a bit unfortunate that standards (aka EIPs) rely on Solidity snippets, it is something we likely can not change, but we can improve the language so that confusion is also minimised in those standards. Having concat instead of encodePacked I think would go towards that. (I believe this confusion lead to libraries adding support for "encodePacked".)

I propose to introduce concat which takes a variable number of bytes memory as an input and returns a single bytes memory.

Question A: should it also allow fixed bytes (i.e. bytesNN) as an inputs, which are tightly packed and not zero padded? Or should we in parallel finally implement conversion between fixed and variable bytes types?

Question B: should we start adding functions as members of types, i.e. bytes.concat()? If yes, is that a breaking change? I guess it is, because libraries could extend those types already.

@axic axic added the language design :rage4: Any changes to the language, e.g. new features label Feb 5, 2021
@cameel cameel added the feature label Feb 6, 2021
@chriseth
Copy link
Contributor

chriseth commented Feb 9, 2021

I don't think a global concat would be nice. Maybe we should start introducing "built-in" files that you can the use as

import { concat } from utils;

(note that there are no quotes around utils)

@hrkrshnn
Copy link
Member

hrkrshnn commented Feb 10, 2021

Does concat really have to take a variable number of bytes memory? I think we should just make it have two arguments. This would allow us to move it into the "standard library" in the future.

@ekpyron
Copy link
Member

ekpyron commented Feb 10, 2021

Does concat really have to take a variable number of bytes memory? I think we should just make it have two arguments. This would allow us to move it into the "standard library" in the future.

Could the optimizer deal with that properly (now or ever) :-)? A concat with a variable number of arguments can first sum up all length and allocate a huge chunk of result memory once. Iterated concats with two arguments would allocate a whole bunch of temporary arrays in between, but all of them will be required until the last one was created. And the size of the last one is unknown until you created all of the ones up front - that seems like a pretty hard problem to reduce to a single allocation in the end.
Apart from that we should have variadic templates, then we could move it to the standard library either way ;-).

@hrkrshnn
Copy link
Member

The uses for abi.encodePacked according to https://github.com/search?p=1&q=abi.encodepacked&type=Code is surprisingly mostly for a single argument!

@chriseth
Copy link
Contributor

What about defining the + operator?

@axic
Copy link
Member Author

axic commented Feb 10, 2021

Could also consider other operators for concatenation, such as ... (I just realised it is not used frequently, but in Lua, which I used for years 😬)

Also see: https://en.wikipedia.org/wiki/Concatenation

@axic
Copy link
Member Author

axic commented Feb 10, 2021

Other operators were suggested:

  • ++
  • ~
  • ||

@axic
Copy link
Member Author

axic commented Feb 10, 2021

Decision from call: bytes.concat with variadic arguments supporting fixed and variable length bytes.

We also discussed the option to have bytesNN.concat, but deferred decision. One use case people have is keccak hashing with encodePacked, because that has a memory wasting behaviour. There having bytesNN.concat would be beneficial. We concluded that we should attempt to improve memory handing for the keccak256(bytes.concat()) (see #9046) case before coming back to bytesNN.concat.

@chriseth
Copy link
Contributor

@mijovic what is still left to do here?

@mijovic
Copy link
Contributor

mijovic commented Apr 14, 2021

I think I didn't close the issue because there was still an open point about allowing bytesNN as an argument of bytes.concat(). If we do not want to allow that, I think there is nothing else to be done here.
Also, returning fixed bytes type, for example bytes32.concat() but probably after finishing bytes->bytesNN (#11047) it is not relevant as we can convert it easily.

@chriseth
Copy link
Contributor

I would say this is done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language design :rage4: Any changes to the language, e.g. new features
Projects
None yet
Development

No branches or pull requests

6 participants