-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Naive tuple support #1147
Naive tuple support #1147
Conversation
Note that at this time there are three separate places that "expand" tuples. There feels like a lot of duplication between them, but they are really three separate cases. One expands just tuple types, another just tuple values, and the third expands both types and values. Another expansion will be needed for return value decoding, which I expect will re-use the types&values logic. |
Here is the error output given when running with the current code: https://gist.github.com/feuGeneA/68e714f2cd942f4cfef094b26459b912 |
web3/_utils/abi.py
Outdated
if isinstance(arguments[i], dict): | ||
component_name = function_abi['inputs'][i]['components'][j]['name'] | ||
tuple_value += (arguments[i][component_name],) | ||
elif isinstance(arguments[i], tuple): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experiments, this function is called once to follow through this is...tuple
branch, and another through the parallel is...dict
branch.
1cdf328
to
fa73818
Compare
At this point this is fully functional. In addition to the CI runs being clean, my own (external) test case of calling into the contract I'm trying to support is also working. (Though, I should say, it's working only when running with my eth-utils PR#141 in place.) I would appreciate an initial review before I spend time polishing and whatnot. @carver ? @pipermerriam ? Thank you! 🙏 |
I'm not sure why the most recent CI run failed the |
@feuGeneA I restarted the test, we'll see if it goes green. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from an incomplete review I did a few days back and never committed.
@davesque is this something you could review? |
web3/_utils/abi.py
Outdated
@@ -163,6 +193,23 @@ def is_encodable(_type, value): | |||
if not isinstance(_type, str): | |||
raise ValueError("is_encodable only accepts type strings") | |||
|
|||
if _type[0] == "(": # it's a tuple. check encodability of each component |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's possible that the collapsing function could yield a nested tuple. This is possible in the ABI encoding format, but I don't know if the ABI JSON can currently represent that. In any case, with a type like (bool,(bool,bool))
, the code in this section will fail i.e. the components
variable will not contain the components of the tuple. I'll have to take a closer look at this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is possible that the collapsing function could yield a nested tuple. See demonstration in the tests I just added. And you're right, is_encodable()
does indeed fail in that case. Let me see if I can't scheme up a reasonably elegant way to handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davesque I believe I've addressed your concern in my latest commit, thank you for raising it. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davesque would you mind taking another look? I realize we're still deliberating strategy in #829 , but I thought it would be good to get your feedback on this while it's still relatively fresh in our minds.
For quick reference: That test showing that collapse_if_tuple()
can indeed yield a nested tuple is here. And the change to support nested tuples in is_encodable()
, the code in question in this thread, is here.
@feuGeneA looking over this now! |
@feuGeneA From first glance this looks good to me. Do you think it would make sense to move these helper methods to eth-abi and the logic from is_encodable into the default ABICodec as well? |
@stefanmendoza thank you for looking. I'm good with moving those helper methods to wherever it makes sense to have them. If/when there's a specific use for them in eth-abi (or its clients) then sure, let's do that. And yeah, if ABICodec will provide an For me, the big question is this: can we accept what I've got here and then later build out the registry stuff on top of this? I have clients that have been itching for these changes for several weeks, and I'd really like to see these changes landed so I can make them happy sooner rather than later. 😄 Are you amenable to an incremental approach? If there are any further changes I could make here that would facilitate such an approach, please ask! |
@feuGeneA I'm sure that'd be fine... I'll have to look into this more this evening to see how we might combine my eth-abi PR and this without having to do a lot of refactoring in the subsequent PRs. I'm still trying to get my bearings with this repo (haven't really played around in web3.py much). I'd like some input from @davesque on this since he has context to what the eth-abi PR was and how they may be compatible. |
I totally understand the sense of urgency in serving client, I've been there. @feuGeneA Please understand that from a maintainers perspective, we have to consider all the future work that every new merge might cause. If we rush to merge something that we expect still needs refactoring, it directly reduces the progress we can make on other great features and projects in the space. Even if we don't have to code the refactor, reviewing also takes time. I'd like to see the outcome of this before we merge:
Of course, you're welcome to take a look at @stefanmendoza 's changes to do the same. Thanks everyone! I know contributing to open source projects takes way more effort than just forking off locally to get something done for yourself. We appreciate that work, and the improvements! |
web3/_utils/abi.py
Outdated
components_csv = tuple_type[1:-1] # strip off expected outer parentheses | ||
# consider case of inner parentheses: | ||
open_paren_index = components_csv.find("(") | ||
close_paren_index = components_csv.rfind(")") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@feuGeneA I was looking at this and thought there might be a logic issue here...
>>> from web3._utils import abi
>>> abi.get_tuple_component_types('(int,int)')
['int', 'int']
# Correct
>>> abi.get_tuple_component_types('(int,(address,address),(address,address),int)')
['int', '(address,address),(address,address)', 'int']
# Should be - ['int', '(address,address)', '(address,address)', 'int']
>>> abi.get_tuple_component_types('((address,address),int,(address,address),int,(address,address))')
['(address,address),int,(address,address),int,(address,address)']
# Should be - ['(address,address)', 'int', '(address,address)', 'int', '(address,address)']
>>> abi.get_tuple_component_types('(int,(address,(int,int)),int)')
['int', '(address,(int,int))', 'int']
# Correct
I think this is running into an issue due to the reverse lookup of the closing paren. This logic should fix that:
def get_tuple_component_types(tuple_type):
components = tuple_type[1:-1]
if '(' not in components and ')' not in components:
# There were no tuples in the list, so we can just split on commas
return components.split(',')
split_components = re.split('([\.^(^)])', components)
reduced_components = list(filter(lambda val: val not in ['', ',', '(', ')'], split_components))
clean_components = list(map(lambda val: val.strip(','), reduced_components))
return list(map(lambda maybeTup: '(' + maybeTup + ')' if ',' in maybeTup else maybeTup, clean_components))
This works on the following cases:
(int,int) # Tuple with no inner tuples
['int', 'int']
(int,(address,address),(address,address),int) # Tuple with consecutive inner tuples
['int', '(address,address)', '(address,address)', 'int']
((address,address),int,(address,address),int,(address,address)) # Tuple with tuples as the first and last types
['(address,address)', 'int', '(address,address)', 'int', '(address,address)']
this still falls short in the event that we have inner nested tuples:
(int,(address,(int,int)),int) # Tuple with nested inner tuple
['int', 'address', '(int,int)', 'int']
and may be inefficient due to the creation of so many lists, but I think it's a good starting point / a step closer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I kept looking into this and the only way I could think of iterating over this while only considering "top-level" tuples as components is to use a queue:
def get_tuple_component_types(tuple_type):
component_list = []
components = tuple_type[1:-1].split(',')
components_to_skip = 0
for i in range(0, len(components)):
component = components[i]
# Skip this component because it was included as part of an inner tuple
if components_to_skip != 0:
components_to_skip = components_to_skip - 1
continue
if '(' not in component and ')' not in component:
# This tuple component is not part of an inner tuple
component_list.append(component)
else:
if '(' and ')' in component:
# This is a singular tuple
component_list.append(component)
else:
# We've encountered a top-level tuple
tuple_components = [component]
paren_queue = _parse_tuple_boundaries(component, [])
# We start iterating through the components to build the inner tuple
components_index = i
while paren_queue != []:
components_index = components_index + 1
component = components[components_index]
paren_queue = _parse_tuple_boundaries(component, paren_queue)
# The current component is part of an inner tuple, so we'll skip
# it while iterating through the outer component list
components_to_skip = components_to_skip + 1
tuple_components.append(component)
# We've iterated through the entire inner tuple, so reconstruct it
# and add it as a top-level component
component_list.append(','.join(tuple_components))
return component_list
def _parse_tuple_boundaries(component, initial_queue):
for char in component:
if char == '(':
initial_queue.append(char)
elif char == ')':
initial_queue.pop()
return initial_queue
This succeeds with the cases mentioned above as well as the existing tests in this commit:
>>> from web3._utils import abi
>>> tuple1 = '(int,int)'
>>> abi.get_tuple_component_types(tuple1)
['int', 'int']
>>> tuple2 = '(int,(address,address),(address,address),int)'
>>> abi.get_tuple_component_types(tuple2)
['int', '(address,address)', '(address,address)', 'int']
>>> tuple3 = '((address,address),int,(address,address),int,(address,address))'
>>> abi.get_tuple_component_types(tuple3)
['(address,address)', 'int', '(address,address)', 'int', '(address,address)']
>>> tuple4 = '(int,(address,(int,int)),int)'
>>> abi.get_tuple_component_types(tuple4)
['int', '(address,(int,int))', 'int']
>>> tuple5 = '(((address,address),int,int),int)'
>>> abi.get_tuple_component_types(tuple5)
['((address,address),int,int)', 'int']
And performance metrics:
$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('(int,int)')"
200000 loops, best of 5: 1.37 usec per loop
$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('(int,(address,address),(address,address),int)')"
20000 loops, best of 5: 7.44 usec per loop
$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('((address,address),int,(address,address),int,(address,address))')"
20000 loops, best of 5: 10.1 usec per loop
$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('(int,(address,(int,int)),int)')"
50000 loops, best of 5: 5.22 usec per loop
$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('(((address,address),int,int),int)')"
50000 loops, best of 5: 6.15 usec per loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stefanmendoza Thank you for the scrutiny! I've re-written get_tuple_component_types()
, and it satisfies all of the test cases you presented (all of which I wrote into actual test cases), and I think it's pretty intuitive and efficient. Please take a look and tell me what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make sense to leverage the parsing facilities in eth-abi. They use a properly defined grammar. You could do something like this:
In [1]: from eth_abi.grammar import parse
In [2]: tt = parse('(int,(address,(int,int)),int)')
In [3]: tt
Out[3]: <TupleType '(int,(address,(int,int)),int)'>
In [4]: tt.components
Out[4]: (<BasicType 'int'>, <TupleType '(address,(int,int))'>, <BasicType 'int'>)
In [5]: [str(t) for t in tt.components]
Out[5]: ['int', '(address,(int,int))', 'int']
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davesque welp... That's a lot simpler than what either of us were doing lol
>>> tuple1
'(int,int)'
>>> [str(t) for t in parse(tuple1).components]
['int', 'int']
>>> tuple2
'(int,(address,address),(address,address),int)'
>>> [str(t) for t in parse(tuple2).components]
['int', '(address,address)', '(address,address)', 'int']
>>> tuple3
'((address,address),int,(address,address),int,(address,address))'
>>> [str(t) for t in parse(tuple3).components]
['(address,address)', 'int', '(address,address)', 'int', '(address,address)']
>>> tuple4
'(int,(address,(int,int)),int)'
>>> [str(t) for t in parse(tuple4).components]
['int', '(address,(int,int))', 'int']
>>> tuple5
'(((address,address),int,int),int)'
>>> [str(t) for t in parse(tuple5).components]
['((address,address),int,int)', 'int']
I need to do a better job of digging through the helper libraries / repos for web3 to see if there are already functions implemented to do what I'm trying to do :P
Thanks, David!
@feuGeneA this method worked with the tests:
def get_tuple_component_types(tuple_type):
"""Return a list of the types of the components of the given tuple.
Expects string representation of the tuple, a parenthesized,
comma-separated list of the types of the tuple's components. Returns a
list representation of those component types.
>>> get_tuple_component_types('(uint256,uint256)')
['uint256', 'uint256']
Properly handles the case where some component types are themselves tuples.
>>> get_tuple_component_types('(uint256,(address,address),uint256)')
['uint256', '(address,address)', 'uint256']
"""
return [str(component) for component in parse(tuple_type).components]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @davesque ! Much simpler indeed!
@stefanmendoza I just removed the method entirely, since there was only one caller. And I migrated our test cases over to tests/core/utilities/test_abi_is_encodable.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@feuGeneA @stefanmendoza No problem! I'm still sort of on holiday but am starting to get back in the groove of things during this week. I think there might be some other places where the functionality in eth-abi could be used. Part of my plan for merging this PR is to go through and make sure there aren't still any places where we could be doing that.
4e0827a
to
5221587
Compare
@feuGeneA Thanks for all your work on this. Is it possible to squash some of the commits in this branch? Also, if any of the resulting squashed commits correspond to fixes for a particular error, can you omit the error message since it takes up quite a bit of space in the log? I'm finding it a bit hard to read and follow the commit history. |
Yes indeed. I usually am quite meticulous about making a nice commit
history, but the hack-first, polish-later approach that I took here didn't
facilitate that. And I love to squash, but never after review, but since
you asked I'll be happy to clean it up. Thank you for taking the time to
review!
…On Tue, Jan 1, 2019, 7:56 PM David Sanders ***@***.*** wrote:
@feuGeneA <https://github.com/feuGeneA> Thanks for all your work on this.
Is it possible to squash some of the commits in this branch? Also, if any
of the resulting squashed commits correspond to fixes for a particular
error, can you omit the error message since it takes up quite a bit of
space in the log? I'm finding it a bit hard to read and follow the commit
history.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1147 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AHhMAVCMzdeJZt5KxNovf-LOFMYfTzbSks5u_AOrgaJpZM4Yz1CA>
.
|
@feuGeneA Thanks for that, Eugene! |
25fa7b0
to
218fa4f
Compare
Naive in the sense that there's no support for value munging (eg translating an ENS name into its associated address) for values within tuples. Changes to _utils.abi's * is_encodable(), * get_abi_inputs(), and * check_if_arguments_can_be_encoded() are all tested via tests.core.contracts.contract_util_functions.test_find_matching_fn_abi(), since find_matching_fn_abi() uses check_if_arguments_can_be_encoded(), which in turn uses get_abi_inputs() and is_encodable(). Also, get_abi_inputs() has its own test_get_abi_inputs() in that same test module. Finally, check_if_arguments_can_be_encoded() is also exercised by the new test_filter_by_encodability(), which was added for consistency with the other test_filter_by_* tests. Changes to _utils.abi.data_tree_map() are tested via new function tests.core.utilities.test_abi.test_data_tree_map(). Changes to _utils.abi.abi_sub_tree() are tested via a new test case on existing tests.core.utilities.test_map_abi_data(), since map_abi_data() calls abi_data_tree(), which in turn calls abi_sub_tree(). Changes to _utils' * _utils.abi.get_abi_input_types(), * _utils.abi.get_abi_output_types(), and * _utils.contracts.get_function_info() are currently not covered by any tests.
Includes fixes for previously broken doctests.
Renamed some tests/core/utilities/test_abi_filter*.py modules so that their names are all consistent. Also, in test_abi_filter_by_argument_name.py, renamed tests.core.utilities.test_abi_filter_by_name.test_filter_by_arguments_1() to ...test_filter_by_argument_name(), to match the function under test. And, in test_abi_filter_by_name.py, renamed test_filter_by_arguments() to test_filter_by_name(), again to match the function under test.
218fa4f
to
2b8d47e
Compare
Okay @davesque , I've gotten this to a place I'm happy with. In the interest of having commits that are completely functional and independently verifiable, nearly all of the changes ended up in a single commit (though I was able to tease a few things out). Let me know what you think, and in particular let me know if there's anything else I can do to facilitate. Thanks again! |
baaf104
to
a2e0583
Compare
a2e0583
to
3549fde
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a full review, I want to make sure recursive cases are handled first.
@pytest.mark.parametrize( | ||
'types, data, funcs, expected', | ||
[ | ||
( # like web3._utils.rpc_abi.RPC_ABIS['eth_getCode'] | ||
['address', None], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has come up a few times, so let's all be on the lookout for it: recursive structures must be handled. For one example, what happens when we try to run a map function over ABI data that has an array inside a tuple?
Something like:
(
['(string,address[])'],
[(b'a string', [b'\xf2\xe2F\xbbv\xdf\x87l\xef\x8b8\xae\x84\x13\x0fOU\xde9['],
[addresses_checksummed, abi_string_to_text],
[('a string', ['0xF2E246BB76DF876Cef8b38ae84130F4F55De395b'])],
),
data_type[0] == "(" and | ||
isinstance(data_value, tuple) | ||
): | ||
return ABITypedData([data_type, data_value]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note below how the array-type ABI values have abi_sub_tree
applied recursively? I was expecting something like that for a tuple. Otherwise, I suspect data_tree_map
will fail to apply to the types within the tuple (or arrays within the tuple).
@feuGeneA @stefanmendoza Thanks for all your work on this guys! We'll take things from here to do the work of finalizing things and merging them. |
1510980
to
bf0a2fd
Compare
This is in as of #1235 🎉 Thanks everyone! |
What was wrong?
Tuples are not supported in contract interfaces (Issue #829). This PR offers a partial solution to that problem.
I've got a very specific use case that I need to get working. It's a call to an existing contract method, which both accepts and returns a
struct
, which of course is encoded as atuple
in the ABI.Thankfully, since my requirements don't demand support for ENS names or any other fancy string handling in the
struct
components, I'm taking @pipermerriam 's comment as license to make just the changes I need in order to get my use case working. My understanding is that the use ofeth_abi
's registry will be used for that, but that's a bigger task than I need done, so I'm leaving it for someone else to implement 😄The problem is that, while the current code just blindly copies the type names from the ABI and passes them to
eth_abi
, the type name'tuple'
isn't supported byeth_abi
, which raises an exceptionValueError: No matching entries for 'tuple' in encoder registry
. Instead,eth_abi
expects you to describe the tuple; that is, indicate the type not just as'tuple'
but rather as a parenthesized list of the types of the tuple components. So if you had a function that accepts a Soliditystruct { uint256 a; uint256 b; }
as one of its arguments, then you would need to useeth_abi
likeencode_abi('(uint256, uint256)', [0, 1])
How was it fixed?
My solution is simple: Before calling
eth_abi
methods, look in the ABI for arguments of typetuple
, and if found walk through them to compose type and argument lists aseth_abi
expects them.(By "naive" I mean that my solution does not try to solve the problem of processing, for example, ENS names in address fields of a struct, or any other fancy stuff that would need to be done for the fields of the struct.)
is_encodable
and other preparation logic.encode_abi
and other supporting logicCute Animal Picture