Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Naive tuple support #1147

Closed

Conversation

feuGeneA
Copy link
Contributor

@feuGeneA feuGeneA commented Nov 26, 2018

What was wrong?

Tuples are not supported in contract interfaces (Issue #829). This PR offers a partial solution to that problem.

I've got a very specific use case that I need to get working. It's a call to an existing contract method, which both accepts and returns a struct, which of course is encoded as a tuple in the ABI.

Thankfully, since my requirements don't demand support for ENS names or any other fancy string handling in the struct components, I'm taking @pipermerriam 's comment as license to make just the changes I need in order to get my use case working. My understanding is that the use of eth_abi's registry will be used for that, but that's a bigger task than I need done, so I'm leaving it for someone else to implement 😄

The problem is that, while the current code just blindly copies the type names from the ABI and passes them to eth_abi, the type name 'tuple' isn't supported by eth_abi, which raises an exception ValueError: No matching entries for 'tuple' in encoder registry. Instead, eth_abi expects you to describe the tuple; that is, indicate the type not just as 'tuple' but rather as a parenthesized list of the types of the tuple components. So if you had a function that accepts a Solidity struct { uint256 a; uint256 b; } as one of its arguments, then you would need to use eth_abi like encode_abi('(uint256, uint256)', [0, 1])

How was it fixed?

My solution is simple: Before calling eth_abi methods, look in the ABI for arguments of type tuple, and if found walk through them to compose type and argument lists as eth_abi expects them.

(By "naive" I mean that my solution does not try to solve the problem of processing, for example, ENS names in address fields of a struct, or any other fancy stuff that would need to be done for the fields of the struct.)

  • Support tuples in is_encodable and other preparation logic.
  • Support tuples in encode_abi and other supporting logic
  • Support tuples in decoding of value returned by contract method
  • Tests! Run existing and write new!
  • Documentation?

Cute Animal Picture

Put a link to a cute animal picture inside the parenthesis-->

@feuGeneA
Copy link
Contributor Author

Note that at this time there are three separate places that "expand" tuples. There feels like a lot of duplication between them, but they are really three separate cases. One expands just tuple types, another just tuple values, and the third expands both types and values. Another expansion will be needed for return value decoding, which I expect will re-use the types&values logic.

@feuGeneA
Copy link
Contributor Author

Here is the error output given when running with the current code: https://gist.github.com/feuGeneA/68e714f2cd942f4cfef094b26459b912

if isinstance(arguments[i], dict):
component_name = function_abi['inputs'][i]['components'][j]['name']
tuple_value += (arguments[i][component_name],)
elif isinstance(arguments[i], tuple):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experiments, this function is called once to follow through this is...tuple branch, and another through the parallel is...dict branch.

@feuGeneA
Copy link
Contributor Author

At this point this is fully functional. In addition to the CI runs being clean, my own (external) test case of calling into the contract I'm trying to support is also working. (Though, I should say, it's working only when running with my eth-utils PR#141 in place.)

I would appreciate an initial review before I spend time polishing and whatnot. @carver ? @pipermerriam ? Thank you! 🙏

@feuGeneA feuGeneA changed the title [WiP] Naive tuple support Naive tuple support Dec 4, 2018
@feuGeneA feuGeneA mentioned this pull request Dec 4, 2018
@feuGeneA
Copy link
Contributor Author

feuGeneA commented Dec 4, 2018

I'm not sure why the most recent CI run failed the py36-integration-parity-ws check, on the install parity if needed step, but it seems unrelated to my changes, since all checks passed on the previous commit, and only a docstring was changed in the last commit.

@pipermerriam
Copy link
Member

@feuGeneA I restarted the test, we'll see if it goes green.

Copy link
Member

@pipermerriam pipermerriam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from an incomplete review I did a few days back and never committed.

web3/_utils/abi.py Outdated Show resolved Hide resolved
web3/_utils/abi.py Outdated Show resolved Hide resolved
@pipermerriam
Copy link
Member

@davesque is this something you could review?

@@ -163,6 +193,23 @@ def is_encodable(_type, value):
if not isinstance(_type, str):
raise ValueError("is_encodable only accepts type strings")

if _type[0] == "(": # it's a tuple. check encodability of each component
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's possible that the collapsing function could yield a nested tuple. This is possible in the ABI encoding format, but I don't know if the ABI JSON can currently represent that. In any case, with a type like (bool,(bool,bool)), the code in this section will fail i.e. the components variable will not contain the components of the tuple. I'll have to take a closer look at this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is possible that the collapsing function could yield a nested tuple. See demonstration in the tests I just added. And you're right, is_encodable() does indeed fail in that case. Let me see if I can't scheme up a reasonably elegant way to handle it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davesque I believe I've addressed your concern in my latest commit, thank you for raising it. Let me know what you think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davesque would you mind taking another look? I realize we're still deliberating strategy in #829 , but I thought it would be good to get your feedback on this while it's still relatively fresh in our minds.

For quick reference: That test showing that collapse_if_tuple() can indeed yield a nested tuple is here. And the change to support nested tuples in is_encodable(), the code in question in this thread, is here.

@stefanmendoza
Copy link
Contributor

@feuGeneA looking over this now!

@stefanmendoza
Copy link
Contributor

@feuGeneA From first glance this looks good to me. Do you think it would make sense to move these helper methods to eth-abi and the logic from is_encodable into the default ABICodec as well?

@feuGeneA
Copy link
Contributor Author

@stefanmendoza thank you for looking.

I'm good with moving those helper methods to wherever it makes sense to have them. If/when there's a specific use for them in eth-abi (or its clients) then sure, let's do that.

And yeah, if ABICodec will provide an is_encodable() method, then I think we'll need to have that logic in there.

For me, the big question is this: can we accept what I've got here and then later build out the registry stuff on top of this?

I have clients that have been itching for these changes for several weeks, and I'd really like to see these changes landed so I can make them happy sooner rather than later. 😄 Are you amenable to an incremental approach?

If there are any further changes I could make here that would facilitate such an approach, please ask!

@stefanmendoza
Copy link
Contributor

@feuGeneA I'm sure that'd be fine... I'll have to look into this more this evening to see how we might combine my eth-abi PR and this without having to do a lot of refactoring in the subsequent PRs. I'm still trying to get my bearings with this repo (haven't really played around in web3.py much). I'd like some input from @davesque on this since he has context to what the eth-abi PR was and how they may be compatible.

@carver
Copy link
Collaborator

carver commented Dec 11, 2018

I have clients that have been itching for these changes for several weeks, and I'd really like to see these changes landed so I can make them happy sooner rather than later.

I totally understand the sense of urgency in serving client, I've been there. @feuGeneA

Please understand that from a maintainers perspective, we have to consider all the future work that every new merge might cause. If we rush to merge something that we expect still needs refactoring, it directly reduces the progress we can make on other great features and projects in the space. Even if we don't have to code the refactor, reviewing also takes time.

I'd like to see the outcome of this before we merge:

I'll have to look into this more this evening to see how we might combine my eth-abi PR and this without having to do a lot of refactoring in the subsequent PRs.

Of course, you're welcome to take a look at @stefanmendoza 's changes to do the same.


Thanks everyone! I know contributing to open source projects takes way more effort than just forking off locally to get something done for yourself. We appreciate that work, and the improvements!

components_csv = tuple_type[1:-1] # strip off expected outer parentheses
# consider case of inner parentheses:
open_paren_index = components_csv.find("(")
close_paren_index = components_csv.rfind(")")
Copy link
Contributor

@stefanmendoza stefanmendoza Dec 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@feuGeneA I was looking at this and thought there might be a logic issue here...

>>> from web3._utils import abi
>>> abi.get_tuple_component_types('(int,int)')
['int', 'int']
# Correct
>>> abi.get_tuple_component_types('(int,(address,address),(address,address),int)')
['int', '(address,address),(address,address)', 'int']
# Should be - ['int', '(address,address)', '(address,address)', 'int']
>>> abi.get_tuple_component_types('((address,address),int,(address,address),int,(address,address))')
['(address,address),int,(address,address),int,(address,address)']
# Should be - ['(address,address)', 'int', '(address,address)', 'int', '(address,address)']
>>> abi.get_tuple_component_types('(int,(address,(int,int)),int)')
['int', '(address,(int,int))', 'int']
# Correct

I think this is running into an issue due to the reverse lookup of the closing paren. This logic should fix that:

def get_tuple_component_types(tuple_type):
    components = tuple_type[1:-1]
    
    if '(' not in components and ')' not in components:
        # There were no tuples in the list, so we can just split on commas
        return components.split(',')
    
    split_components = re.split('([\.^(^)])', components)    
    reduced_components = list(filter(lambda val: val not in ['', ',', '(', ')'], split_components))
    clean_components = list(map(lambda val: val.strip(','), reduced_components))
    return list(map(lambda maybeTup: '(' + maybeTup + ')' if ',' in maybeTup else maybeTup, clean_components)) 

This works on the following cases:

(int,int) # Tuple with no inner tuples
['int', 'int']

(int,(address,address),(address,address),int) # Tuple with consecutive inner tuples
['int', '(address,address)', '(address,address)', 'int']

((address,address),int,(address,address),int,(address,address)) # Tuple with tuples as the first and last types
['(address,address)', 'int', '(address,address)', 'int', '(address,address)']

this still falls short in the event that we have inner nested tuples:

(int,(address,(int,int)),int) # Tuple with nested inner tuple
['int', 'address', '(int,int)', 'int']

and may be inefficient due to the creation of so many lists, but I think it's a good starting point / a step closer.

cc @davesque / @carver

Copy link
Contributor

@stefanmendoza stefanmendoza Dec 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I kept looking into this and the only way I could think of iterating over this while only considering "top-level" tuples as components is to use a queue:

def get_tuple_component_types(tuple_type):
    component_list = []
    components = tuple_type[1:-1].split(',')

    components_to_skip = 0
    for i in range(0, len(components)):
        component = components[i]
        
        # Skip this component because it was included as part of an inner tuple
        if components_to_skip != 0:
            components_to_skip = components_to_skip - 1
            continue

        if '(' not in component and ')' not in component:
            # This tuple component is not part of an inner tuple
            component_list.append(component)
        else:
            if '(' and ')' in component:
                # This is a singular tuple
                component_list.append(component)
            else:
                # We've encountered a top-level tuple
                tuple_components = [component]

                paren_queue = _parse_tuple_boundaries(component, [])

                # We start iterating through the components to build the inner tuple
                components_index = i
                while paren_queue != []:
                    components_index = components_index + 1
                    component = components[components_index]

                    paren_queue = _parse_tuple_boundaries(component, paren_queue)

                    # The current component is part of an inner tuple, so we'll skip
                    # it while iterating through the outer component list    
                    components_to_skip = components_to_skip + 1
                    tuple_components.append(component)
                
                # We've iterated through the entire inner tuple, so reconstruct it
                # and add it as a top-level component
                component_list.append(','.join(tuple_components))
    
    return component_list


def _parse_tuple_boundaries(component, initial_queue):
    for char in component:
        if char == '(':
            initial_queue.append(char)
        elif char == ')':
            initial_queue.pop()

    return initial_queue

This succeeds with the cases mentioned above as well as the existing tests in this commit:

>>> from web3._utils import abi
>>> tuple1 = '(int,int)'
>>> abi.get_tuple_component_types(tuple1)
['int', 'int']
>>> tuple2 = '(int,(address,address),(address,address),int)'
>>> abi.get_tuple_component_types(tuple2)
['int', '(address,address)', '(address,address)', 'int']
>>> tuple3 = '((address,address),int,(address,address),int,(address,address))'
>>> abi.get_tuple_component_types(tuple3)
['(address,address)', 'int', '(address,address)', 'int', '(address,address)']
>>> tuple4 = '(int,(address,(int,int)),int)'
>>> abi.get_tuple_component_types(tuple4)
['int', '(address,(int,int))', 'int']
>>> tuple5 = '(((address,address),int,int),int)'
>>> abi.get_tuple_component_types(tuple5)
['((address,address),int,int)', 'int']

And performance metrics:

$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('(int,int)')"
200000 loops, best of 5: 1.37 usec per loop

$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('(int,(address,address),(address,address),int)')"
20000 loops, best of 5: 7.44 usec per loop

$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('((address,address),int,(address,address),int,(address,address))')"
20000 loops, best of 5: 10.1 usec per loop

$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('(int,(address,(int,int)),int)')"
50000 loops, best of 5: 5.22 usec per loop

$ python3 -m timeit -s "from web3._utils import abi" "abi.get_tuple_component_types('(((address,address),int,int),int)')"
50000 loops, best of 5: 6.15 usec per loop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stefanmendoza Thank you for the scrutiny! I've re-written get_tuple_component_types(), and it satisfies all of the test cases you presented (all of which I wrote into actual test cases), and I think it's pretty intuitive and efficient. Please take a look and tell me what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to leverage the parsing facilities in eth-abi. They use a properly defined grammar. You could do something like this:

In [1]: from eth_abi.grammar import parse

In [2]: tt = parse('(int,(address,(int,int)),int)')

In [3]: tt
Out[3]: <TupleType '(int,(address,(int,int)),int)'>

In [4]: tt.components
Out[4]: (<BasicType 'int'>, <TupleType '(address,(int,int))'>, <BasicType 'int'>)

In [5]: [str(t) for t in tt.components]
Out[5]: ['int', '(address,(int,int))', 'int']

Copy link
Contributor

@stefanmendoza stefanmendoza Jan 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davesque welp... That's a lot simpler than what either of us were doing lol

>>> tuple1
'(int,int)'
>>> [str(t) for t in parse(tuple1).components]
['int', 'int']

>>> tuple2
'(int,(address,address),(address,address),int)'
>>> [str(t) for t in parse(tuple2).components]
['int', '(address,address)', '(address,address)', 'int']

>>> tuple3
'((address,address),int,(address,address),int,(address,address))'
>>> [str(t) for t in parse(tuple3).components]
['(address,address)', 'int', '(address,address)', 'int', '(address,address)']

>>> tuple4
'(int,(address,(int,int)),int)'
>>> [str(t) for t in parse(tuple4).components]
['int', '(address,(int,int))', 'int']

>>> tuple5
'(((address,address),int,int),int)'
>>> [str(t) for t in parse(tuple5).components]
['((address,address),int,int)', 'int']

I need to do a better job of digging through the helper libraries / repos for web3 to see if there are already functions implemented to do what I'm trying to do :P

Thanks, David!

@feuGeneA this method worked with the tests:

def get_tuple_component_types(tuple_type):
    """Return a list of the types of the components of the given tuple.

    Expects string representation of the tuple, a parenthesized,
    comma-separated list of the types of the tuple's components.  Returns a
    list representation of those component types.

    >>> get_tuple_component_types('(uint256,uint256)')
    ['uint256', 'uint256']

    Properly handles the case where some component types are themselves tuples.

    >>> get_tuple_component_types('(uint256,(address,address),uint256)')
    ['uint256', '(address,address)', 'uint256']
    """
    return [str(component) for component in parse(tuple_type).components]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @davesque ! Much simpler indeed!

@stefanmendoza I just removed the method entirely, since there was only one caller. And I migrated our test cases over to tests/core/utilities/test_abi_is_encodable.py.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@feuGeneA @stefanmendoza No problem! I'm still sort of on holiday but am starting to get back in the groove of things during this week. I think there might be some other places where the functionality in eth-abi could be used. Part of my plan for merging this PR is to go through and make sure there aren't still any places where we could be doing that.

@feuGeneA feuGeneA force-pushed the feature/naive-tuple-support branch from 4e0827a to 5221587 Compare December 31, 2018 21:09
@davesque
Copy link
Contributor

davesque commented Jan 2, 2019

@feuGeneA Thanks for all your work on this. Is it possible to squash some of the commits in this branch? Also, if any of the resulting squashed commits correspond to fixes for a particular error, can you omit the error message since it takes up quite a bit of space in the log? I'm finding it a bit hard to read and follow the commit history.

@feuGeneA
Copy link
Contributor Author

feuGeneA commented Jan 2, 2019 via email

@davesque
Copy link
Contributor

davesque commented Jan 2, 2019

@feuGeneA Thanks for that, Eugene!

@feuGeneA feuGeneA force-pushed the feature/naive-tuple-support branch from 25fa7b0 to 218fa4f Compare January 11, 2019 03:28
Naive in the sense that there's no support for value munging (eg
translating an ENS name into its associated address) for values within
tuples.

Changes to _utils.abi's
    * is_encodable(),
    * get_abi_inputs(), and
    * check_if_arguments_can_be_encoded()
are all tested via
tests.core.contracts.contract_util_functions.test_find_matching_fn_abi(),
since find_matching_fn_abi() uses check_if_arguments_can_be_encoded(),
which in turn uses get_abi_inputs() and is_encodable().  Also,
get_abi_inputs() has its own test_get_abi_inputs() in that same test
module.  Finally, check_if_arguments_can_be_encoded() is also exercised
by the new test_filter_by_encodability(), which was added for
consistency with the other test_filter_by_* tests.

Changes to _utils.abi.data_tree_map() are tested via new function
tests.core.utilities.test_abi.test_data_tree_map().

Changes to _utils.abi.abi_sub_tree() are tested via a new test case on
existing tests.core.utilities.test_map_abi_data(), since map_abi_data()
calls abi_data_tree(), which in turn calls abi_sub_tree().

Changes to _utils'
    * _utils.abi.get_abi_input_types(),
    * _utils.abi.get_abi_output_types(), and
    * _utils.contracts.get_function_info()
are currently not covered by any tests.
Includes fixes for previously broken doctests.
Renamed some tests/core/utilities/test_abi_filter*.py modules so that
their names are all consistent.

Also, in test_abi_filter_by_argument_name.py, renamed
tests.core.utilities.test_abi_filter_by_name.test_filter_by_arguments_1()
to ...test_filter_by_argument_name(), to match the function under test.

And, in test_abi_filter_by_name.py, renamed test_filter_by_arguments()
to test_filter_by_name(), again to match the function under test.
@feuGeneA feuGeneA force-pushed the feature/naive-tuple-support branch from 218fa4f to 2b8d47e Compare January 11, 2019 03:48
@feuGeneA
Copy link
Contributor Author

Okay @davesque , I've gotten this to a place I'm happy with. In the interest of having commits that are completely functional and independently verifiable, nearly all of the changes ended up in a single commit (though I was able to tease a few things out). Let me know what you think, and in particular let me know if there's anything else I can do to facilitate. Thanks again!

@feuGeneA feuGeneA force-pushed the feature/naive-tuple-support branch from baaf104 to a2e0583 Compare January 11, 2019 16:55
@feuGeneA feuGeneA force-pushed the feature/naive-tuple-support branch from a2e0583 to 3549fde Compare January 11, 2019 17:07
Copy link
Collaborator

@carver carver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a full review, I want to make sure recursive cases are handled first.

@pytest.mark.parametrize(
'types, data, funcs, expected',
[
( # like web3._utils.rpc_abi.RPC_ABIS['eth_getCode']
['address', None],
Copy link
Collaborator

@carver carver Jan 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has come up a few times, so let's all be on the lookout for it: recursive structures must be handled. For one example, what happens when we try to run a map function over ABI data that has an array inside a tuple?

Something like:

        (
            ['(string,address[])'],
            [(b'a string', [b'\xf2\xe2F\xbbv\xdf\x87l\xef\x8b8\xae\x84\x13\x0fOU\xde9['],
            [addresses_checksummed, abi_string_to_text],
            [('a string', ['0xF2E246BB76DF876Cef8b38ae84130F4F55De395b'])],
        ),

data_type[0] == "(" and
isinstance(data_value, tuple)
):
return ABITypedData([data_type, data_value])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note below how the array-type ABI values have abi_sub_tree applied recursively? I was expecting something like that for a tuple. Otherwise, I suspect data_tree_map will fail to apply to the types within the tuple (or arrays within the tuple).

@davesque
Copy link
Contributor

@feuGeneA @stefanmendoza Thanks for all your work on this guys! We'll take things from here to do the work of finalizing things and merging them.

@feuGeneA feuGeneA force-pushed the feature/naive-tuple-support branch from 1510980 to bf0a2fd Compare February 13, 2019 21:41
@kclowes
Copy link
Collaborator

kclowes commented Mar 11, 2019

This is in as of #1235 🎉 Thanks everyone!

@kclowes kclowes closed this Mar 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants