Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: core.computation, mostly typing #29436

Closed
wants to merge 5 commits into from

Conversation

jbrockmendel
Copy link
Member

cc @simonjayhawkins

Largely using annotations as an excuse to put eyeballs on parts of the code that would otherwise be left alone.

@jreback jreback added the Clean label Nov 6, 2019
@jreback jreback added this to the 1.0 milestone Nov 6, 2019
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

@jreback
Copy link
Contributor

jreback commented Nov 6, 2019

can you rebase this; like @simonjayhawkins to eyeball.

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel generally lgtm. There's a couple of things I want to check locally regarding the asserts and the abstract method. I'll do that tomorrow. just one comment for now.

since mypy green and tests passing am happy if you want to merge sooner.

pandas/core/computation/scope.py Outdated Show resolved Hide resolved
# AbstractMethodError instead
from pandas.errors import AbstractMethodError

raise AbstractMethodError(self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy can afford additional checking since it has some understanding of abstract classes. so I'm not 100% happy with this, if it sets a precedent for dealing with Cannot instantiate abstract class 'AbstractEngine' with abstract attribute '_evaluate' style mypy errors.

in this case,

_engines: Dict[str, Type[Union[NumExprEngine, PythonEngine]]] = {
    "numexpr": NumExprEngine,
    "python": PythonEngine,
}

also passes mypy without changes to the code and only type annotations added.

It also means we retain the modern Python functionality for dealing with abstract methods instead of homegrown.

wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also means we retain the modern Python functionality for dealing with abstract methods instead of homegrown.

The homegrown doesn't really bother me since we use it in a lot of places. But annotations is an area where I'm going to defer to you pretty consistently.

_engines: Dict[str, Type[Union[NumExprEngine, PythonEngine]]]

Is this supported in py35? I rebel at this pattern because I always think "mypy should be able to figure this out" (which is only true if this were some kind of frozen_dict, but still).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supported in py35?

#29212 seems to have gone stale. so will need to use the py3.5 syntax

I rebel at this pattern because I always think "mypy should be able to figure this out"

It is required in a lot of places and in this case on my branch I currently have

_engines: Dict[str, Type[AbstractEngine]] = {
    "numexpr": NumExprEngine,
    "python": PythonEngine,
}

but can't remember why it was needed, but it looks like it will need to be updated to account for the abstract method if not done in this PR.

super().__init__(expr)

def convert(self):
def convert(self) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could also remove this method since super().convert returns a str. could this be a remnant of py2 compat?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll double-check. I tried removing all of the encoding kwargs and found that io.pytables still relies on them.

Comment on lines +841 to +842
) # type: Optional[BaseExprVisitor]
assert isinstance(self._visitor, BaseExprVisitor), type(self._visitor)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the type should be Optional or that the type annotation should be added here.

adding type annotations directly to _parsers...

_parsers: Dict[str, Type[BaseExprVisitor]] = {
    "python": PythonExprVisitor,
    "pandas": PandasExprVisitor,
}

negates the need for the assert, but does mean that the type annotations added later in this PR in pandas/core/computation/pytables.py need to be revisted.

@@ -446,6 +447,7 @@ def evaluate(self, env, engine, parser, term_type, eval_in_python):
def convert_values(self):
"""Convert datetimes to a comparable value in an expression.
"""
assert self.encoding is None, self.encoding
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed? The method has code for self.encoding is not None, is this unreachable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is left over from when i tried removing encoding entirely.

Copy link
Member

@simonjayhawkins simonjayhawkins Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on my branch, i've removed some of the *args and **kwargs in the modules touched here and just added encoding directly as a argument in Op.__init__.

self.resolvers = DeepChainMap(*resolvers)
self.temps = {}
self.resolvers = DeepChainMap(*resolvers) # type: DeepChainMap
self.temps = {} # type: Mapping
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mapping -> Dict

and then resolve error: Unsupported operand types for + ("List[Dict[Any, Any]]" and "List[Mapping[Any, Any]]") in full_scope either using return DeepChainMap(self.temps, *self.resolvers.maps, *self.scope.maps) or list.append to create the maps list and then remove assert in add_tmp.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will give this a try

@@ -290,19 +304,20 @@ def add_tmp(self, value):

# add to inner most scope
assert name not in self.temps
assert isinstance(self.temps, dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove. see above

@@ -223,6 +236,7 @@ def swapkey(self, old_key, new_key, new_value=None):
maps.append(self.temps)

for mapping in maps:
assert isinstance(mapping, (DeepChainMap, dict)), type(mapping)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to see this and have the dictionaries making up the DeepChainMap correctly recognised as a MutableMapping but I think that's out of scope here.

IIUC the maps of a normal ChainMap are immutable and writing to a ChainMap stores the new value in the top-level.

@jbrockmendel
Copy link
Member Author

Closing to clear the queue

@simonjayhawkins
Copy link
Member

@jbrockmendel we should in general limit the number of asserts added to silence mypy.

for our established code base with extensive testing, then many mypy errors are false positives and it is frustrating to deal with them.

but in a few cases mypy is highlighting a potential issue where we would just move the failure to the assert by adding it.

The benefit of type checking will come downstream. (promise!) when new code is added, then mypy errors are more likely to pinpoint issues. This will hopefully reduced the amount of time reviewing PRs since these issues will already have been picked up by mypy and hopefully the OP fixed it.

@jbrockmendel jbrockmendel deleted the cln-comp branch November 21, 2019 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants