-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: core.computation, mostly typing #29436
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
can you rebase this; like @simonjayhawkins to eyeball. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel generally lgtm. There's a couple of things I want to check locally regarding the asserts and the abstract method. I'll do that tomorrow. just one comment for now.
since mypy green and tests passing am happy if you want to merge sooner.
# AbstractMethodError instead | ||
from pandas.errors import AbstractMethodError | ||
|
||
raise AbstractMethodError(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy can afford additional checking since it has some understanding of abstract classes. so I'm not 100% happy with this, if it sets a precedent for dealing with Cannot instantiate abstract class 'AbstractEngine' with abstract attribute '_evaluate'
style mypy errors.
in this case,
_engines: Dict[str, Type[Union[NumExprEngine, PythonEngine]]] = {
"numexpr": NumExprEngine,
"python": PythonEngine,
}
also passes mypy without changes to the code and only type annotations added.
It also means we retain the modern Python functionality for dealing with abstract methods instead of homegrown.
wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also means we retain the modern Python functionality for dealing with abstract methods instead of homegrown.
The homegrown doesn't really bother me since we use it in a lot of places. But annotations is an area where I'm going to defer to you pretty consistently.
_engines: Dict[str, Type[Union[NumExprEngine, PythonEngine]]]
Is this supported in py35? I rebel at this pattern because I always think "mypy should be able to figure this out" (which is only true if this were some kind of frozen_dict, but still).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supported in py35?
#29212 seems to have gone stale. so will need to use the py3.5 syntax
I rebel at this pattern because I always think "mypy should be able to figure this out"
It is required in a lot of places and in this case on my branch I currently have
_engines: Dict[str, Type[AbstractEngine]] = {
"numexpr": NumExprEngine,
"python": PythonEngine,
}
but can't remember why it was needed, but it looks like it will need to be updated to account for the abstract method if not done in this PR.
super().__init__(expr) | ||
|
||
def convert(self): | ||
def convert(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could also remove this method since super().convert returns a str. could this be a remnant of py2 compat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll double-check. I tried removing all of the encoding
kwargs and found that io.pytables still relies on them.
) # type: Optional[BaseExprVisitor] | ||
assert isinstance(self._visitor, BaseExprVisitor), type(self._visitor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that the type should be Optional or that the type annotation should be added here.
adding type annotations directly to _parsers...
_parsers: Dict[str, Type[BaseExprVisitor]] = {
"python": PythonExprVisitor,
"pandas": PandasExprVisitor,
}
negates the need for the assert, but does mean that the type annotations added later in this PR in pandas/core/computation/pytables.py need to be revisted.
@@ -446,6 +447,7 @@ def evaluate(self, env, engine, parser, term_type, eval_in_python): | |||
def convert_values(self): | |||
"""Convert datetimes to a comparable value in an expression. | |||
""" | |||
assert self.encoding is None, self.encoding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed? The method has code for self.encoding is not None, is this unreachable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is left over from when i tried removing encoding entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on my branch, i've removed some of the *args and **kwargs in the modules touched here and just added encoding directly as a argument in Op.__init__
.
self.resolvers = DeepChainMap(*resolvers) | ||
self.temps = {} | ||
self.resolvers = DeepChainMap(*resolvers) # type: DeepChainMap | ||
self.temps = {} # type: Mapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mapping -> Dict
and then resolve error: Unsupported operand types for + ("List[Dict[Any, Any]]" and "List[Mapping[Any, Any]]")
in full_scope either using return DeepChainMap(self.temps, *self.resolvers.maps, *self.scope.maps) or list.append to create the maps list and then remove assert in add_tmp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will give this a try
@@ -290,19 +304,20 @@ def add_tmp(self, value): | |||
|
|||
# add to inner most scope | |||
assert name not in self.temps | |||
assert isinstance(self.temps, dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove. see above
@@ -223,6 +236,7 @@ def swapkey(self, old_key, new_key, new_value=None): | |||
maps.append(self.temps) | |||
|
|||
for mapping in maps: | |||
assert isinstance(mapping, (DeepChainMap, dict)), type(mapping) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to see this and have the dictionaries making up the DeepChainMap correctly recognised as a MutableMapping but I think that's out of scope here.
IIUC the maps of a normal ChainMap are immutable and writing to a ChainMap stores the new value in the top-level.
Closing to clear the queue |
@jbrockmendel we should in general limit the number of asserts added to silence mypy. for our established code base with extensive testing, then many mypy errors are false positives and it is frustrating to deal with them. but in a few cases mypy is highlighting a potential issue where we would just move the failure to the assert by adding it. The benefit of type checking will come downstream. (promise!) when new code is added, then mypy errors are more likely to pinpoint issues. This will hopefully reduced the amount of time reviewing PRs since these issues will already have been picked up by mypy and hopefully the OP fixed it. |
cc @simonjayhawkins
Largely using annotations as an excuse to put eyeballs on parts of the code that would otherwise be left alone.