-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
delegate (most) datetimelike Series arithmetic ops to DatetimeIndex #18817
Conversation
Hello @jbrockmendel! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on December 19, 2017 at 16:11 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #18817 +/- ##
=========================================
Coverage ? 91.6%
=========================================
Files ? 154
Lines ? 51452
Branches ? 0
=========================================
Hits ? 47135
Misses ? 4317
Partials ? 0
Continue to review full report at Codecov.
|
@@ -107,7 +107,7 @@ def test_shift(self): | |||
# incompat tz | |||
s2 = Series(date_range('2000-01-01 09:00:00', periods=5, | |||
tz='CET'), name='foo') | |||
pytest.raises(ValueError, lambda: s - s2) | |||
pytest.raises(TypeError, lambda: s - s2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an API change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as in needs to be noted in whatsnew or needs to be avoided?
pandas/core/ops.py
Outdated
return all(isinstance(x, ABCDateOffset) for x in arr_or_obj) | ||
return False | ||
|
||
def _is_offset(arr_or_obj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to move this to pandas.core.dtypes.common and de-privatize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and add tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on all three points.
@@ -664,6 +665,12 @@ def __add__(self, other): | |||
return self.shift(other) | |||
elif isinstance(other, (Index, datetime, np.datetime64)): | |||
return self._add_datelike(other) | |||
elif (isinstance(self, DatetimeIndex) and | |||
isinstance(other, np.ndarray) and other.size == 1 and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh? I believe we already check is_integer_dtype(other). this check is much too specific
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under the status quo adding np.array(1, dtype=np.int64)
to a DatetimeIndex
behaves like adding Timedelta(nanosecond=1)
. Series
raises (and a Series
test fails unless this is caught here).
I'm open to ideas for how to make this less hacky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative would be to -- for the time being -- check for this case in the Series op and avoid dispatching to DatetimeIndex
.
pandas/core/ops.py
Outdated
@@ -715,6 +726,29 @@ def wrapper(left, right, name=name, na_op=na_op): | |||
if isinstance(right, ABCDataFrame): | |||
return NotImplemented | |||
|
|||
if _is_offset(right): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic does not belong here. this is what happens in Timeop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic does not belong here. this is what happens in Timeop.
Yes. This is preventing this case from being dispatched to DatetimeIndex so that it continues to be handled by TimeOp for now. DatetimeIndex makes a mess of this case (try it...), but I decided to fix that separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs some work. let's clear the decks and you can revist
@@ -198,6 +198,7 @@ Other API Changes | |||
- Rearranged the order of keyword arguments in :func:`read_excel()` to align with :func:`read_csv()` (:issue:`16672`) | |||
- :func:`pandas.merge` now raises a ``ValueError`` when trying to merge on incompatible data types (:issue:`9780`) | |||
- :func:`wide_to_long` previously kept numeric-like suffixes as ``object`` dtype. Now they are cast to numeric if possible (:issue:`17627`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 0.23.0
The other arithmetic PRs are a couple steps away from really being ready for this. The only part of this that is separately actionable (and on which I need input) is for where (and I guess if) to fix the mismatched behavior in
This PR addressed that here and here, but got some pushback because it was (admittedly) ugly. Alternative suggestions requested. Closing this PR, will (eventually) open another one with reduced scope. |
DatetimeIndex arithmetic ops do overflow checks that Series does not. Instead of re-implementing those checks (which I tried and its a PITA) this PR just delegates the appropriate operations to the DatetimeIndex implementation.
There are a couple of things that are missing from the DatetimeIndex implementations, e.g. the int-array case added to indexes.datetimelike. If there's consensus that having a single implementation for these ops is the way to go, I'll port the other cases over in their own PRs.
If/when these implementations get merged, ops._TimeOp can be trimmed/removed.
git diff upstream/master -u -- "*.py" | flake8 --diff