-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPR: Deprecate use of strings denoting units with 'M', 'Y' or 'y' in pd.to_timedelta (36666) #36838
Conversation
Do we only want to deprecate variations with a space ( |
pandas/core/tools/timedeltas.py
Outdated
raise ValueError( | ||
"unit must not be specified if the input is/contains a str" | ||
) | ||
elif arg.upper().endswith(" M") or arg.upper().endswith(" Y"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in your other commented, we also want to catch arguments without the space as well e.g. "1Y"
.
Also we need a test for this warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Ah I see some tests are failing. I will update the code. |
fde639e
to
c12eeaa
Compare
All tests related to |
c12eeaa
to
f56796a
Compare
Updated my code and merged master. Everything passes now. |
], | ||
) | ||
def test_unambiguous_timedelta_values(self, val, warning): | ||
with tm.assert_produces_warning(warning): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you but the github issue number here as a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. CI fails due to other tests on master though. Will merge master later once that is solved.
pandas/core/tools/timedeltas.py
Outdated
if unit is not None: | ||
raise ValueError( | ||
"unit must not be specified if the input is/contains a str" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm we already have this check on L103, so something must be going wrong here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you do: pd.to_timedelta("1 Y")
then unit is not specified and by default None. L103 checks only if unit is "Y", "M", 'y' or 'm' (e.g. pd.to_timedelta(1, unit="Y")
), and therefore you never reach the conditional code.
@@ -258,7 +258,9 @@ def test_unit_parser(self, units, np_unit, wrapper): | |||
if unit == "M": | |||
expected = Timedelta(np.timedelta64(2, "m").astype("timedelta64[ns]")) | |||
|
|||
result = to_timedelta(f"2{unit}") | |||
warning = None if unit != "m" else FutureWarning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change this to parameterize this (e.g. elminate the for loop and parameterize on units)
then add a parameter whether should test for FutureWarning or not (could be bool is ok).
then in the body you can easily check this for both to_timedelta
and Timedelta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have approx 50 values that have no warning and just one ("m") that has a warning. IMO a parameter for this would be overkill. But if we want this then I can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the test. IMO test_unit_parser
could be refactored even further: split in 2 tests: test_unit_parser_array
and test_unit_parser_scalar
and apply a parameter on the function that converts its to timedelta. Do we want to do this?
I found out that the following was also possible: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to do the warning inside the string to timedelta conversion code in timedelta.pyx itself;
pandas/core/tools/timedeltas.py
Outdated
@@ -101,6 +110,16 @@ def to_timedelta(arg, unit=None, errors="raise"): | |||
"represent unambiguous timedelta values durations." | |||
) | |||
|
|||
if isinstance(arg, (list, tuple, ABCSeries, ABCIndexClass, np.ndarray)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the warning shouldn't be here
pandas/_libs/tslibs/timedeltas.pyx
Outdated
@@ -1467,6 +1468,16 @@ cdef _broadcast_floordiv_td64( | |||
return res | |||
|
|||
|
|||
def check_unambiguous_timedelta_values(object[:] values): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than this you can do it in the Timedelta string converter
pandas/core/tools/timedeltas.py
Outdated
"unit must not be specified if the input is/contains a str" | ||
) | ||
elif re.search(r"^\d+\s?[M|Y|m|y]$", arg): | ||
warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wouldn't do this here at all
Thanks @jreback , moved the warning to timedelta.pyx and the code is much cleaner now. |
cef33b7
to
cdc5821
Compare
b83228b
to
c76486e
Compare
c76486e
to
0cc6dff
Compare
@jreback implemented all suggestions, could you have a 2nd look? CI is still failing after merging master, but that is related to other issues than introduced in this PR. |
CI failed due to FATAL ERROR: Committing semi space failed. Allocation failed - process out of memory. Can someone please restart it? Thanks! |
why is 'm' (lowercase) m deprecated? this is minutes as in not ambiguous. (agree on Y, y, M) |
My bad, I thought we wanted to deprecate all variations of m (lowercase and uppercase). Will revert it then. |
kk great, otherwise lgtm. ping on green. |
@jreback reverted the depr of lowercase m, merged master and CI on green. |
thanks @avinashpancham very nice! |
… pd.to_timedelta (36666) (pandas-dev#36838)
… pd.to_timedelta (36666) (pandas-dev#36838)
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff