(fix): allow all extension array data types in pandas adapters #1

Open
wants to merge 66 commits into
base: any-time-resolution-2

Conversation

@ilan-gold ilan-gold commented Oct 23, 2024

This probably needs some work, but since pandas datetime handling often goes through extension arrays, I think these two issues are closely linked.

cc: @shoyer @kmuehlbauer
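
For context, here is a minimal sketch (not taken from this PR) of why the two are linked: in pandas, timezone-aware datetimes are stored with `DatetimeTZDtype`, which is an extension dtype, while timezone-naive datetimes use a plain NumPy dtype. The variable names are illustrative only.

```python
import pandas as pd

# Timezone-naive datetimes are backed by a plain NumPy datetime64 dtype.
naive = pd.Series(pd.date_range("2000-01-01", periods=3))
# Timezone-aware datetimes are backed by a pandas extension dtype.
aware = pd.Series(pd.date_range("2000-01-01", periods=3, tz="UTC"))

naive_is_ext = pd.api.types.is_extension_array_dtype(naive.dtype)
aware_is_ext = pd.api.types.is_extension_array_dtype(aware.dtype)

print(naive.dtype, naive_is_ext)
print(aware.dtype, aware_is_ext)
```

Any adapter that special-cases extension arrays therefore also has to decide what happens to timezone-aware datetimes.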

@ilan-gold
Author

It's tough to do this without CI. I will push a few examples of the datetime-related changes so you get a sense of them; in general, things are even cleaner (more datetimes are preserved).

@ilan-gold
Author

I'm not sure we want to keep allowing in-memory pandas datetime handling together with dask (see 59b03f2). Maybe we should start blanket-converting extension arrays before handing them to dask?

@ilan-gold ilan-gold force-pushed the ig/fix_extension_indexer branch from a9c9386 to 7c32bd0 Compare October 24, 2024 07:25
@kmuehlbauer kmuehlbauer reopened this Oct 24, 2024
@kmuehlbauer
Owner

@ilan-gold I tried to activate CI, but it doesn't seem to work. I'll be back at my desk next week, so I won't have time to check this now.

@shoyer

shoyer commented Oct 24, 2024

Can you open this up as a pull request against pydata/xarray?

ilan-gold and others added 16 commits October 25, 2024 08:40
Co-authored-by: Stephan Hoyer <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
Co-authored-by: Spencer Clark <[email protected]>
Co-authored-by: Spencer Clark <[email protected]>
Co-authored-by: Stephan Hoyer <[email protected]>
Co-authored-by: Spencer Clark <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix scalar handling for timedelta based indexer

* remove stale error message and "ignore:Converting non-default" in testsuite

* add per review suggestions

* add/remove todo

* rename timeunit -> format

* return "ns" resolution per default for timedeltas, if not specified

* Be specific on types/dtypes

* add comment

* add suggestions from code review

* fix docs

* fix test which isn't run for numpy2 atm

* add notes on to_datetime section, update examples showing usage of 'as_unit'

* use np.timedelta64 for to_timedelta example, update as_unit example, update note

* remove note

* Apply suggestions from code review

Co-authored-by: Deepak Cherian <[email protected]>

* refactor timedelta decoding to _numbers_to_timedelta and re-use it within decode_cf_timedelta

* fix conventions test, add todo

* run times through pd.Timestamp to catch possible overflows

* fix tests for cftime_to_nptime

* fix cftime_to_nptime in cftimeindex

* introduce pd.Timestamp instance check

* warn if out-of-bound datetimes are encoded with standard calendar, fall back to cftime encoding, add fix for cftime issue where python datetimes are not encoded correctly with date2num.

* fix time-coding.rst, add reference to time-series.rst.

* try to fix typing, ignore one

* try to fix docs

* revert doc-changes

* Add a non-ns test for polyval, polyfit

* more doc cosmetics

* add whats-new.rst entry

* add/fix coder docstring

* add xr.date_range example as suggested per review

* Apply suggestions from code review

Co-authored-by: Spencer Clark <[email protected]>

* Implement `time_unit` option for `decode_cf_timedelta` (pydata#3)

* Fix timedelta encoding overflow issue; always decode to ns resolution

* Implement time_unit for decode_cf_timedelta

* Reduce diff

* fix typing

* use nanmin/nanmax, catch numpy RuntimeWarnings

* Apply suggestions from code review

Co-authored-by: Kai Mühlbauer <[email protected]>

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Stephan Hoyer <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
Co-authored-by: Spencer Clark <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
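
The `as_unit` usage mentioned in the commit notes above can be sketched as follows. This is a minimal illustration assuming pandas 2.0 or later (where `as_unit` is available); it is not the example that was added to the xarray docs, and the sample values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
times = pd.date_range("2000-01-01", periods=3, freq="D")

# Convert the index to second resolution; the timestamps themselves are unchanged.
times_s = times.as_unit("s")

# The same method exists on scalar Timedelta (and Timestamp) objects.
delta_ms = pd.Timedelta(days=1).as_unit("ms")

print(times_s.dtype)
print(delta_ms.unit)
```

Coarser units widen the representable date range, which is the motivation for supporting resolutions other than nanoseconds.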
jacobbieker and others added 30 commits January 27, 2025 12:41
…ata#9948)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Deepak Cherian <[email protected]>
* Add time_unit argument to CFTimeIndex.to_datetimeindex

* Update xarray/tests/test_cftime_offsets.py

Co-authored-by: Kai Mühlbauer <[email protected]>

* Apply suggestions from code review

Co-authored-by: Kai Mühlbauer <[email protected]>

* Update xarray/tests/test_cftimeindex.py

Co-authored-by: Kai Mühlbauer <[email protected]>

* Apply Deepak's wording suggestion

Co-authored-by: Deepak Cherian <[email protected]>

* Update xarray/tests/test_groupby.py

---------

Co-authored-by: Kai Mühlbauer <[email protected]>
Co-authored-by: Kai Mühlbauer <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
…ydata#9977)

* fix mean for datetime-like by using the respective dtype time resolution unit, adapting tests

* fix mypy

* add PR to existing entry for non-nanosecond datetimes

* Update xarray/core/duck_array_ops.py

Co-authored-by: Spencer Clark <[email protected]>

* cast to "int64" in calculation of datetime-like mean

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Spencer Clark <[email protected]>

---------

Co-authored-by: Spencer Clark <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
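
The idea behind the datetime-like mean fix above (compute in the dtype's own time unit via an `int64` cast) can be sketched with NumPy alone. This is a hedged illustration, not the code in `xarray/core/duck_array_ops.py`; the helper name `datetime_mean` is hypothetical, and `NaT` values are not handled.

```python
import numpy as np


def datetime_mean(values: np.ndarray) -> np.datetime64:
    """Mean of a datetime64 array, computed in the array's own time unit."""
    # View the timestamps as integer counts of the array's unit (s, ms, ns, ...).
    as_int = values.astype("int64")
    # Average offsets from the minimum so the floating-point mean stays precise.
    offset = as_int.min()
    mean_offset = int(np.mean(as_int - offset))
    # Cast back to the original dtype so the resolution is preserved.
    return (offset + np.int64(mean_offset)).astype(values.dtype)


values = np.array(["2000-01-01", "2000-01-03"], dtype="datetime64[s]")
result = datetime_mean(values)
print(result, result.dtype)
```

Because the result is cast back to `values.dtype`, a second-resolution input yields a second-resolution mean instead of being forced to nanoseconds.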
* Allow passing a CFTimedeltaCoder instance to decode_timedelta

* Updates based on @kmuehlbauer's branch

https://github.com/kmuehlbauer/xarray/tree/split-out-coders

* Increment what's new PR number

* Add FutureWarning for change in decode_timedelta behavior

* Include a note about opting out of timedelta decoding

Co-authored-by: Kai Mühlbauer <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix typing

* Fix typo

* Fix doc build

* Fix order of arguments in filterwarnings

* Switch to :okwarning:

* Fix missing :okwarning:

---------

Co-authored-by: Kai Mühlbauer <[email protected]>
Co-authored-by: Deepak Cherian <[email protected]>
Co-authored-by: Kai Mühlbauer <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…#9999)

* Fix infer_freq, check for subdtype "datetime64"/"timedelta64"

* update infer_freq test

* add whats-new.rst entry

* add typing to test function

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
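
The subdtype check described in the `infer_freq` fix above can be illustrated with NumPy. This is a minimal sketch under the assumption that the check is resolution-agnostic; the helper name `is_datetime_like` is hypothetical and is not xarray's function.

```python
import numpy as np


def is_datetime_like(dtype) -> bool:
    """True for datetime64/timedelta64 dtypes of any resolution."""
    return bool(
        np.issubdtype(dtype, np.datetime64) or np.issubdtype(dtype, np.timedelta64)
    )


seconds = np.dtype("datetime64[s]")
nanos = np.dtype("datetime64[ns]")

# An exact comparison against the nanosecond dtype misses other resolutions.
exact_match = seconds == nanos
# The subdtype check accepts every resolution.
subdtype_match = is_datetime_like(seconds)

print(exact_match, subdtype_match)
```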
pydata#9940)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Deepak Cherian <[email protected]>
* check that aggregations result in array objects

* don't consider numpy scalars as arrays

* changelog [skip-ci]

* retrigger CI

* Update xarray/tests/test_namedarray.py

---------

Co-authored-by: Kai Mühlbauer <[email protected]>
* FIX: do not sort datasets in combine_by_coords

* add test

* add whats-new.rst entry

* use groupby_defaultdict

* Apply suggestions from code review

Co-authored-by: Michael Niklas <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update xarray/core/combine.py

* fix typing, replace other occurrence

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix groupby

* fix groupby

---------

Co-authored-by: Michael Niklas <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Deepak Cherian <[email protected]>
…9855)

* new blank whatsnew

* FAQ answer on API stability

* link from API docs page

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* whatsnew

* Update doc/getting-started-guide/faq.rst

Co-authored-by: Maximilian Roos <[email protected]>

* use hyphen in target names

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Maximilian Roos <[email protected]>
Co-authored-by: Kai Mühlbauer <[email protected]>
* finalize release notes

* add contributors

* Tweak main what's new entry for time coding (pydata#4)

---------

Co-authored-by: Spencer Clark <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>