[Bug]: OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2262-05-01 00:00:00 #282

Closed
pochedls opened this issue Jul 27, 2022 · 3 comments · Fixed by #283
Labels: type: bug (Inconsistencies or issues which will cause an issue or problem for users or implementors.)

@pochedls
Collaborator

What happened?

In opening a dataset with time units months since 1800-01-01 I am getting the following error: OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2262-05-01 00:00:00.

What did you expect to happen?

Ideally, xcdat could handle this situation and open the dataset.

Minimal Complete Verifiable Example

import xcdat

fn = '/p/user_pub/climate_work/pochedley1/cmip6_msu/spliced/ttt_Amon_MRI-ESM2-0_historical-ssp585_r1i1p1f1_gn_185001-230012.nc'
ds = xcdat.open_dataset(fn)

Relevant log output

---------------------------------------------------------------------------                                                                                          
OutOfBoundsDatetime                       Traceback (most recent call last)
Input In [65], in <cell line: 1>()
----> 1 ds = xcdat.open_dataset(fn)

File ~/code/xcdat/xcdat/dataset.py:89, in open_dataset(path, data_var, add_bounds, decode_times, center_times, lon_orient, **kwargs)
     87     ds = xr.open_dataset(path, decode_times=False, **kwargs)
     88     # attempt to decode non-cf-compliant time axis
---> 89     ds = decode_non_cf_time(ds)
     90 else:
     91     ds = xr.open_dataset(path, decode_times=True, **kwargs)

File ~/code/xcdat/xcdat/dataset.py:320, in decode_non_cf_time(dataset)
    316     return ds
    318 ref_date = pd.to_datetime(ref_date)
--> 320 data = [ref_date + pd.DateOffset(**{units: offset}) for offset in time.data]
    321 decoded_time = xr.DataArray(
    322     name=time.name,
    323     data=data,
   (...)
    326     attrs=time.attrs,
    327 )
    328 decoded_time.encoding = {
    329     "source": ds.encoding.get("source", "None"),
    330     "dtype": time.dtype,
   (...)
    333     "calendar": time.attrs.get("calendar", "none"),
    334 }

File ~/code/xcdat/xcdat/dataset.py:320, in <listcomp>(.0)
    316     return ds
    318 ref_date = pd.to_datetime(ref_date)
--> 320 data = [ref_date + pd.DateOffset(**{units: offset}) for offset in time.data]
    321 decoded_time = xr.DataArray(
    322     name=time.name,
    323     data=data,
   (...)
    326     attrs=time.attrs,
    327 )
    328 decoded_time.encoding = {
    329     "source": ds.encoding.get("source", "None"),
    330     "dtype": time.dtype,
   (...)
    333     "calendar": time.attrs.get("calendar", "none"),
    334 }

File ~/bin/anaconda3/envs/xcdat_dev/lib/python3.9/site-packages/pandas/_libs/tslibs/offsets.pyx:444, in pandas._libs.tslibs.offsets.BaseOffset.__add__()

File ~/bin/anaconda3/envs/xcdat_dev/lib/python3.9/site-packages/pandas/_libs/tslibs/offsets.pyx:450, in pandas._libs.tslibs.offsets.BaseOffset.__add__()

File ~/bin/anaconda3/envs/xcdat_dev/lib/python3.9/site-packages/pandas/_libs/tslibs/offsets.pyx:180, in pandas._libs.tslibs.offsets.apply_wraps.wrapper()

File ~/bin/anaconda3/envs/xcdat_dev/lib/python3.9/site-packages/pandas/_libs/tslibs/offsets.pyx:1092, in pandas._libs.tslibs.offsets.RelativeDeltaOffset._apply()

File ~/bin/anaconda3/envs/xcdat_dev/lib/python3.9/site-packages/pandas/_libs/tslibs/timestamps.pyx:1399, in pandas._libs.tslibs.timestamps.Timestamp.__new__()

File ~/bin/anaconda3/envs/xcdat_dev/lib/python3.9/site-packages/pandas/_libs/tslibs/conversion.pyx:436, in pandas._libs.tslibs.conversion.convert_to_tsobject()

File ~/bin/anaconda3/envs/xcdat_dev/lib/python3.9/site-packages/pandas/_libs/tslibs/conversion.pyx:517, in pandas._libs.tslibs.conversion.convert_datetime_to_tsobject()

File ~/bin/anaconda3/envs/xcdat_dev/lib/python3.9/site-packages/pandas/_libs/tslibs/np_datetime.pyx:120, in pandas._libs.tslibs.np_datetime.check_dts_bounds()

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2262-05-01 00:00:00

Anything else we need to know?

This seems to be related to a known limitation of pandas. Maybe using errors="coerce" could help?
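For context, here is a small sketch of where the limit comes from, using only the standard library. pandas stores nanosecond timestamps as a signed 64-bit count of nanoseconds since 1970-01-01, so the largest representable instant falls in April 2262, a few weeks before the 2262-05-01 value in the traceback. The variable names are illustrative, and the result is truncated to microseconds because `datetime` cannot hold nanoseconds.

```python
from datetime import datetime, timedelta

# Largest value of a signed 64-bit integer, interpreted as nanoseconds.
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)

# Upper bound of a nanosecond-resolution timestamp, truncated to microseconds.
ns_limit = EPOCH + timedelta(microseconds=INT64_MAX // 1000)

# The value reported in the error message above.
failing = datetime(2262, 5, 1)

print(ns_limit)            # 2262-04-11 23:47:16.854775
print(failing > ns_limit)  # True
```

As far as I can tell, `errors="coerce"` is an option of `pd.to_datetime` that replaces values it cannot convert with `NaT`, so it would likely drop the out-of-range dates rather than decode them. The failing call in the traceback is also a `pd.DateOffset` addition, not `pd.to_datetime`.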

Environment

main branch

@pochedls pochedls added the type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label Jul 27, 2022
@pochedls
Collaborator Author

pochedls commented Jul 27, 2022

Addendum: this has apparently been addressed in xarray, but only for CF-compliant time units, not for the non-CF time units (such as months) that we support in this case.
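One possible way around the limit is to decode month offsets with plain Python `datetime` objects, which cover years 1 through 9999. The sketch below is not the xcdat implementation; `add_months` and `decode_months_since` are hypothetical helpers. It assumes whole-month offsets, a units string of the form "months since YYYY-MM-DD", a reference day of 28 or earlier, and it ignores the calendar attribute.

```python
from datetime import datetime


def add_months(ref: datetime, months: int) -> datetime:
    """Shift ref by a whole number of calendar months, keeping the day of month."""
    # Count months from year 0 so divmod handles year rollover.
    total = ref.year * 12 + (ref.month - 1) + months
    year, month0 = divmod(total, 12)
    return ref.replace(year=year, month=month0 + 1)


def decode_months_since(units: str, offsets) -> list:
    """Decode offsets for units such as 'months since 1800-01-01'."""
    unit, _, ref_str = units.partition(" since ")
    if unit.strip() != "months":
        raise ValueError(f"unsupported units: {units!r}")
    # Keep only the date part; any trailing time of day is ignored here.
    ref = datetime.strptime(ref_str.strip()[:10], "%Y-%m-%d")
    # Fractional offsets are truncated (assumption: whole months only).
    return [add_months(ref, int(offset)) for offset in offsets]


decoded = decode_months_since("months since 1800-01-01", [0, 5548])
print(decoded[-1])  # 2262-05-01 00:00:00
```

The result is a list of `datetime` objects rather than a `datetime64[ns]` array, so nothing is forced through the nanosecond range. Non-standard calendars would need something like `cftime` instead.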

pochedls added a commit that referenced this issue Jul 28, 2022
tomvothecoder pushed a commit that referenced this issue Jul 28, 2022
add unit test

add comments for test, cleanup extraneous code

initial work on #282

Bugfix/278 cannot generate bounds (#281)

* initial solution for #278

* add unit test

* add comments for test, cleanup extraneous code
@tomvothecoder tomvothecoder moved this to Todo in v0.3.1 Aug 2, 2022
@tomvothecoder tomvothecoder moved this from Todo to Review In Progress in v0.3.1 Aug 2, 2022
@tomvothecoder tomvothecoder moved this from Review In Progress to In Progress in v0.3.1 Aug 2, 2022
@tomvothecoder tomvothecoder moved this from In Progress to Reviewer Approved in v0.3.1 Aug 2, 2022
pochedls added a commit that referenced this issue Aug 2, 2022
* initial solution for #278

add unit test

add comments for test, cleanup extraneous code

initial work on #282

Bugfix/278 cannot generate bounds (#281)

* initial solution for #278

* add unit test

* add comments for test, cleanup extraneous code

* PR review refactor
- Add `types-python-dateutil` to `mypy` dependencies
- Update `ref_date` var to `ref_dt_obj` to avoid mypy error `error: Unsupported operand types for + ("str" and "relativedelta")`
- Use dictionary unpacking for units variable
- Use `datetime.strptime` instead of `pd.datetime()` which runs into the `pd.Timestamp` limitation
- Add logger.warning when non-CF compliant time coords cannot be decoded

* Consider calendar type when decoding non-CF time
- Fix tests

* Update xcdat/dataset.py

Co-authored-by: pochedls <[email protected]>

* Update xcdat/dataset.py

Co-authored-by: pochedls <[email protected]>

* Fix datetime reference

* return more specific cftime datetime types

* removing extraneous calendar specification

* Refactor using xarray methods
- Move try and except statements into `decode_non_cf_time()`
- Extract function `_get_cftime_coords()` to encapsulate related logic from `decode_non_cf_time()`

* Update docstrings and silence logger warnings in tests

* Fix Timestamp limitation from dtype
- Update logger warning

* Add space in logger warning

Co-authored-by: tomvothecoder <[email protected]>
Repository owner moved this from Reviewer Approved to Done in v0.3.1 Aug 2, 2022
@durack1
Collaborator

durack1 commented Aug 9, 2022

@pochedls @tomvothecoder was this to be fixed in 0.3.1? It seems I have hit an almost identical issue:

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 2315-07-01 00:00:00

The offending file time dimension looks like:

...
        float time(time) ;
                time:standard_name = "time" ;
                time:long_name = "time" ;
                time:units = "months since 1955-01-01 00:00:00" ;
                time:axis = "T" ;
                time:climatology = "climatology_bounds" ;
...

In [8]: xc.__version__
Out[8]: '0.3.0'

Is 0.3.1 expected to be available on the conda-forge channel imminently?

@tomvothecoder
Collaborator

@durack1 This is fixed in #283, which has been merged into main.

We are addressing a few more bugs, so 0.3.1 is not released yet, but it should be out soon.
https://github.com/orgs/xCDAT/projects/3/views/2?query=is%3Aopen+sort%3Aupdated-desc
