xr.Dataset.from_dataframe / xr.DataArray.from_series does not preserve DatetimeIndex with timezone #3291
Comments
You should be getting a warning about this if you use the latest version of pandas. In the future, this behavior will change to return an object dtype array full of pandas Timestamp objects. Unfortunately, NumPy doesn't have a built-in datetime dtype with time zone support, so this is about the best we can do.
Just wanted to rekindle discussion here and ping @dcherian and @benbovy. The current workaround for a pandas DatetimeIndex with timezone info (dtype='datetime64[ns, EST]') is to drop the timezone piece or use
If I'm following https://github.com/pydata/xarray/blob/master/design_notes/flexible_indexes_notes.md this is another potential example of improved user-friendliness where we could have timezone-aware indexes and therefore call pandas methods like
This would definitely be great for remote sensing data that is usually stored with UTC timestamps, but often analysis requires converting to local time.
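For readers landing here, the "drop the timezone piece" workaround can be sketched roughly as follows. This is only an illustration under assumptions: the sample data, the `America/New_York` zone, and the `local_timezone` attribute name are made up for the example and are not part of xarray's API.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical data: four daily samples with a timezone-aware index.
index = pd.date_range(
    "2021-06-01", periods=4, freq="D", tz="America/New_York", name="time"
)
series = pd.Series(np.arange(4.0), index=index, name="value")

# Convert to UTC, then drop the timezone so the index is plain datetime64.
naive_utc = series.copy()
naive_utc.index = series.index.tz_convert("UTC").tz_localize(None)

da = xr.DataArray.from_series(naive_utc)
# Remember the original zone in an attribute (the attribute name is an assumption).
da.attrs["local_timezone"] = "America/New_York"

# Recover local time for analysis by going back through pandas.
local_index = (
    da.indexes["time"].tz_localize("UTC").tz_convert(da.attrs["local_timezone"])
)
print(local_index)
```

The timezone lives only in `attrs` here, so any local-time operation still has to round-trip through pandas.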
I am confused on the following point after reading the indexing refactor design notes on removing IndexVariable. If |
No, unfortunately it is not possible to use a The bigger issue is that code elsewhere in Xarray probably needs updates to avoid assuming that all dtype objects are
While trying to prescribe open-ended adjustments, I noticed that doing so currently causes an error. When we read the start and end times of adjustments, they receive time zone info because of their ISO format. The AWS xarray dataset does not have time zone info (because of an [xarray limitation](pydata/xarray#3291)), so the time zone info is removed from the adjustments' time bounds (l.183-184). What was missing is that when the start or end date of an adjustment is blank (meaning an open-start or open-ended bound), we use a timestamp from the AWS dataset, which is time-zone-naive, and this then causes an error later on when trying to remove the time zone info from these already time-zone-naive bounds.
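One way to guard against this is to strip the time zone only when a bound actually carries one. The sketch below is not the code from the referenced change; `to_naive_utc` and `resolve_bounds` are hypothetical helpers, and it assumes the dataset's naive timestamps are in UTC.

```python
import pandas as pd


def to_naive_utc(ts):
    """Return ts as a timezone-naive pandas Timestamp (hypothetical helper).

    Timezone-aware inputs are converted to UTC first; naive inputs are
    returned unchanged and assumed to be in UTC already.
    """
    ts = pd.Timestamp(ts)
    if ts.tzinfo is None:
        return ts
    return ts.tz_convert("UTC").tz_localize(None)


def resolve_bounds(start, end, data_start, data_end):
    """Fill blank bounds from the dataset limits, then normalize both."""
    start = data_start if start in (None, "") else start
    end = data_end if end in (None, "") else end
    return to_naive_utc(start), to_naive_utc(end)


# An ISO start time with an offset and a blank (open-ended) end time.
bounds = resolve_bounds(
    "2020-01-01T00:00:00+02:00",
    None,
    pd.Timestamp("2019-01-01"),
    pd.Timestamp("2021-01-01"),
)
print(bounds)
```

Because the check happens inside the helper, it does not matter whether a bound came from an ISO string or from the dataset itself.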
Problem Description
When using Dataset.from_dataframe (or DataArray.from_series) to convert a pandas dataframe whose DatetimeIndex has a timezone, xarray converts the datetimes into a nanosecond index rather than keeping them as a datetime index.
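The behaviour can be inspected with a small script along these lines. This is a sketch with made-up sample data, not the original code sample from the report, and the dtype xarray ends up storing may differ between xarray and pandas versions, so no expected output is shown.

```python
import pandas as pd
import xarray as xr

# Hypothetical data: three daily samples localized to a named timezone.
index = pd.date_range(
    "2019-09-01", periods=3, freq="D", tz="America/New_York", name="time"
)
series = pd.Series([1.0, 2.0, 3.0], index=index, name="value")

da = xr.DataArray.from_series(series)

# The pandas index carries the timezone in its dtype ...
print(series.index.dtype)
# ... while the dtype of the xarray coordinate depends on the installed versions.
print(da["time"].dtype)
```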
MCVE Code Sample
Expected Output
After removing the tz localization from the DatetimeIndex of the dataframe, the conversion to a Dataset preserves the time index (without converting it to nanoseconds).
Output of xr.show_versions()
xarray: 0.12.3+81.g41fecd86
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.1.4
distributed: 1.26.0
matplotlib: 3.0.3
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 40.8.0
pip: 19.0.3
conda: 4.7.11
pytest: 4.3.1
IPython: 7.4.0
sphinx: 1.8.5