-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: switch Dataset and DataArray to use explicit indexes #2639
Conversation
This change switches Dataset.indexes and DataArray.indexes to be backed by explicit dictionaries of indexes, instead of being implicitly defined by the set of coordinates with names matching dimensions. There are no changes to the public interface yet: these will come later. For now, indexes are recreated from coordinates every time a new DataArray or Dataset is created. In follow-up PRs, I will refactor indexes to be propagated explicitly in xarray operations. This will facilitate future API changes, when indexes will no longer only be associated with dimensions.
It looks nice and reasonable to me.
Does it add some overhead (though I don't see any reason of it)? |
xarray/core/indexes.py
Outdated
Mapping[Any, pandas.Index] mapping indexing keys (levels/dimension names) | ||
to indexes used for indexing along that dimension. | ||
""" | ||
return {key: coords[key].to_index() for key in dims if key in coords} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use OrderedDict
instead of dict
to preserve the order?
It is probably necessary for the backward compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this makes sense, for now. Eventually we are going to break this, though (when we add new entries).
I suppose another option would be to leave I think we do something like for a few other attributes already. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Looking foward to seeing the new structure in xarray :)
* master: Remove broken Travis-CI builds (pydata#2661) Type checking with mypy (pydata#2655) Added Coarsen (pydata#2612) Improve test for GH 2649 (pydata#2654) revise top-level package description (pydata#2430) Convert ref_date to UTC in encode_cf_datetime (pydata#2651) Change an `==` to an `is`. Fix tests so that this won't happen again. (pydata#2648) ENH: switch Dataset and DataArray to use explicit indexes (pydata#2639) Use pycodestyle for lint checks. (pydata#2642) Switch whats-new for 0.11.2 -> 0.11.3 DOC: document v0.11.2 release Use built-in interp for interpolation with resample (pydata#2640) BUG: pytest-runner no required for setup.py (pydata#2643)
xref #1603
This change switches Dataset.indexes and DataArray.indexes to be backed by
explicit dictionaries of indexes, instead of being implicitly defined by
the set of coordinates with names matching dimensions.
There are no changes to the public interface yet: these will come later.
My current plan:
or Dataset is created.
indexes
for MultiIndex levels that are checked instead of MultiIndex variables. Still no public API changes (aside from adding more entries to.indexes
).indexes
.