Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cf.bounds? #210

Closed
malmans2 opened this issue Apr 20, 2021 · 13 comments · Fixed by #214
Closed

Add cf.bounds? #210

malmans2 opened this issue Apr 20, 2021 · 13 comments · Fixed by #214

Comments

@malmans2
Copy link
Member

Should we add a bounds attribute and make obj.cf show all bounds?

If yes, how do you think it should look like?
Maybe: {"bound_lon": "lon", "bound_lat": "lat"}

I don't think you can have a bound variable associated with multiple variables, so both keys and values can be str.

For example, we could use cf.bounds to take care of bounds in rename_like...

@aulemahal
Copy link
Contributor

I think its a good idea! Wouldn't it be the opposite? I.e.: da.cf.bounds returing a dict mapping from dimension name to bounds variable name? {'lat': 'lat_bnds', 'lon': 'lon_bnds'} ?

@malmans2
Copy link
Member Author

Yes I think you're right, it's more consistent with the other attributes.

@dcherian
Copy link
Contributor

👍 I like @aulemahal's suggestion.

But shouldn't we use our CF names? so {"longitude": "lon_bnds", "latitude": "lat_bnds"}?

@malmans2
Copy link
Member Author

Sounds good. Does that mean that we only allow bounds associated with supported CF coordinates (i.e., longitude, latitude, time, vertical)?

@malmans2
Copy link
Member Author

Also, if we use CF names I think there could be multiple bounds associated with the same CF name, so values should be lists.

@aulemahal
Copy link
Contributor

Hum, restricting to coordinates seems too strict, may be keys could be the standard names? I believe it makes no difference for the common trio longitude, latitude, time.
And I can't think of a case where there are multiple values? For a given variable, wouldn't the bounds attribute always refer to other variables in the same dataset. cf-xarray would list those with a mapping where the key is CF-compliant, but the value still refers to a variable name, no? (so it's unique)

@malmans2
Copy link
Member Author

may be keys could be the standard names?

One issue with standard_names is that we use other attributes as well to identify coordinates (e.g., units).

I can't think of a case where there are multiple values?

For example with staggered grids, I think one could have multiple bounds associated with the same key and the dataset is still CF-compliant.
{"depth": ["deptht_bounds", "depthw_bounds"]} or {"vertical": ["deptht_bounds", "depthw_bounds"]}

	float deptht(deptht) ;
		deptht:standard_name = "depth" ;
		deptht:long_name = "Vertical T levels" ;
		deptht:units = "m" ;
		deptht:positive = "down" ;
		deptht:bounds = "deptht_bounds" ;
	float deptht_bounds(deptht, axis_nbounds) ;
		deptht_bounds:units = "m" ;
	float depthw(depthw) ;
		depthw:standard_name = "depth" ;
		depthw:long_name = "Vertical W levels" ;
		depthw:units = "m" ;
		depthw:positive = "down" ;
		depthw:bounds = "depthw_bounds" ;
	float depthw_bounds(depthw, axis_nbounds) ;
		depthw_bounds:units = "m" ;

Another example could be a dataset with multiple time axes and coordinates (e.g., daily and monthly means).

@aulemahal
Copy link
Contributor

One issue with standard_names is that we use other attributes as well to identify coordinates (e.g., units).

Hmm, I'm not sure I fully understand this one. You mean that the standard_name of a variable could be absent but that cf_xarray would still pick it up through other attributes?
Would it be inelegant / too complex to have some kind of failback? Like keys would be chosen in this order or priority: coordinate name (latitude, time, etc) -> standard name -> variable name.

and the dataset is still CF-compliant.

Oh I see the misunderstanding! I was thinking of the dataarray accessor, where this mapping should be 1 to 1 (like in your example). It is true, this doesn't hold at the dataset level.

@dcherian
Copy link
Contributor

Like keys would be chosen in this order or priority: coordinate name (latitude, time, etc) -> standard name -> variable name.

We actually do this for ds.cf.sizes e.g. {'T': 3, 'time': 3, 'eta_rho': 108, 'xi_rho': 392}

Though now I wonder if we shouldn't just always add the variable name too. For this dataset ds.cf.sizes would be {'ocean_time': 3, 'T': 3, 'time': 3, 'eta_rho': 108, 'xi_rho': 392}

@malmans2
Copy link
Member Author

So the keys for cf.bounds would be: set(ds.variables) | ds.cf.keys()? I think it would work well.

Maybe we should change a bit cf.keys() and return set(ds.variables) as well? keys() is also discussed here #105

@dcherian
Copy link
Contributor

Maybe we should change a bit cf.keys() and return set(ds.variables) as well?

-0.5 from me because I read .cf.keys as "CF keys" so it makes sense to return special names. Similarly with "latitude" in ds.cf

@dcherian
Copy link
Contributor

Here's a harder question (maybe): Should the dict have variable names as values or actual DataArrays?

.axes, .standard_names return variable names so it makes sense to have variable names. Is this OK as a general principle — properties added by cf_xarray (axes, coordinates, standard_names, formula_terms, bounds) will always have variable names as values. When cf_xarray wraps an existing Xarray property (sizes, chunks) the value will be whatever xarray returns as a value.

@malmans2
Copy link
Member Author

properties added by cf_xarray (axes, coordinates, standard_names, formula_terms, bounds) will always have variable names as values. When cf_xarray wraps an existing Xarray property (sizes, chunks) the value will be whatever xarray returns as a value.

I think this is the way to go!

@malmans2 malmans2 mentioned this issue Apr 21, 2021
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants