Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CF units / udunits #26

Closed
dcherian opened this issue Aug 24, 2020 · 11 comments · Fixed by xarray-contrib/cf-xarray#197
Closed

CF units / udunits #26

dcherian opened this issue Aug 24, 2020 · 11 comments · Fixed by xarray-contrib/cf-xarray#197

Comments

@dcherian
Copy link

Carrying on the degrees_east vs degrees_west conversation from #20 ...

What is the solution here? Should cf_xarray define a CF units / UDUNITS registry?

@jthielen jthielen mentioned this issue Aug 24, 2020
@jthielen
Copy link
Collaborator

jthielen commented Aug 24, 2020

We've been working on this in MetPy in our custom Pint registry piece-by-piece for a while, essentially fixing CF unit problems as they arise with user-submitted issues. The overall tracking issue for this is Unidata/MetPy#1362, which may have some relevant discussion.

So far, MetPy does the following:

  • start with Pint's default registry
  • set autoconvert_offset_to_baseunit to True
  • add a regex preprocessor to convert exponentiation of the form m2 to m**2
  • add a regex preprocessor to convert % to percent
  • define a new percent unit
  • define a new degrees_north unit, which has value of 1 degree and aliases of the other acceptable forms in the CF conventions
    • likewise a new degrees_east unit

Recently I've implemented a case insensitive registry option, which was in the just-released Pint 0.15, but on MetPy's end we haven't enabled that option yet (waiting for dependabot to bump to Pint 0.15 and to decide if we want to bump our minimum required Pint to 0.15 to guarantee that option or not). Once we have that in place, our next goal is to do some tests against the UDUNITS database to see where we're still failing.

I could imaging taking MetPy's registry and copying it to a more centrally available place to help solve this issue. cf_xarray makes the most sense among existing packages. I'm not sure if there is a large enough audience that needs CF compatible Pint but not xarray to justify its own package, but if there is, then perhaps Unidata may be able to host it.

@dopplershift do you have thoughts here? Particularly, this may mean moving MetPy's unit handling to an external library, and I'm not sure where your stance on that would be.

@dopplershift
Copy link

Long term, I'm not opposed to moving it to something external (Pint's already external). Deduplication of effort is a good thing.

I'm also happy to host whatever makes sense.

My only caveat here is that separate projects come with maintenance burden.

@keewis
Copy link
Collaborator

keewis commented Aug 26, 2020

I still don't think degrees_east or similar are normal units, but I guess I'd have to take that issue up with the CF conventions.

Other than that, I think we should document extending the default registry used by .quantify, setting a new default registry or overriding it with the unit_registry parameter.

@lukelbd
Copy link

lukelbd commented Dec 24, 2020

Just in case folks weren't aware, there appear to already be two dedicated CF unit-parsing python engines: "cfunits" and "cf_units". They appear to do almost exactly the same thing, but for some reason there are two of them ¯\_(ツ)_/¯. The latter is under SciTools and appears to be more popular (and unlike the former, it is integrated with cftime). So I'm more inclined to use it.

I wonder if pint-xarray and/or cf-xarray should eventually relegate unit parsing to cf_units... perhaps with some kind of lightweight cf_units.Unit to pint.Quantity conversion function.

@dcherian
Copy link
Author

dcherian commented Apr 5, 2021

I'd like to make some progress here.

It seems like pint-xarray does not understand the CF standard "1" for nondimensional units but metpy does. Neither understands the oceanography standard psu or "practical salinity units" which is again nondimensional.

Is it easy to copy over Metpy's registry to cf-xarray? Then we can add more climate-relevant units there.

@keewis
Copy link
Collaborator

keewis commented Apr 5, 2021

is that always a string "1", or would we also need to add support for the integer 1? In any case, the internal code uses None as a marker for "not-a-unit" (as opposed to dimensionless, which is a unit), so it wouldn't be difficult to get "1" (or 1, or both) to do the same. Not sure if you meant that, or if you wanted to have those become a alias of dimensionless?

For psu I think you would need to add a new definition to the selected registry (the application registry?).

@jthielen
Copy link
Collaborator

jthielen commented Apr 5, 2021

@dcherian Let me know if I can be of help with this! MetPy's CF unit compatibility has not progressed much since the prior discussion here, so the same caveats with MetPy's registry (case sensitivity, etc.) apply.

I'd also be curious to see where pint-xarray is failing with "1" as a non-dimensional unit, since as far as I can recall pint-xarrayand MetPy both use the .parse_units method in handling the units attribute string, and both the default pint registry and MetPy's customized one return <Unit('dimensionless')> from registry.parse_units("1").

@keewis
Copy link
Collaborator

keewis commented Apr 5, 2021

In the released version of pint_xarray we have been using parse_expression instead of parse_units (see #41), so once we release 0.2 this should be fixed.

Edit: the _repr_inline_ implementation seems to display dimensionless as [], which is a bit unfortunate

@dcherian
Copy link
Author

dcherian commented Apr 5, 2021

From http://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#units

Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The Udunits package defines a few dimensionless units, such as percent, but is lacking commonly used units such as ppm (parts per million). This convention does not support the addition of new dimensionless units that are not udunits compatible. The conforming unit for quantities that represent fractions, or parts of a whole, is "1". The conforming unit for parts per million is "1e-6". Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in the long_name or standard_name attributes (see below) rather than the units .

I guess this is solved but for completeness, here's the error

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-106-3906a8dd4379> in <module>
----> 1 cmor.pint.quantify()

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/pint_xarray/accessors.py in quantify(self, units, unit_registry, **unit_kwargs)
    508         new_obj = conversion.strip_unit_attributes(self.ds)
    509 
--> 510         units = {
    511             name: _decide_units(unit, registry, attr)
    512             for name, (unit, attr) in zip_mappings(units, unit_attrs).items()

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/pint_xarray/accessors.py in <dictcomp>(.0)
    509 
    510         units = {
--> 511             name: _decide_units(unit, registry, attr)
    512             for name, (unit, attr) in zip_mappings(units, unit_attrs).items()
    513             if unit is not None or attr is not None

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/pint_xarray/accessors.py in _decide_units(units, registry, unit_attribute)
    133     elif units is None:
    134         # TODO option to read and decode units according to CF conventions (see MetPy)?
--> 135         units = registry.parse_expression(unit_attribute).units
    136     elif isinstance(units, Unit):
    137         # TODO do we have to check what happens if someone passes a Unit instance

AttributeError: 'int' object has no attribute 'units'

and unit_attribute is "1"

@dcherian
Copy link
Author

dcherian commented Apr 5, 2021

Let me know if I can be of help with this!

@jthielen do you have time to make a PR? 😉 you're the expert!

@jthielen
Copy link
Collaborator

jthielen commented Apr 5, 2021

@dcherian I should be able to do so later this evening!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants