-
Notifications
You must be signed in to change notification settings - Fork 416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
To what extent should MetPy try to ensure CF units/UDUNITS compatibility? #1362
Comments
I think there's some value in it. Really, what we need to support are standards in our field's common data formats. With the exception of random text units in CSV file headers or free text fields in HDF5/netCDF files, the Luckily, thanks to @lesserwhirls and netCDF-Java, there's already a place where all the XML unit information from UDUnits2 is aggregated together. I don't think we want to be parsing that at runtime in MetPy, but I think it would be good to do some testing against it and see how much we need to add (and hopefully not modify or remove). One thing I will note is (that I just learned through digging into the "Celsius" issue) is that UDUnits-2 is case insensitive for names only, not symbols. That could present some challenges. I think we should at least be somewhat pro-active here--at least by following this particular thread (e.g. "Celsius" and look at the XML unit database). I'm not saying we need to exhaustively test 10^5 different unit strings and make sure we get the right results. I bet just solving the two things mentioned here, on top of the mods you've already made, would get us 99% there. |
Out of curiosity, how much of Pint's parser can we override or outright rip out? |
I'm not too sure about that! The most extensible parts of Pint right now are unit definitions, constants, and preprocessors functions. Beyond that, I'm pretty sure we'd have to rely on monkey patching or implementing changes upstream to make currently unexposed APIs extensible. |
Also adding to the list here: "Meter" (from #1459) Should there be a short-term fix for this capitalization issue until more substantial testing can be done? Or is this something we should comment on in the unit support docs? |
Well, can we make it so we only consider case for symbols? Hell, I don't even know if there are symbols that collide if you make everything lowercase. |
Only considering case for symbols would not be easy to do exactly, but there may be ways around that. Since Pint's parsing internals currently don't seem to distinguish between symbols and any other alias, this would at the very least require making low-level changes in Pint's unit parsing and adding a bunch of extra references back to underlying Symbols do collide if you make everything lowercase in a preprocessor (e.g., That being said, Pint does have proper case insensitive handling on its
So, if we are okay with a bit of extra leniency on case-insensitive symbols, then this should actually be achievable by adding a If this all sounds reasonable, should we propose this upstream in Pint? |
Given that we're not a conformance checker and our users frequently are not in control of their data, we should also lean towards being lenient, provided we're not in an ambiguous situation--which we're not here. I think that's reasonable to propose your suggestion in Pint. |
Once hgrecco/pint#1147 is merged (and 0.15 released?), all we have to do is add the Update: hgrecco/pint#1147 has been merged, so this can proceed forward now with upstream dev or wait until 0.15 release. |
We'll have to ponder whether it's better to bump Pint. I'm doubtful of the existence of a user-base that is capable of installing a new MetPy but is unable to update their version of Pint. |
So for the record, 0.15 has been released and conda-forge packages are out. I'm going to wait for dependabot to pick it up and that should open us up to testing on it. |
I went ahead and enabled the |
Le sigh. |
For the record, this dataset from #1917 contains an expression that failed our regex ( |
xarray-contrib/xwrf#50 uncovered another expression that fails our regex: |
This comment by @rsignell-usgs, combined with issues like #1134 and #1350, led me to wonder: what efforts should MetPy be taking to maintain compatibility with CF units/UDUNITS? Right now, it seems to have been just implementing a fix when a particular problem arises (which is a reasonable enough way to continue going forward), but I've also wondered whether it's worth it at some point to parse the UDUNITS xml files and either run tests on MetPy's Pint registry with respect to this database or use the database to build a custom Pint registry configuration file.
The text was updated successfully, but these errors were encountered: