Allow CRS WKT to represent the CRS without requiring reader to compare with grid mapping parameters #222
Comments
The status quo (giving the CF attributes precedence over WKT) was discussed at great length when the possibility of including WKT strings was added. I have not reviewed that discussion but it would be relevant to do so to avoid repeating it! It's in https://cf-trac.llnl.gov/trac/ticket/69 and https://cf-pcmdi.llnl.gov/trac/ticket/80. I opposed the introduction of WKT strings because I didn't like redundancy, which would probably lead to inconsistency, but I agreed with the resolution that we have, in which the CF attributes take precedence. Without reviewing the previous discussion, these points occur to me:
In view of these points, I don't think this proposal is the best way to proceed. Instead, if there are elements of the CRS that can't currently be represented in CF but are needed, we should consider adding them, as we have done before (your points 1 and 3). If the equivalence between CF and WKT is unclear or incomplete (related to my first point above) it should be improved (your points 2 and 4). |
I am a GDAL/PROJ user, so from my biased perspective life would be much easier with the WKT form :). Additionally, since WKT is already a standard from the OGC geospatial community, most geospatial software should be able to support it.
Correct. That is why I propose the CRS WKT take precedence. The CF grid mapping parameters only provide support for a limited subset of projection parameters. (Ref: https://cf-trac.llnl.gov/trac/ticket/69):
So, in this proposal, if the CRS WKT exists and can be read in, the CF projection parameters should be ignored entirely and no checks made between the two. However, the CF projection parameters are there both for backwards compatibility and for programs that do not support the WKT form of the projection.
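To make the proposed reader behaviour concrete, here is a minimal Python sketch. It assumes the attributes of the grid mapping variable have already been loaded into a plain dictionary. `select_crs_source` and the `parse_wkt` callable are hypothetical names used for illustration only; they are not part of GDAL, PROJ or any CF tool. The only name taken from CF is the `crs_wkt` attribute.

```python
from typing import Any, Callable, Mapping, Optional, Tuple


def select_crs_source(
    attrs: Mapping[str, Any],
    parse_wkt: Optional[Callable[[str], Any]] = None,
) -> Tuple[str, Any]:
    """Pick the CRS description a reader would use under this proposal.

    `parse_wkt` is a hypothetical callable supplied by software that can read
    WKT; it is assumed to raise ValueError when the string cannot be read.
    Readers without WKT support simply pass None.
    """
    wkt = attrs.get("crs_wkt")
    if wkt and parse_wkt is not None:
        try:
            # WKT exists and can be read: use it and make no comparison
            # with the CF grid mapping parameters.
            return "crs_wkt", parse_wkt(wkt)
        except ValueError:
            pass  # unreadable WKT: fall back to the CF parameters
    # Backwards-compatible path for software that does not support WKT.
    cf_params = {key: value for key, value in attrs.items() if key != "crs_wkt"}
    return "grid_mapping", cf_params
```

A reader that supports WKT would pass its own parser as `parse_wkt`; one that does not would call `select_crs_source(attrs)` and always receive the grid mapping parameters.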
I should clarify that in this proposal that |
Here is the WKT2 form, as printed by pyproj:
>>> from pyproj import CRS
>>> cc = CRS("OSGB 1936 / British National Grid")
>>> cc
<Projected CRS: EPSG:27700>
Name: OSGB 1936 / British National Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: UK - Britain and UKCS 49°46'N to 61°01'N, 7°33'W to 3°33'E
- bounds: (-9.2, 49.75, 2.88, 61.14)
Coordinate Operation:
- name: British National Grid
- method: Transverse Mercator
Datum: OSGB 1936
- Ellipsoid: Airy 1830
- Prime Meridian: Greenwich
>>> print(cc.to_wkt(pretty=True))
PROJCRS["OSGB 1936 / British National Grid",
BASEGEOGCRS["OSGB 1936",
DATUM["OSGB 1936",
ELLIPSOID["Airy 1830",6377563.396,299.3249646,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4277]],
CONVERSION["British National Grid",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",49,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",-2,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996012717,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",400000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",-100000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["unknown"],
AREA["UK - Britain and UKCS 49°46'N to 61°01'N, 7°33'W to 3°33'E"],
BBOX[49.75,-9.2,61.14,2.88]],
ID["EPSG",27700]] The |
Dear @snowman2 -- Maybe the core of your proposal is actually best made to the GDAL / PROJ project to modify default behavior when working with CF data? When different, a warning could be issued and the WKT used with preference? Regards - Dave |
@snowman2 @dblodgett-usgs @JonathanGregory GDAL is a great library and a lot of work has gone into it, but its netcdf support has always been sketchy. When I first was directed to it years ago, it could only do 2-D files, and would flip the data, even when the metadata clearly said the axes went in the other direction (it just ignored the metadata attributes). That problem lasted for a long time (for all I know it still does this). GDAL has had problems with greater than 3-D files, forecast files, DSG files, files that are part of the NCEI examples for sending in data, some issues with time, and some of the newer features in netcdf4 files. Things that can improve CF are most welcome; things that would potentially break most present CF based software should have to make an awfully strong case for the benefits. |
Thanks all for the comments! My desire here is to unite the geospatial (OGC) and CF conventions, to simplify things when transitioning between the communities. Much of the inspiration for this thought came when attempting to match PROJ parameters to the CF conventions as documented here. There are several parameters that do not match up and in several cases a grid mapping does not exist. This is problematic for users who wish to convert back and forth between the two. However, since PROJ supports reading in the WKT string, the full CRS can be properly represented in that manner and no information is lost. Additionally, the PROJ FAQ strongly discourages the use of PROJ strings to represent the CRS and instead recommends using the WKT string.
Changing this would indeed be problematic and confusing for users of GDAL, as the behavior would differ from the CF spec. This is already done in
Alternative proposal?
Thoughts on stating in the spec that, if the CRS cannot be properly represented using the CF grid mapping parameters, the CRS WKT form is recommended as a fallback (noting of course that this may not be compatible with some software)? Also, it would be good to ask users to open an issue in this repo with any CRS WKT that cannot be represented using the grid mapping parameters, so the CF spec can be updated accordingly. |
Would point out that the newest version of Proj4 in its latest incarnation just introduced a bunch of changes in how things are done, and in the CRS. See some of the discussion related to the |
I agree with @dblodgett-usgs and @JonathanGregory - we already have a clear hierarchy that establishes which values have precedence over which ones in the case of conflict. Data producers already have the possibility of omitting CF attributes in favour of using WKT, although this is discouraged. I would see this as an acceptable solution if one wanted to produce data now and the relevant parameters weren't supported by CF. Optimally, one would pursue the adoption of the needed parameters in CF in parallel. |
This is a bit of a sidebar, but one thing that would make it easier for EPSG and WKT folks to create the CF representation would be if we could get the friendly folks over at spatialreference.org to supply the CF representation. If I google "EPSG 4326", I end up at https://spatialreference.org/ref/epsg/wgs-84/ |
@rsignell-usgs, that would definitely be nice. However, it will also require a lot of work, so I imagine some kind of funding would be needed. |
Hello @snowman2, this is an interesting topic and I am grateful that you have raised it. I think there are some fine details being picked out here that are interesting, as well as the big picture. Whilst the big picture comes with a lot of considerations, there are small scale benefits we can try to get to. One example stands out for me from your comments:
I have also been looking at the axis order with respect to CRS-WKT. I agree that this is important. I think that there is an in situ feature that can be extended to provide some extra clarity on this topic. With this in mind, I have opened a new issue. I very much support the broader scope discussion on this topic, hence my approach to separate out #223 so that the discussion on that targeted topic does not get in the way of these valuable considerations. I hope this is a useful step. |
I'm afraid that Google may be somewhat unhelpful with its advice: the resources at https://spatialreference.org are not very well maintained, and the process of maintenance has been far from clear for some time: The EPSG maintain the official registry for EPSG codes, providing URI and URN notation for encodings, e.g. Comparing this resource to At present the only well maintained resource for EPSG codes in WKT encoding that I am confident of using is all the best |
On the detail point of the proposal, I would support amending the current text:
To remove the latter precedence statement.
It is my view that there is too much of an onus placed on the data consumer here, to parse both content representations, map terms to one another and interpret outputs. This is complicated and difficult to implement. There are many opportunities for mistakes and problems. If there is WKT in a file, I want my application to trust it, not to have to parse it to look for mistakes. If I can just parse it then I can delegate this to a supporting application, which is great for maintainability. I think that placing the onus on the data producer to produce content that they assert is consistent is sufficient. I think the value of data consumers being able to simply parse the WKT directly is very large. I think the cost of managing the assertion of consistency on data producers is much smaller. In a sense the status quo is standardising for mistakes in encoding, which I don't think the standard should do, especially given the cost here. all the best |
I read both #69 and #80, and was startled by the sudden acceptance of these tickets after such long discussion of possible issues. (Credit here to Jonathan for flexibility!) Many of those issues are raised in this context, but this ticket proposes WKT be dominant in a much narrower sense (see detailed item (b) below). I agree strongly with @marqh's recent points, including the large value of data consumers being able to simply parse the WKT directly. It's key to recognize this is an augmentation, not a restriction. My detailed reasons follow, but first, I think the phrasing at the beginning of the proposal is creating unneeded alarm. Despite the misleading title, the proposal doesn't make WKT dominant, it just makes it directly usable (but still secondary, because the WKT is not required). I offer this as an equivalent rewrite of the proposal's first paragraph:
In the text you wouldn't say anything like this of course. The text already describes how WKT is an optional augmentation, and that the non-WKT CF must be as complete as possible. I'd only tweak one line, just before the paragraph marqh highlighted, by replacing "as well as by crs_wkt" with "even if a crs_wkt is present", so now it reads:
With the proposed precedence deletion of marqh (item (c) below), I believe this fully captures the intent of the proposal. Detailed responses to a few points: (a) Yes ideally CF could be equally capable. On the other hand, WKT will continue to improve and many tools are and will be built around it. Does CF want to take on the job of "keeping up with WKT" and expect tool developers to "keep up with the CF version of WKT"? Even if we want that to happen, who in CF wants to volunteer to make it happen for CF? And in the tool community? I'm assuming positive assessments of the prevalence of WKT, its features, and its community support for upgrades. If you agree these are favorable indicators, then there are two ways to consider the options. (1) How good will this be for existing CF users going forward? Although maybe not many of them need WKT yet, it will be favorable on balance, with little or no downside that I can see. And more broadly, (2) How much will this encourage/allow the geospatial community to easily adopt and use CF? I think it will be quite encouraging. |
@graybeal, thanks for clarifying! I used your clarified version as I think it does a much better job of capturing the intent of the proposal. |
Although I'm watching this repository, and I contributed to this thread, GitHub has sent me only one of the contributions to this issue, namely the most recent (before this one, 10 h ago by @snowman2). Shouldn't I receive all of them by email? I depend on email to be informed that some discussion is taking place. |
A few comments on the discussion to this point. I think the discussion is moving in the overall right direction. It seems to me that there was confusion at first between implementations and uses on the one hand and design and conventions on the other. I think we need to seriously consider how big a job it would be to "re-invent the wheel" by trying to add to CF, even piecemeal, all the parameters needed to represent all coordinate reference systems (CRSs). The vast majority of us are not geodesists. We need to acknowledge that this is a significant discipline that we know little about, and allow the experts in that field to be the experts. Let's use the standards they have developed rather than build an inferior substitute. CF added the ability to specify a few projected coordinate systems. We clearly must continue to honor those for backward compatibility purposes, but let's not add any new ones. I think we should encourage the use of WKT CRS declarations going forward and focus on what might need to be added to CF to resolve ambiguities that might be present. If I understood correctly, @JonathanGregory thought there were possible issues. I didn't see any specifics given, but I'd rather try to clear those up than follow a "make our own" approach any longer. I've worked with a few data providers that attempted to add grid_mapping variables to their netCDF files. The majority of them botched it. They would have been much better off if they could have copied and pasted a WKT string rather than try to figure out how to read CRS definitions and map elements to CF grid_mapping attributes. |
Great strategy @JimBiardCics. Having contributed an implementation to map CF conventions to WKT in R -- I know how error prone and hard it can be. Moving toward support of WKT as a fully fledged option within CF is unambiguously a good thing in my mind. @marqh's suggested text changes make a ton of sense to me. Should we also add something that emphasizes the points about "graceful co-habitation" ? |
Are you thinking something along the lines of: "If both a CRS WKT and grid mapping parameters exist, it is assumed that they are equivalent. As such, either one may be used to represent the CRS of the file." |
Or to deal with the edge cases and be consistent with our expectations: "If both a CRS WKT and grid mapping parameters exist, it is assumed that they do not conflict. As such, information from either one (or both) may be used to represent the CRS of the file, recognizing that the grid mapping parameters should always be completed as fully as possible." |
One minor addition: |
@snowman2 Are there any applications that actively read in and use the CF grid mapping parameters? |
The only application I am aware of that does so is GDAL. However, it also checks for the WKT string and compares the two at present. I am not sure about other applications, but I assume there are based on the current cf-conventions. 🤷♂️ |
GDAL is one more than I was aware of. I'm not aware of any others. |
I agree that informing the data provider of conflicts when found is good practice. But what behavior should software have? Should it just stop, or can we tell it which of the two conflicting pieces of information it should rely on (until the conflict has been resolved)? My general ignorance about WKT prevents me from understanding what is meant by "then the value specified by the single-property attribute shall take precedence." Is the single-property attribute sometimes the WKT property and sometimes the CF attribute, or is it invariably the CF attribute? |
I agree that the data-producer should be the best authority on what was intended. However, knowing that doesn't give the data-user an immediate solution to an inconsistency. I think the current wording (the precedence of CF metadata over WKT) makes sense, since this is a CF dataset. As Karl says, that default also gives an incentive to the data-producer to ensure consistency. However, I think it's also fine to recommend contacting the data-producer. |
The final wording from the breakout meeting: There will be occasions when a given CRS property value is duplicated in both a single-property grid mapping attribute and the crs_wkt attribute. In such cases the onus is on data producers to ensure that the property values are consistent. If both crs_wkt and grid mapping attributes exist, the attributes must be the same and grid mapping parameters should always be completed as fully as possible. As such, information from either one (or both) may be read in by the user without needing to check both. However, in those situations where the two values of a given property are different, the CRS information cannot be interpreted accurately and users should inform the provider so the issue can be addressed. |
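As a rough illustration of what "consistent" could mean for one duplicated property, the sketch below compares the CF `semi_major_axis` attribute with the semi-major axis found in `crs_wkt`. It is a deliberately naive sketch: it pulls the first number out of the `ELLIPSOID[…]` (WKT2) or `SPHEROID[…]` (WKT1) element with a regular expression instead of using a real WKT parser, and it assumes the WKT expresses the ellipsoid size in metres. The function names and the tolerance are assumptions made for this example, not anything defined by CF.

```python
import math
import re
from typing import Any, Mapping, Optional

# First numeric field of ELLIPSOID["name", a, 1/f, ...] or SPHEROID["name", a, 1/f, ...]
_ELLIPSOID_RE = re.compile(
    r'(?:ELLIPSOID|SPHEROID)\s*\[\s*"[^"]*"\s*,\s*([-+0-9.eE]+)',
    re.IGNORECASE,
)


def wkt_semi_major_axis(crs_wkt: str) -> Optional[float]:
    """Return the semi-major axis found in the WKT string, or None if absent."""
    match = _ELLIPSOID_RE.search(crs_wkt)
    return float(match.group(1)) if match else None


def semi_major_axis_consistent(attrs: Mapping[str, Any], rel_tol: float = 1e-9) -> bool:
    """True unless both representations are present and give different values."""
    cf_value = attrs.get("semi_major_axis")
    wkt = attrs.get("crs_wkt")
    if cf_value is None or wkt is None:
        return True  # the property is not duplicated, so nothing can conflict
    wkt_value = wkt_semi_major_axis(wkt)
    if wkt_value is None:
        return True  # the WKT carries no ellipsoid element to compare against
    return math.isclose(float(cf_value), wkt_value, rel_tol=rel_tol)
```

Under the wording above, a `False` result would mean the CRS information cannot be interpreted accurately and the data provider should be informed; neither value is given precedence.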
Which parts of the sentence: "If both crs_wkt and grid mapping attributes exist, the attributes must be the same and grid mapping parameters should always be completed as fully as possible." should trigger an error (warning?) in a compliance checker? Is the file compliant if the two are not the same? Is a file compliant if the grid mapping parameters are incomplete (when it is possible for them to be complete)? |
I would say an error.
I would say no based on this part: "in those situations where the two values of a given property are different, the CRS information cannot be interpreted accurately"
I would say no based on this part: "the attributes must be the same and grid mapping parameters should always be completed as fully as possible" |
How about an entirely different solution: When there are multiple grid descriptions in a file, the creator must add a metadata flag that indicates which of the grid descriptions is 'primary'. The onus to make grid descriptions as equivalent as possible can still be on the creator, but the user will know which one to trust if there is a discrepancy. And the CF checkers will only need to check that there is one, and only one, 'primary' flag. |
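For illustration only, the check described in this suggestion could look something like the sketch below. The suggestion does not say how the flag would be stored, so the `primary` key and the `find_primary` function are purely hypothetical; no such attribute exists in CF.

```python
from typing import Any, Mapping


def find_primary(descriptions: Mapping[str, Mapping[str, Any]]) -> str:
    """Return the name of the single grid description flagged as primary.

    `descriptions` maps a description name (for example "crs_wkt" or
    "grid_mapping") to its metadata; the "primary" key is a hypothetical flag.
    Raises ValueError unless exactly one description carries the flag.
    """
    flagged = [name for name, meta in descriptions.items() if meta.get("primary")]
    if len(flagged) != 1:
        raise ValueError(
            f"expected exactly one primary grid description, found {len(flagged)}"
        )
    return flagged[0]
```

A checker built this way only verifies that one, and only one, flag is set; it says nothing about whether the descriptions agree with each other.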
That would require all applications to be able to support the WKT format. Currently that is not possible due to software limitations (see comments in this thread). |
Dear all Before the meeting yesterday I was arguing, like Karl @taylor13, to retain the presumption that if the Therefore I support the change to remove this assumption, and state that the metadata is invalid if Unfortunately, the CF checker won't be able to detect this error unless we write down the mapping between I would be concerned about adopting Philip @cameronsmith1's suggestion, because I fear that might lead to data-producers not being so careful with one or the other of the representations, thinking that they could set the flag to indicate it's not to be trusted. Jonathan |
This issue doesn't have a moderator - I think that's why it's not progressed. I will moderate it.
@snowman2's current proposal (#282) is to replace
However, in those situations where two values of a given property are different, then the value specified by the single-property attribute shall take precedence. For example, if the semi-major axis length of the ellipsoid is defined by the grid mapping attribute semi_major_axis and also by the crs_wkt attribute (via the WKT SPHEROID[…] element) then the former, being the more specific attribute, takes precedence.
with
If both crs_wkt and grid mapping attributes exist, the attributes must be the same and grid mapping parameters should always be completed as fully as possible. As such, information from either one (or both) may be read in by the user without needing to check both. However, in those situations where the two values of a given property are different, the CRS information cannot be interpreted accurately and users should inform the provider so the issue can be addressed. For example, if the semi-major axis length of the ellipsoid is defined by the grid mapping attribute semi_major_axis and also by the crs_wkt attribute (via the WKT SPHEROID[…] element), the value of this attribute cannot be interpreted accurately.
I think this is OK, except for the last sentence, which I think should be
For example, if the semi-major axis length of the ellipsoid is defined by the grid mapping attribute semi_major_axis disagrees with the crs_wkt attribute (via the WKT SPHEROID[…] element), the value of this attribute cannot be interpreted accurately.
That leads naturally to the unaltered final sentence, "Naturally if the two values are equal then no ambiguity arises."
Philip @cameronsmith1 and Karl @taylor13, are you content with this? Alan @snowman2, is my amendment OK with you?
Jonathan |
Yes, I think the intent of this is fine. I don't recall the text that precedes the revised text, but should the first sentence read: "If, for a given property, both crs_wkt and grid mapping attributes exist, the attributes must be the same and grid mapping parameters should always be completed as fully as possible"? Also, is the second clause dependent on the first clause, or is it true in general that "grid mapping parameters should always be completed as fully as possible"? If it is generally true, I think the second clause should be its own sentence (and maybe it should appear elsewhere?). |
Minor tweak: For example, if the semi-major axis length of the ellipsoid |
@snowman2 My poor brain can't seem to detect what exactly was tweaked. Could you please point to the specific change made? |
Oh, I just saw the crossed out "is". Guess I mistook it for dust on my monitor. Sorry. |
I am OK with this (as amended). |
Thanks for correcting my typo, @snowman2. I agree with Karl @taylor13's second point. It is a general statement. However, I think we're introducing unnecessary repetition. I appreciate that the modified text is in the pull request, but our guidelines are that we should discuss it as far as possible in the issue, so there's only one place to look to see the discussion. So I'm repeating the whole of the paragraph and the previous one for context. I propose minor deletions for conciseness and to reduce repetition. How's this:
The crs_wkt attribute is intended to act as a supplement to other single-property CF grid mapping attributes (as described in Appendix F); it is not intended to replace those attributes. If data producers omit the single-property grid mapping attributes in favour of the compound crs_wkt attribute, software which cannot interpret crs_wkt will be unable to use the grid_mapping information. Therefore the CRS should be described as thoroughly as possible with the single-property attributes as well as by crs_wkt.
In cases where CRS property values can be represented by both a single-property grid mapping attribute and the crs_wkt attribute, the grid mapping should be provided, and if both are provided, the onus is on data producers to ensure that their property values are consistent. Therefore information from either one (or both) may be read in by the user without needing to check both. However, if the two values of a given property are different, the CRS information cannot be interpreted accurately and users should inform the provider so the issue can be addressed. For example, if the semi-major axis length of the ellipsoid is defined by the grid mapping attribute semi_major_axis disagrees with the crs_wkt attribute (via the WKT SPHEROID[…] element), the value of this attribute cannot be interpreted accurately. Naturally if the two values are equal then no ambiguity arises.
Jonathan |
Sounds like a reasonable change to me. Minor tweaks:
I am thinking the
I am thinking the |
Thanks, @snowman2. I agree with both of those changes. Please could you update your pull request so it's the same text as above (with those two changes)? If Karl @taylor13 and Philip @cameronsmith1 think that's OK still, we can count them as supporters, which means the proposal meets the conditions for acceptance. It will be accepted three weeks from now (3rd August) if there are no further concerns raised before then. |
Sounds good, just updated the PR (459e514). (Note: just edited commit hash). |
Yes, count me as a supporter. Thanks. |
Thanks, Karl
|
These changes look good to me. |
Thanks for assisting getting this proposal accepted @JonathanGregory 👍 |
Title: Allow CRS WKT to represent the CRS without requiring reader to compare with grid mapping parameters
Moderator: ???
Moderator Status Review [last updated: YY/MM/DD]: ???
Requirement Summary:
I propose the requirement be changed like so:
There will be occasions when a given CRS property value is duplicated in both a single-property grid mapping attribute and the crs_wkt attribute. In such cases the onus is on data producers to ensure that the property values are consistent. If both crs_wkt and grid mapping attributes exist, the attributes must be the same and grid mapping parameters should always be completed as fully as possible. As such, information from either one (or both) may be read in by the user without needing to check both. However, in those situations where the two values of a given property are different, the CRS information cannot be interpreted accurately and users should inform the provider so the issue can be addressed.
For example, if the semi-major axis length of the ellipsoid is defined by the grid mapping attribute semi_major_axis and also by the crs_wkt attribute (via the WKT SPHEROID[…] element), the value of this attribute cannot be interpreted accurately. Naturally if the two values are equal then no ambiguity arises.
Struck out from the current wording: ", then the value specified by the single-property attribute shall take precedence." and "then the former, being the more specific attribute, takes precedence."
Benefits:
Status Quo:
http://cfconventions.org/cf-conventions/cf-conventions.html#use-of-the-crs-well-known-text-format mentions