-
Notifications
You must be signed in to change notification settings - Fork 501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extract bounding box from NetCDF/HDF5 files and insert into the geospatial metadata block #9541
Conversation
insert the bounding box into the geospatial metadata block
I've been saying at standup that I need to tweak something for S3 but I was wrong. S3 works fine so in f5be4a8 I removed the FIXME and replaced it with some details. I'll note that @JR-1991 @atrisovic and I are still discussing a bit the getStandardLongitude method in this PR. I'm happy to discus this more with a reviewer. Here's an awesome diagram Jan made and posted: I'm putting this in "ready for review". |
return null; | ||
} | ||
if (westAsFloat > 180 || eastAsFloat > 180) { | ||
Float westStandard = westAsFloat - 360; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that if we're converting 0-360 to -180-180 then we need to subtract just 180 from the given value, not 360.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sekmiller I was also puzzled with this at the beginning, but subtracting 360° is the correct procedure. The math only tells half the truth, because the prime meridian (0° longitude) needs to stay at Greenwich and thus the interval's "cycle point" where both ends meet changes by 180°. Thus, this has to be attributed for, which leads to the 360°.
That's how I understood it so far, but this illustration may explain it better:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JR-1991 thanks! Even with that explanation, it's hard to keep straight in my head.
I was relying on a tool to check that maps are the same. That is these two URLs (minus 360) show the same map:
- https://linestrings.com/bbox/#343.68,41.8,353.78,49.62
- https://linestrings.com/bbox/#-16.320007,41.8,-6.220001,49.62
Here's a screenshot of the maps (and a bit more commentary) I originally posted at https://dataverse.zulipchat.com/#narrow/stream/376593-geospatial/topic/extract.20lat.2Flong.20.239331/near/348542658
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JR-1991 OK. Looking at the globe and the example location makes sense when the value is between 180 and 360, but what the code does here is if the value for either is over 180 it subtracts 360 from both so there is a potential for a situation where one val is > 180 and the other is less and in that case, say 170, and subtracting 360 results in -190 which is off the scale below -180.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sekmiller thanks for the review! Indeed subtracting from both would lead to a failure. I tested it visually and in the case of
I understand it in this way, that within this domain transformation only one half of the circle is changing. Note, in 0° - 360° domain the first half is the same as in the domain of (-180°) - 180° and thus there is a 1:1 relation if degrees are within 0-180. Yet, the other half does change to another range in a more drastic way. Tipping over 180° by a degree suddenly leads you to -179° which in the other domain corresponds to 181° - Hence the 360° provides a valid transformation for this case.
It's quite complicated though 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should add to the figure, that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to address all this in 690b8a4 just now and left this comment about it: #9541 (comment)
Maybe I should have resolve other conversation and gotten it down to just this one. Oh well. Please see the other comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read through the thread and agree with the comments! Instead of the lines 154-182, we can check each coordinate by itself and it would work well, ie:
Float westStandard = westAsFloat;
Float eastStandard = eastAsFloat;
if (westAsFloat > 180) {
westStandard = westAsFloat - 360;
}
if (eastAsFloat > 180){
eastStandard = eastAsFloat - 360;
}
WestAndEastLongitude updatedWeLong = new WestAndEastLongitude(westStandard.toString(), eastStandard.toString());
return updatedWeLong;
assertEquals(new WestAndEastLongitude("-16.320007", "-6.220001"), extractor.getStandardLongitude(new WestAndEastLongitude("343.68", "353.78"))); | ||
// One is over 180. Change | ||
assertEquals(new WestAndEastLongitude("-260.0", "-179.0"), extractor.getStandardLongitude(new WestAndEastLongitude("100", "181"))); | ||
// Both are under 180. No change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JR-1991 it seems to me that the test above should fail as the result of -260 is outside the range of -180 - 180.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree. I'll ping Jan for some input on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdurbin just for clarification, the right hand-side extractor transforms by subtracting 360° from both?
If so, then I have to agree, this test should fail and truly evaluate to
Given the proposed change in the calculation, the test should look like this:
assertEquals(
new WestAndEastLongitude("100.0", "-179.0"),
extractor.getStandardLongitude(new WestAndEastLongitude("100", "181"))
);
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's late on a Friday but I powered through and pushed some fixes around longitude at 2edb351.
Almost as importantly, I added docs! Rules in the User Guide about how users (and QA) should expect the code to behave. I'd love to hear what you (@JR-1991) and @atrisovic and @sekmiller (whom I've been bugging all day) think! Here they are:
- West Longitude and East Longitude are expected to be in the range of -180 and 180. (When using :ref:
geospatial-search
, you should use this range for longitude.) - If West Longitude and East Longitude are both over 180 (outside the expected -180:180 range), 360 will be subtracted to shift the values from the 0:360 range to the expected -180:180 range.
- If either West Longitude or East Longitude are less than zero but the other latitude is greater than 180 (which would imply an indeterminate domain, a lack of clarity of if the domain is -180:180 or 0:360), metadata will be not be extracted.
I also added more tests. I'm happy to add more if you all think of more edge cases. I don't have real NetCDF files to test with that have wacky values. It might be a challenge for QA to find NetCDF files to exercise all the rules.
This comment has been minimized.
This comment has been minimized.
It seems to be working! \o/
@@ -1531,72 +1654,71 @@ private void processDatasetMetadata(FileMetadataIngest fileMetadataIngest, Datas | |||
// create a new compound field value and its child | |||
// | |||
DatasetFieldCompoundValue compoundDsfv = new DatasetFieldCompoundValue(); | |||
int nonEmptyFields = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whitespace change and below is from me removing this code and then (just) re-adding it because it seems to be working afterall! My test still passes at least. 😅
I'm ready for more review so I'm removing myself from this PR. Here are the changes since the last round: 1fb2798 add API test for FITS file metadata extraction #9331 Yes, I was convince the code to extract from compound fields wasn't working. I put it back and now it works! Well, it was always working, I suppose. Now I'm feeding it the data it expects! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. thanks for the changes
This comment has been minimized.
This comment has been minimized.
return null; | ||
} | ||
if (westAsFloat > 180 || eastAsFloat > 180) { | ||
Float westStandard = westAsFloat - 360; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read through the thread and agree with the comments! Instead of the lines 154-182, we can check each coordinate by itself and it would work well, ie:
Float westStandard = westAsFloat;
Float eastStandard = eastAsFloat;
if (westAsFloat > 180) {
westStandard = westAsFloat - 360;
}
if (eastAsFloat > 180){
eastStandard = eastAsFloat - 360;
}
WestAndEastLongitude updatedWeLong = new WestAndEastLongitude(westStandard.toString(), eastStandard.toString());
return updatedWeLong;
Issues found:
![]() ![]()
|
Sadly, this is a known issue. It's in "sprint ready", at least: The fix will need to happen over in the previewers repo so there's an issue to track it there: |
📦 Pushed preview application image as
🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name. |
@@ -367,7 +369,20 @@ public List<DataFile> saveAndAddFilesToDataset(DatasetVersion version, | |||
} else { | |||
logger.fine("Failed to extract indexable metadata from file " + fileName); | |||
} | |||
} else if (FileUtil.MIME_TYPE_INGESTED_FILE.equals(dataFile.getContentType())) { | |||
} else if (fileMetadataExtractableFromNetcdf(dataFile, tempLocationPath)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks direct uploads for me. When using a direct upload, the file is unattached
and the owner is set to 'null' on line 336. However, 'fileMetadataExtractableFromNetcdf' needs the owner dataset to determine the folder inside storageIO.open();
. This results in a NullPointerException
, preventing adding the files to the dataset. I have only tested with the local filesystem, but from looking at the code, this probably also breaks uploading files directly to S3. Can you investigate and fix it? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ErykKul yikes! This has already been merged. 😬 Could you please create an issue about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reproduced this in develop: direct file upload via UI appears to upload correctly, ie the progress bar completes and the file is present in the uploaded file list but fails on save with a grayed out screen as if waiting for a completion signal and a server log error:
diredct_upload_ui_err.txt
Note that direct file upload continues to work via API. Also, deleting the file continues to work via ui.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can reproduce it too. I created an issue:
Thanks, both!
insert the bounding box into the geospatial metadata block
What this PR does / why we need it:
Prevent users from having to manually enter bounding box values. As with FITS, we get the values from the files themselves.
Which issue(s) this PR closes:
Special notes for your reviewer:
I tried pretty hard to get the compound field code working but ended up switching to something that works,I switched it back in 189c774.jsonParser.parseField
which is also used by the native API.If the longitude "domain" is 0 to 360 instead of -180 to 180, we convert it. See also this issue:
Suggestions on how to test this:
Upload various NetCDF files, especially ones that have a bounding box. Here's a nice small one with a bounding box that the code tests: https://www.ncei.noaa.gov/data/international-comprehensive-ocean-atmosphere/v3/archive/nrt/ICOADS_R3.0.0_1662-10.nc
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Not really. The bounding box is populated. Here's what it looks like when you are just looking at (not editing) a dataset:
It seems a little weird that we don't show any labels (north, south, east, west). There's an open issue for this at #6589 but it's out of scope.
The values look more reasonable when you are editing:
Update: this seems like an easy fix so I made a PR:
Is there a release notes update needed for this change?:
Yes, included.
Additional documentation:
Included. I updated the User Guide.