Add Support for NOAA GFS AWS/NCAR Model Support #78

Merged: 22 commits merged on Dec 19, 2023

jacobbieker
Member

@jacobbieker jacobbieker commented Dec 15, 2023

Pull Request

Description

This adds support for pulling Global Forecast System (GFS) forecasts from AWS (for data since Feb 2021) or NCAR (for archives back to 2015). AWS files are hourly for the first 120 hours, then 3-hourly, while NCAR files are 3-hourly only. Otherwise, the parameters and files are identical.
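
As a rough illustration of the step cadence described above, the forecast hours for each source could be generated like this. This is only a sketch, not the code in this PR: the function name is hypothetical, and the 384-hour forecast horizon is an assumption that is not stated in this description.

```python
def gfs_forecast_steps(source: str, max_hour: int = 384) -> list[int]:
    """Return the forecast hours expected for one GFS init time.

    `source` is "aws" or "ncar". `max_hour` is an assumed forecast
    horizon (not taken from the PR description).
    """
    if source == "ncar":
        # NCAR archive: one file every 3 hours for the whole run.
        return list(range(0, max_hour + 1, 3))
    if source == "aws":
        # AWS: hourly for the first 120 hours, then every 3 hours.
        hourly = list(range(0, min(120, max_hour) + 1))
        three_hourly = list(range(123, max_hour + 1, 3))
        return hourly + three_hourly
    raise ValueError(f"unknown source: {source!r}")
```

For example, `gfs_forecast_steps("aws", max_hour=48)` gives every hour from 0 to 48, while `gfs_forecast_steps("ncar", max_hour=48)` gives 0, 3, 6, and so on up to 48.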

Relates to #12

How Has This Been Tested?

Unit tests

  • Yes

Checklist:

  • My code follows OCF's coding style guidelines
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked my code and corrected any misspellings

@jacobbieker jacobbieker self-assigned this Dec 15, 2023
@jacobbieker jacobbieker added the enhancement (New feature or request) label Dec 15, 2023
@jacobbieker
Member Author

I'm also planning on adding AWS Open Dataset support now. They seem to have hourly forecasts going back to Feb 2021, so this could potentially replace NCEP.

@jacobbieker
Member Author

@devsjc for the tests, GFS GRIB files are one file per timestep and fairly huge (~500 MB), so I don't particularly want to include them in the repo. Older ones are smaller at around 300 MB, but still too large for Git. Should the tests just pull in the data from AWS? Or is there a different way you would prefer?

@jacobbieker jacobbieker marked this pull request as ready for review December 19, 2023 11:51
@jacobbieker jacobbieker requested a review from devsjc December 19, 2023 11:51
@jacobbieker
Member Author

I'm working on adding the integration tests, but the rest of the actual processing is ready to go, and the unit tests pass locally with local GRIB files.

@jacobbieker jacobbieker changed the title from "Add Support for NOAA GFS NCEP/NCAR Model Support" to "Add Support for NOAA GFS AWS/NCAR Model Support" Dec 19, 2023
@jacobbieker jacobbieker mentioned this pull request Dec 19, 2023
10 tasks
@jacobbieker
Member Author

One more thing that's come up: for GFS, the step 0 forecast has a few extra variables and is missing many variables compared to the other steps. The missing ones are primarily accumulated variables, but there are others as well. As each mapTemp call takes a single path to process, I don't think I can have it correctly add zeroed-out values for those data variables, because the shapes don't match. I think the change should happen in service.py: replace the xr.merge call with logic that still attempts the merge for everything and, only if that fails, fills in the missing variables with the correct shape in the step 0 dataset. Does that sound alright?
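
To make the idea concrete, here is a minimal sketch of the fill step only, not the actual service.py change. The function name, the "step" dimension name, and the choice to drop variables that exist only at step 0 are all assumptions made for illustration. It uses zeros for the padded variables, as suggested above.

```python
import xarray as xr


def merge_with_step0_fill(step0: xr.Dataset, others: list[xr.Dataset]) -> xr.Dataset:
    """Combine per-step datasets along "step", padding step 0 where needed.

    Assumes every dataset has a length-1 "step" dimension and that all
    datasets share the same spatial coordinates.
    """
    rest = xr.concat(others, dim="step")

    # Variables present at later steps but absent at step 0
    # (for example accumulated fields).
    missing = [name for name in rest.data_vars if name not in step0.data_vars]
    # Variables that only exist at step 0. Dropping them is an assumption
    # made here so that every step ends up with the same set of variables.
    extra = [name for name in step0.data_vars if name not in rest.data_vars]

    padded = step0.drop_vars(extra)
    for name in missing:
        # Borrow the shape and coordinates from a later step, zero the
        # values, and attach the step 0 coordinate.
        template = rest[name].isel(step=0, drop=True)
        padded[name] = xr.zeros_like(template).expand_dims(step=step0["step"].values)

    return xr.concat([padded, rest], dim="step")
```

Whether zeros or NaN is the better fill value for a variable that is undefined at step 0 is a separate design question; zeros are used here only because that is what the comment above proposes.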

src/nwp_consumer/internal/inputs/noaa/_consts.py (outdated, resolved)
src/nwp_consumer/internal/inputs/noaa/aws.py (outdated, resolved)
src/nwp_consumer/internal/inputs/noaa/aws.py (resolved)
src/nwp_consumer/internal/inputs/noaa/aws.py (outdated, resolved)
src/nwp_consumer/internal/inputs/noaa/ncar.py (outdated, resolved)
src/nwp_consumer/internal/inputs/noaa/ncar.py (resolved)
src/nwp_consumer/internal/inputs/noaa/aws.py (resolved)
src/nwp_consumer/internal/service/service.py (resolved)
@devsjc
Collaborator

devsjc commented Dec 19, 2023

@devsjc for the tests, GFS grib files are one file per timestep, and fairly huge (~500mb), so I don't particularly want to include it in the repo. Older ones are smaller at around 300mb, but still too large for Git. Should the test just pull in the data from AWS? Or is there a different way you would prefer?

I'd use grib tools to cull the number of variables in the test files right down to 2 to reduce their size!

@jacobbieker
Member Author

Turns out NOMADS has a data subsetter service, https://nomads.ncep.noaa.gov/gribfilter.php?ds=gfs_0p25, so I used that to get a file with 1 variable for each of the 3 different types of levels assumed in the data. It might be worth adding support for NOMADS GFS later: AWS has a more comprehensive archive of live hourly forecasts, but NOMADS subsetting is nice.
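
For anyone wanting to regenerate a small test file, a subset request can be built along these lines. Treat this as a sketch under assumptions: the endpoint path and the `dir`, `file`, `var_*` and `lev_*` query parameters are written from memory of the URLs the web form generates and should be checked against the form itself, the helper name is hypothetical, and the variable and level in the usage note are examples rather than the ones used for the test files in this PR.

```python
from urllib.parse import urlencode

# Assumed endpoint behind the gribfilter.php form for 0.25 degree GFS.
NOMADS_FILTER_URL = "https://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_0p25.pl"


def build_gribfilter_url(
    date: str, cycle: int, step: int, variables: list[str], levels: list[str]
) -> str:
    """Build a subset URL for one GFS file (hypothetical helper).

    `date` is YYYYMMDD, `cycle` is the init hour, `step` is the forecast hour.
    """
    params = [
        ("dir", f"/gfs.{date}/{cycle:02d}/atmos"),
        ("file", f"gfs.t{cycle:02d}z.pgrb2.0p25.f{step:03d}"),
    ]
    # Each requested variable and level is switched on with its own flag.
    params += [(f"var_{name}", "on") for name in variables]
    params += [(f"lev_{name}", "on") for name in levels]
    return f"{NOMADS_FILTER_URL}?{urlencode(params)}"
```

Calling `build_gribfilter_url("20231219", 0, 3, ["TMP"], ["2_m_above_ground"])` only builds the URL string; it does not download anything. NOMADS keeps just the most recent runs, so the date has to be a current one when the URL is actually fetched.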

@jacobbieker jacobbieker requested a review from devsjc December 19, 2023 14:32
@jacobbieker jacobbieker merged commit b268cae into main Dec 19, 2023
10 checks passed
@jacobbieker jacobbieker deleted the jacob/noaa branch December 19, 2023 15:02
2 participants