Daymet (and other dataset) subsetting performance #91

Open
stuckyb opened this issue Aug 24, 2022 · 1 comment
stuckyb commented Aug 24, 2022

Daymet dataset subsetting for monthly data is currently implemented such that the entire dataset for a year is clipped, the result is cached, and individual months are then returned from the cached result. This does not appear to be working as expected: requests for clipped monthly data either run indefinitely or crash.
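For anyone following along, the clip-once-then-cache strategy described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `clip_year`, `clipped_year`, and `get_month` are hypothetical names, and the "clipped data" is a plain dict standing in for a real raster.

```python
from functools import lru_cache


def clip_year(year, region):
    """Hypothetical stand-in for the expensive step: clip a full year of data.

    A real implementation would open the dataset and clip it to the region;
    here a dict keyed by month number is returned so the sketch stays runnable.
    """
    clip_year.calls += 1  # count invocations to show the cache is effective
    return {month: f"{region}:{year}-{month:02d}" for month in range(1, 13)}


clip_year.calls = 0


@lru_cache(maxsize=8)
def clipped_year(year, region):
    """Cache the clipped year so repeated month requests reuse one clip."""
    return clip_year(year, region)


def get_month(year, month, region):
    """Return one month from the cached, clipped year."""
    return clipped_year(year, region)[month]
```

With this layout, requesting January and then February of the same year and region triggers only one call to `clip_year`. The `region` argument must be hashable for `lru_cache` to work, so a real version would need a hashable key for the clip geometry (an assumption of this sketch).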

@stuckyb stuckyb added the bug Something isn't working label Aug 24, 2022
@stuckyb stuckyb self-assigned this Aug 24, 2022
stuckyb commented Aug 24, 2022

After some investigation, the performance problems appear to be caused by rioxarray attempting to load the entire dataset into memory before subsetting. The from_disk parameter of rioxarray's clip operation should address this, but it appears not to work for NetCDF files. Turning that parameter on and switching to the TIFF versions of the data files fixes the problem. This is implemented in c05b49a. I am not closing this issue, though, because the question of the optimal subsetting strategy still requires further investigation.
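As a rough sketch of the kind of call involved (not the code in c05b49a, which is not reproduced here), clipping with `from_disk=True` might look like the following. `rioxarray.open_rasterio` and the `from_disk` argument of `rio.clip` are part of rioxarray; the helper names `clip_from_disk` and `clip_tiff`, and the idea of passing a TIFF path, are assumptions made for illustration.

```python
def clip_from_disk(raster, geometries, crs):
    """Clip an already-opened raster to the given geometries.

    from_disk=True asks rioxarray to clip against the data on disk
    rather than loading the whole array into memory first.
    """
    return raster.rio.clip(geometries, crs=crs, from_disk=True)


def clip_tiff(path, geometries, crs):
    """Hypothetical helper: open a TIFF file and clip it from disk."""
    # Third-party import kept local so clip_from_disk can be used
    # and tested without the geospatial stack installed.
    import rioxarray

    raster = rioxarray.open_rasterio(path)
    return clip_from_disk(raster, geometries, crs)
```

Here `geometries` would be a list of GeoJSON-like geometry dicts and `crs` the coordinate reference system they are expressed in, for example `"EPSG:4326"`.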
