feature request: missing files #374
Comments
👋 @ghislainp, thanks for this question. Are you able to determine which specific dates are missing prior to writing the recipe? If so, you could filter the file list to drop those dates before the recipe is executed. This is probably the easiest way to handle this case at the moment. Automatically skipping over missing dates during recipe execution is not currently supported, though that would certainly be worth aiming for eventually.
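A minimal sketch of that filtering step, using only the Python standard library. The URL template, the date range, and the list of missing dates are all made up for illustration; none of them come from this thread, and the helper names are hypothetical rather than part of pangeo-forge-recipes.

```python
from datetime import date, timedelta

# Hypothetical URL template; the real dataset's layout is not given in this thread.
URL_TEMPLATE = "https://example.com/data/{day:%Y%m%d}.nc"


def daily_range(start, end):
    """Return every date from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]


def existing_urls(start, end, missing):
    """Build one URL per day, skipping dates known to be missing."""
    missing = set(missing)
    return [
        URL_TEMPLATE.format(day=d)
        for d in daily_range(start, end)
        if d not in missing
    ]


# Made-up example: two days of a one-week period have no file.
missing_dates = [date(2020, 1, 3), date(2020, 1, 5)]
urls = existing_urls(date(2020, 1, 1), date(2020, 1, 7), missing_dates)
```

The resulting `urls` list contains only the days that have a file and could then be handed to whatever file pattern the recipe uses.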
I could, but the resulting output data would not be regular in time if some dates are skipped. Is it possible to re-align the dataset after the concatenation, before writing the Zarr store? I assume this would use the process_chunk function, but the output of process_chunk would be larger than its input, and what would happen if the missing date falls between two chunks?
I see. If I understand correctly, you would ideally like arrays of NaNs (or some other filler value) in place of the empty dates, so that the dataset chunking remains correctly aligned despite the missing dates? To the best of my knowledge, this is not currently possible (at least, not without some seriously hacky maneuvers), but the ongoing work to resolve #256, which is a current priority, would probably make it much more feasible. I'll be curious to know if @rabernat agrees with this assessment or if I've overlooked something.
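To make the desired outcome concrete, here is a small sketch of what filling missing dates with NaN means. It is a plain-Python stand-in that works on a whole series after concatenation, not a pangeo-forge-recipes feature, and it does not address the chunk-boundary concern inside a recipe. The function name and sample values are hypothetical. On an xarray dataset, reindexing along the time dimension onto a complete date range would be the comparable operation.

```python
import math
from datetime import date, timedelta


def realign_daily(values_by_date, start, end, fill=math.nan):
    """Place values on a complete daily axis, using `fill` for absent days."""
    n_days = (end - start).days + 1
    axis = [start + timedelta(days=i) for i in range(n_days)]
    return axis, [values_by_date.get(d, fill) for d in axis]


# Made-up observations: 3 January has no file.
observed = {
    date(2020, 1, 1): 1.0,
    date(2020, 1, 2): 2.0,
    date(2020, 1, 4): 4.0,
}
axis, values = realign_daily(observed, date(2020, 1, 1), date(2020, 1, 4))
```

After this step the time axis is regular, with one entry per day, and the gap is represented by the filler value rather than by a skipped index.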
Noting that pangeo-forge/cesm-atm-025deg-feedstock#2 would benefit from a similar feature (failing gracefully in the case of missing files).
I have a dataset with one file per day, but some files are missing. Is there a way to deal with this case programmatically? For instance, a function like process_input that would be called when a file is missing, perhaps named process_missing?