LRU Caching for NetCDF lumped format timeslice chunks #421
Conversation
…ike 10 should be sufficient but this causes horrible thrashing.
Only some superficial reading of the code so far. One note on the commit "Tweaks to allow creation from either "standard" CSV format": on line 29, should the second parameter be 1 instead of 2, as noted somewhere in the C++ code? |
One comment I have involves NetCDFPerFeatureDataProvider.hpp lines 401 - 424. In particular, I don't fully understand line 422. I assume i iterates through cache slices, while j varies within a slice? Then for raw_values[i+j] the indices i and j would be interchangeable, i.e., (i=0, j=1) and (i=1, j=0) would give the same index. Would this potentially mix up different slices? In the current setup, since cache_slice_t_size = 1, this won't have any effect. Is this going to be generalized? Are the comment lines between lines 67 - 69, and 74 - 75 still necessary? One other question doesn't have anything to do with the current code, but with the format of the netcdf files: the original AORC netcdf forcing files' attributes contain the scale_factor and offset. If we deal with that in pre-processing, then we don't need to deal with it here. Otherwise, it needs to be considered. |
Yes, this is probably flawed right now for any case where |
No, but they might be needed at larger scales. Left in for reference. |
Good question... my assumption was that the library handles these attributes, but that may be incorrect. *If* we support those in these files, we would need to test that... however these files have float values, so those metadata scale values are far less useful. |
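The indexing concern above can be sketched briefly. The names here are illustrative stand-ins, not the actual members in NetCDFPerFeatureDataProvider.hpp: with several cached slices laid out back-to-back in one flat buffer, the flat index needs the slice stride, and a plain sum i+j collides exactly as described.

```python
# Sketch of the raw_values[i+j] question: with cache slices of length
# slice_size stored back-to-back in one flat buffer, the flat index for
# (slice i, offset j) must use the stride, not a plain sum.
# Names are hypothetical, not the actual members from the PR.

slice_size = 3  # hypothetical cache_slice_t_size > 1

def flat_index_plain_sum(i, j):
    return i + j  # ambiguous: (0, 1) and (1, 0) both map to 1

def flat_index_strided(i, j):
    return i * slice_size + j  # unique for each (i, j) with j < slice_size

assert flat_index_plain_sum(0, 1) == flat_index_plain_sum(1, 0)  # collision
assert flat_index_strided(0, 1) != flat_index_strided(1, 0)      # no collision
# When slice_size == 1, j is always 0 and both formulas agree, which is
# why the current setup (cache_slice_t_size = 1) is unaffected.
```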
This question comes up because in my ESMPy-based code the output netcdf files copy the original attributes. If we don't need this, I'll need to modify the output netcdf file attributes.
Comment already provided in PR.
Those attributes should only be copied if the values in the NetCDF need to be scaled and offset... do they? Are the values copied into the new NetCDF (floats) already scaled and offset? Or are they the same values from the original NetCDF before they are scaled and offset? |
I believe scale and offset are used to convert integer-encoded float back into float automatically. If we are getting floating point values for those fields, then these values have already been applied. We would only need to duplicate the values if we were also storing as integer-encoded float. Also, the correct scale and offset values would be different for each field in any case.
If you just read the netcdf file, the scale factor and offset are handled
automatically, based on my reading. So you should have the correct values
when you read in the file, unless you do anything else.
|
Also, if you copy the attributes including scale_factor and offset, the output values in the netcdf file also take these into account automatically.
|
@donaldwj @stcui007 Here's what's in the latest file example, via ncdump:
float T2D(catchment-id, time) ;
T2D:_FillValue = -99999.f ;
T2D:missing_value = -32767.f ;
T2D:long_name = "Temperature" ;
T2D:short_name = "TMP_2maboveground" ;
T2D:scale_factor = 0.1f ;
T2D:units = "K" ;
T2D:level = "2 m above ground" ;
and
T2D =
{2930.423, 2920.427, 2910.339, 2900.349, 2894.427, 2888.816, 2882.97,
2879.969, 2876.969, 2873.97},
{2951.988, 2938.251, 2924.16, 2910.149, 2898.901, 2887.482, 2875.898,
2871.486, 2866.829, 2862.471},
First, is this valid, then? Will these floats be rescaled by the library when read? Second, I don't think that's what we want, if they're floats. There's no cost to storing the values without scaling if they're floats, I don't think--it's only a source of confusion. |
It is correct. If you look at the attributes, T2D has a scale factor of 0.1. If I copy the original attributes where there is a scale factor, the library automatically applies it. If we don't want that, we need to change the attributes. |
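For context on the exchange above: under the NetCDF/CF packing convention, readers that honor the attributes compute unpacked = packed * scale_factor + add_offset on read (libraries such as netCDF4-python typically do this by default, even for float data). A rough check against the first stored T2D value in the sample:

```python
# NetCDF/CF packing convention: unpacked = packed * scale_factor + add_offset.
# Applying the T2D scale_factor of 0.1 from the attributes above to the first
# stored value; this file has no add_offset attribute, so none is added.
scale_factor = 0.1
stored = 2930.423
unpacked = stored * scale_factor
assert abs(unpacked - 293.0423) < 1e-6  # a plausible 2 m temperature in K
```

Which is why a reader honoring scale_factor would return ~293 K here, supporting the point that the stored floats are pre-scaled values rather than final ones.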
Caches slices of data in NetCDFPerFeatureDataProvider for performance improvement.
Additions
- cat-52 and cat-67 in addition to cat-27, and a realization config for testing all three with the NetCDF lumped forcing sample file.
Removals
Changes
- … Time field) and extracts catchment ID more intelligently.
Testing
- NetCDFPerFeatureDataProvider tests.
Screenshots
Notes
Todos
Checklist
- Testing checklist (automated report can be put here)
- Target Environment support
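As a rough illustration of the timeslice caching approach this PR describes (all names hypothetical, not the actual C++ implementation), an LRU cache keyed by slice index can be sketched with an OrderedDict; it also shows the thrashing behavior mentioned in the discussion when the cache is sized too small for the access pattern:

```python
from collections import OrderedDict

# Illustrative LRU cache of timeslice chunks, keyed by slice index.
# load_slice and max_slices are hypothetical names, not from the PR.

class TimesliceLRU:
    def __init__(self, load_slice, max_slices):
        self.load_slice = load_slice  # e.g. reads one timeslice from NetCDF
        self.max_slices = max_slices
        self.cache = OrderedDict()    # slice_index -> slice data

    def get(self, slice_index):
        if slice_index in self.cache:
            self.cache.move_to_end(slice_index)  # mark most recently used
            return self.cache[slice_index]
        data = self.load_slice(slice_index)
        self.cache[slice_index] = data
        if len(self.cache) > self.max_slices:
            self.cache.popitem(last=False)       # evict least recently used
        return data

# A cache smaller than the working set thrashes: evicted slices are
# re-read on the next access, matching the "horrible thrashing" note
# in the commit history.
loads = []
lru = TimesliceLRU(lambda i: loads.append(i) or i, max_slices=2)
lru.get(0); lru.get(1); lru.get(0)  # third call hits the cache
assert loads == [0, 1]
lru.get(2)                          # evicts slice 1 (least recently used)
lru.get(1)                          # misses and re-reads
assert loads == [0, 1, 2, 1]
```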