bug fix with negative test for dual DH operating on data handlers that did not load from cache #179

Merged
merged 19 commits into main from gb/ddh_stats_bug on Nov 28, 2023

Conversation

grantbuster (Member)

grantbuster requested a review from bnb32 on November 22, 2023 17:25
grantbuster marked this pull request as ready for review on November 22, 2023 20:51
@bnb32 (Collaborator) left a comment

I think instead of requiring the lr_handler cache to be loaded, we could move the stats-related methods from BaseDataHandler into the TrainingPrepMixIn class so they're inherited by the dual handler, and then have the methods operate directly on lr_data/lr_val_data (with .data/.val_data aliases).

For example, the DualDataHandler._get_stats method could be:

super()._get_stats()
self.hr_dh._get_stats()

where super()._get_stats() will define self._means and self._stds.

and the normalize method could be:

super().normalize(...)
self.hr_dh.normalize(...)

The .means property could be:

out = copy.deepcopy(self.hr_dh.means)
out.update(self._means)
return out

etc.

Maybe I'm missing it, but I don't see lr_data being normalized, just lr_handler.data, and lr_data is what gets sampled by the get_next method.
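
To make the shape of this suggestion concrete, here is a minimal standalone sketch of the proposed inheritance pattern. None of these classes are the real sup3r implementations; the class names, data shapes, and feature names are made up for illustration, and the actual TrainingPrepMixIn/DataHandler APIs will differ:

```python
import copy

import numpy as np


class TrainingPrepMixIn:
    """Hypothetical home for the shared stats logic."""

    def _get_stats(self):
        # Per-feature means/stds over this handler's own data array,
        # with the last axis indexing features.
        self._means = {f: float(self.data[..., i].mean())
                       for i, f in enumerate(self.features)}
        self._stds = {f: float(self.data[..., i].std())
                      for i, f in enumerate(self.features)}


class SingleHandlerStub(TrainingPrepMixIn):
    """Stand-in for a single-resolution DataHandler."""

    def __init__(self, data, features):
        self.data = data
        self.features = features


class DualHandlerStub(TrainingPrepMixIn):
    """Stand-in for DualDataHandler, with .data aliasing lr_data."""

    def __init__(self, lr_data, hr_dh, features):
        self.data = lr_data      # alias, so the mixin operates on lr_data
        self.features = features
        self.hr_dh = hr_dh

    def _get_stats(self):
        super()._get_stats()     # defines self._means / self._stds
        self.hr_dh._get_stats()  # high-res handler computes its own stats

    @property
    def means(self):
        # High-res stats first, overridden by low-res where features overlap.
        out = copy.deepcopy(self.hr_dh._means)
        out.update(self._means)
        return out


hr = SingleHandlerStub(np.random.rand(20, 20, 8, 2), ['u_100m', 'v_100m'])
dual = DualHandlerStub(np.random.rand(10, 10, 8, 2), hr, ['u_100m', 'v_100m'])
dual._get_stats()
print(dual.means)  # merged dict of per-feature means
```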

grantbuster (Member Author)

@bnb32 GO ENJOY YOUR DAY OFF

@@ -253,18 +251,15 @@ def normalize(self, means=None, stds=None, max_workers=None):
super().normalize(means=means, stds=stds,
features=self.lr_dh.features,
max_workers=self.lr_dh.norm_workers)
self.lr_dh.normalize(means=means, stds=stds,
Collaborator

Why do we need this? We should be able to compute stats without loading lr_dh.data.

Member Author


If the lr data handler data is loaded, I definitely want to normalize that data too. It's really confusing having copies/views of these data floating around with different norm factors.

Collaborator


Ok but you're forcing it to be loaded a few lines above.

Member Author


I just fixed that exception raise clause in the last commit (I think this is what you're talking about, and I agree it was leftover from before).

Collaborator


Ok nice. Works for me!
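
One way to get the behavior described in this thread, normalizing lr_dh.data when it happens to be loaded without requiring a cache load, is a guard like the one sketched below. The stub classes are not the real sup3r handlers and the merged code may structure the guard differently; this is just a sketch of the control flow:

```python
import numpy as np


class LowResHandlerStub:
    """Stand-in for the wrapped low-res DataHandler; .data stays None when
    nothing was loaded from cache."""

    def __init__(self, data=None):
        self.data = data

    def normalize(self, means, stds):
        self.data = (self.data - means) / stds


class DualNormalizeStub:
    """Stand-in for the dual handler's normalize() control flow."""

    def __init__(self, lr_data, lr_dh):
        self.lr_data = lr_data
        self.lr_dh = lr_dh

    def normalize(self, means, stds):
        self.lr_data = (self.lr_data - means) / stds
        if self.lr_dh.data is not None:   # don't force a cache load
            self.lr_dh.normalize(means, stds)


# Works whether or not the low-res handler has data loaded.
dual = DualNormalizeStub(np.random.rand(5, 5, 3), LowResHandlerStub(data=None))
dual.normalize(means=0.5, stds=0.1)
```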

'DataHandler.data=None! Try initializing DualDataHandler '
'with load_cached=True')
if self.hr_dh.data is None:
msg = ('High-res DataHandler object has DataHandler.data=None! '
Collaborator


Maybe this should say try initializing the high-res handler with load_cached=True.

Member Author


updated
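
For reference, the guard under discussion boils down to something like the sketch below. The helper name and the exact message wording are placeholders following the review suggestion; the merged commit may differ:

```python
def check_hr_data_loaded(hr_dh):
    """Hypothetical helper: fail fast with a hint about load_cached=True if
    the wrapped high-res DataHandler has no data loaded."""
    if hr_dh.data is None:
        msg = ('High-res DataHandler object has DataHandler.data=None! '
               'Try initializing the high-res DataHandler with '
               'load_cached=True.')
        raise RuntimeError(msg)
```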

@@ -113,7 +113,7 @@ def get_data(self):
"""Check hr and lr shapes and trim hr data if needed to match required
relationship to lr shape based on enhancement factors. Then regrid lr
data and split hr and lr data into training and validation sets."""
self._shape_check()
self._set_hr_data()
Collaborator


Much better name

Member Author


this confused me so much hahaha

Collaborator


Lol yeah it's a confusing method for sure

Member Author


Honestly, just the name was confusing. I kept trying to figure out where hr_data was being set, and it took me a while to realize this method did that.

std_arr = np.array([stds[fn] for fn in self.lr_dh.features])
self.lr_data = (self.lr_data - mean_arr) / std_arr
self.lr_data = self.lr_data.astype(np.float32)

if id(self.hr_data.base) != id(self.hr_dh.data):
Collaborator


Should we actually alias hr_dh.data with hr_data so the latter is updated with normalization?

Member Author


At this point I'd prefer not to... The two variables deviate if you set val_data, so it would get confusing with an alias (or it wouldn't work), and if you don't set val_data then they always have the same base array and this won't get run.

Collaborator


Well, I think it would just get updated after val_data is set, since the val_split method just updates the hr_dh.data attribute, but I agree it might be confusing. There's also the option of just using the _normalize method to directly normalize hr_data, but since you want everything normalized, maybe this current way makes the most sense.
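
For context on the id(self.hr_data.base) != id(self.hr_dh.data) check in the diff above: NumPy views keep a reference to the array they were sliced from in their .base attribute, so the identity comparison is a cheap way to tell whether hr_data is still a view of hr_dh.data (already normalized in place) or an independent copy that needs its own normalization. A standalone illustration in plain NumPy (not sup3r code; shapes are arbitrary):

```python
import numpy as np

hr_dh_data = np.random.rand(4, 4, 6, 2)      # stand-in for hr_dh.data

# Basic slicing returns a view whose .base is the original array, so the
# identity check passes and no second normalization would be needed.
hr_data_view = hr_dh_data[:, :, :5, :]
print(hr_data_view.base is hr_dh_data)            # True

# A copy (e.g. after a train/val split that copies) has no tie to the
# original, so the guard fires and hr_data must be normalized separately.
hr_data_copy = hr_dh_data[:, :, :5, :].copy()
print(id(hr_data_copy.base) != id(hr_dh_data))    # True -> normalize again
```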

grantbuster merged commit 1c6ad1e into main on Nov 28, 2023
8 checks passed
grantbuster deleted the gb/ddh_stats_bug branch on November 28, 2023 15:41
github-actions bot pushed a commit that referenced this pull request Nov 28, 2023
bug fix with negative test for dual DH operating on data handlers that did not load from cache