
Gb/solar experiment #250

Merged
merged 15 commits into main from gb/solar_experiment
Dec 26, 2024

Conversation

grantbuster
Member

@grantbuster grantbuster commented Dec 19, 2024

Mostly changing the loss function for the SolarCC model so that all timesteps are encouraged to somewhat match the low-res inputs, while pointwise loss is applied only to a few hours in the middle of the day, with continuity provided by the adversarial loss.

Added a few other edits that are not solar related:

- Weird bug in extended rasterizer: https://github.com/NREL/sup3r/pull/250/files#diff-67ba2e9078e12e55874258e42c6b99cf4f81647358b5eddc64c3dc8626dda45fR200
- Added lat/lon as features for geospatial encoding: https://github.com/NREL/sup3r/pull/250/files#diff-027a96c03ce10528eb3c5bec99820feedb7150a5e28036caf7050c86a9d09621R394-R411
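The lat/lon geospatial-encoding idea can be sketched as turning the coordinate grids into normalized feature channels. This is a hypothetical illustration with toy data; the function name and scaling are assumptions, not the sup3r API:

```python
import numpy as np

def latlon_feature_channels(lat_lon):
    """Turn a (rows, cols, 2) lat/lon grid into two normalized feature
    channels for geospatial encoding.

    lat_lon[..., 0] is latitude in degrees, lat_lon[..., 1] is
    longitude in degrees; both are scaled to roughly [-1, 1].
    """
    lat = lat_lon[..., 0] / 90.0
    lon = lat_lon[..., 1] / 180.0
    return np.stack([lat, lon], axis=-1)

# toy 2x2 grid near the target used in the draft test below
grid = np.array([[[39.0, -105.0], [39.0, -104.9]],
                 [[38.9, -105.0], [38.9, -104.9]]])
feats = latlon_feature_channels(grid)
```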

@grantbuster grantbuster requested a review from bnb32 December 19, 2024 17:24
@@ -402,6 +418,8 @@ def compute(cls, data):
'cloud_mask': CloudMask,
'clearsky_ratio': ClearSkyRatio,
'sza': Sza,
'latitude_feature': LatitudeFeature,
Collaborator

You can just do 'latitude_feature': 'latitude' here. The data handler can already get 'latitude' through dh.data; it just isn't seen in dh.features because dh.features is defined as list(dh.data.data_vars) and 'latitude' is in dh.data.coords.
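The data_vars/coords distinction described here can be illustrated with a minimal xarray dataset (toy data; the variable names are illustrative):

```python
import numpy as np
import xarray as xr

# 'sza' is a data variable; 'latitude' is a (non-dimension) coordinate
ds = xr.Dataset(
    data_vars={'sza': (('south_north', 'west_east'), np.zeros((2, 2)))},
    coords={'latitude': (('south_north', 'west_east'),
                         np.array([[39.0, 39.0], [38.9, 38.9]]))},
)

# A feature list built from data_vars alone won't include 'latitude',
# even though the dataset can serve it up directly:
features = list(ds.data_vars)
latitude = ds['latitude']
```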

Member Author

oh nice, done here: 9f5dfcd

    if hasattr(self.full_lat_lon, 'vindex'):
        return self.full_lat_lon.vindex[self.raster_index]
-   return self.full_lat_lon[self.raster_index.flatten]
+   return self.full_lat_lon[self.raster_index]
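The bug fixed here is that `.flatten` without parentheses is the bound method object, not a flattened array, so numpy rejects it as an index. A minimal numpy illustration with toy stand-ins for the sup3r attributes:

```python
import numpy as np

full_lat_lon = np.arange(12.0).reshape(6, 2)   # toy (sites, 2) lat/lon table
raster_index = np.array([[0, 1], [2, 3]])      # toy 2x2 raster of site indices

# Bug: `.flatten` (no parens) is a method object, not an integer array,
# so numpy raises when it is used as an index.
caught = False
try:
    full_lat_lon[raster_index.flatten]
except (IndexError, TypeError):
    caught = True

# Fix: index with the integer array directly (fancy indexing handles
# the 2D index array and returns shape (2, 2, 2)).
fixed = full_lat_lon[raster_index]
```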
Collaborator

Good catch here. This getting through must mean there was no test coverage for the h5 rasterizers with data loaded in memory? What did you run to hit this? Would you mind adding something similar as a simple smoke test, in test_rasterizer_general.py?

Member Author

@grantbuster grantbuster Dec 19, 2024

I think this got raised when running a multistep fwp pipeline on h5 files but I honestly can't remember exactly. I'm trying to set up a test based on test_rasterizer_general.py::test_topography_h5 but even with preload via chunks=None I can't get self.full_lat_lon to be anything but a dask array 🤷‍♂️

Collaborator

numpy or dask? With chunks=None it should be a numpy array and then run into the misplaced flatten

Member Author

Oops sorry, edited the comment; lat/lon is always a dask array even with chunks=None. Here's the draft test (stupid simple):

def test_preloaded_h5():
    """test preload of h5 file"""
    rasterizer = Rasterizer(
        file_paths=pytest.FP_WTK, target=(39.01, -105.15), shape=(20, 20),
        chunks=None
    )

Member Author

It's possible that I ran into this error before #247 got merged and then rebased, and now this code is never used?

Collaborator

@bnb32 bnb32 Dec 19, 2024

Yeah this is weird actually because Sup3rX.compute only loads data_vars into memory:

for f in self._ds.data_vars:

and chunks=None forces a call to compute after rasterization is complete anyway. It seems like it would take some funky manual method calls to get to _get_flat_data_lat_lon with a numpy array at the current version, but I don't remember changing anything in that PR that would have impacted this. Either way, I added the test and coordinate compute in PR #251.

Member Author

Yeah, definitely an edge case, but I ran into it at some point and we should keep the fix for posterity.

daily true high res sample.
- Discriminator sees random n_days of 8-hour samples of the daily
synthetic high res sample.
- Pointwise content loss (MAE/MSE) is only on the center 2 daylight
Collaborator

Can you say a bit about the benefit / your thinking of going from the center 8 hours to just 2 plus the mean for pointwise loss? Are you trying to weight the midday peak more?

Member Author

Yeah, so the problem was that if you're doing a pointwise loss on the center 8 hours you end up getting a regression to the mean: the model tries to be accurate during daytime with very little cloud movement across timesteps, and then you get big weird changes in cloud pattern as you deviate off of those pointwise loss hours into sunset/sunrise.

The temporal mean loss ensures that all hours, including through sunrise/sunset, are reasonable but still allows enough freedom for some cloud movement. The pointwise loss on two noonish hours ensures you have a realistic cloud field in the middle of the day, and then the discriminator makes sure that transitions to/from these hours are realistic, including realistic cloud movement.

The only other test I'd run is doing a temporal-mean-only content loss and removing the pointwise loss altogether... but I can just set this class attribute to zero. This is kind of interesting actually; I'm going to run a test overnight and maybe add one more commit tomorrow changing the default.
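The content-loss structure described above can be sketched with plain numpy. This is illustrative only, not the SolarCC implementation; the array layout (obs, rows, cols, time) and the midday slice are assumptions:

```python
import numpy as np

def solar_content_loss(hi_res_true, hi_res_gen, midday_slice=slice(11, 13)):
    """Content-loss sketch for arrays shaped (obs, rows, cols, time).

    The temporal-mean MAE keeps every hour (including sunrise/sunset)
    reasonable on average, while the pointwise MAE on ~2 midday hours
    pins down a realistic daytime cloud field. Temporal continuity is
    left to the adversarial loss (not shown here).
    """
    mean_loss = np.mean(np.abs(hi_res_true.mean(axis=-1)
                               - hi_res_gen.mean(axis=-1)))
    point_loss = np.mean(np.abs(hi_res_true[..., midday_slice]
                                - hi_res_gen[..., midday_slice]))
    return mean_loss + point_loss

rng = np.random.default_rng(0)
truth = rng.random((1, 4, 4, 24))  # one obs, 4x4 grid, 24 hourly steps
```

Setting the pointwise weight to zero (the overnight test mentioned above) would correspond to dropping `point_loss` and keeping only the temporal-mean term.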

Collaborator

Ok, this mostly makes sense. Wouldn't using just the temporal mean in the content loss encourage regression to the mean though?

Member Author

It could but the adversarial loss should protect against that. Really the problem with too much pointwise loss is that you don't get cloud movement in the middle 8 hours of the day.

Member Author

Okay, I finished a test with no pointwise loss and the model fell apart and didn't have thick enough clouds in the day. Going to merge this PR with the default pointwise loss set to two hours.

preload of coords added to compute and h5 preload test added.
@grantbuster grantbuster merged commit 651f396 into main Dec 26, 2024
12 checks passed
@grantbuster grantbuster deleted the gb/solar_experiment branch December 26, 2024 17:27
github-actions bot pushed a commit that referenced this pull request Dec 26, 2024