Support monthly average and variable name convention for EAMxx #712

Merged
merged 8 commits into from
Feb 8, 2024
99 changes: 92 additions & 7 deletions e3sm_diags/derivations/acme.py
@@ -374,6 +374,13 @@ def restom(fsnt, flnt):
return var


def restom3(swdn, swup, lwup):
    """TOM(top of model) Radiative flux"""
    var = swdn - swup - lwup
    var.long_name = "TOM(top of model) Radiative flux"
    return var


def restoa(fsnt, flnt):
"""TOA(top of atmosphere) Radiative flux"""
var = fsnt - flnt
@@ -619,6 +626,10 @@ def cosp_histogram_standardize(cld: "FileVariable"):
(("pr",), lambda pr: qflxconvert_units(rename(pr))),
(("PRECC", "PRECL"), lambda precc, precl: prect(precc, precl)),
(("sat_gauge_precip",), rename),
(
("PrecipLiqSurfMassFlux", "PrecipIceSurfMassFlux"),
lambda precl, preci: prect(precl, preci),
), # EAMxx
]
),
"PRECST": OrderedDict(
@@ -654,9 +665,18 @@ def cosp_histogram_standardize(cld: "FileVariable"):
("prw",),
lambda prw: convert_units(rename(prw), target_units="kg/m2"),
),
(
("VapWaterPath",), # EAMxx
lambda prw: convert_units(rename(prw), target_units="kg/m2"),
),
]
),
"SOLIN": OrderedDict(
[
(("rsdt",), rename),
(("SW_flux_dn_at_model_top",), rename), # EAMxx
]
Comment on lines +674 to 678
Collaborator

@tomvothecoder tomvothecoder Feb 7, 2024

Suggested change
"SOLIN": OrderedDict(
[
(("rsdt",), rename),
(("SW_flux_dn_at_model_top",), rename), # EAMxx
]
"SOLIN": {
("rsdt",): rename,
("SW_flux_dn_at_model_top",): rename, # EAMxx
}

I don't think you need to use an OrderedDict in the nested derived variables dictionary. I just use a regular dictionary in my refactored version (derivations.py). The old Dataset class will probably be able to parse this the same way as the new Dataset class in the CDAT migration branch because the order of the keys shouldn't matter (as far as I'm aware).

# ISCCP
"CLDTOT_TAU1.3_ISCCP": {
("FISCCP1_COSP",): cosp_bin_sum,
("CLISCCP",): cosp_bin_sum,
},
"CLDTOT_TAU1.3_9.4_ISCCP": {
("FISCCP1_COSP",): cosp_bin_sum,
("CLISCCP",): cosp_bin_sum,
},
"CLDTOT_TAU9.4_ISCCP": {
("FISCCP1_COSP",): cosp_bin_sum,
("CLISCCP",): cosp_bin_sum,
},

Collaborator

I recognize you're updating existing elements that are already OrderedDict. Up to you if you want to use a regular dict for new or existing elements that you've modified. Just a suggestion since it is easier to read and cleaner, but I'm aware this file is going to be replaced anyways.

Contributor Author

Hi @tomvothecoder, thank you for reviewing. It's been a while since the ordered dictionary was designed. I think we made it an ordered dictionary to support edge cases where some datasets include multiple variables in the key, and we wanted to loop over keys in order. I'm not sure if this is still needed given that lots of datasets have been updated over the years.
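
The in-order lookup described in this comment can be sketched roughly as follows. This is a minimal illustration and not the actual e3sm_diags Dataset code: `first_matching_derivation`, `available`, and the toy `derived` mapping are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, Optional, Set, Tuple

VarKey = Tuple[str, ...]


def first_matching_derivation(
    candidates: Dict[VarKey, Callable],
    available: Set[str],
) -> Optional[Tuple[VarKey, Callable]]:
    """Return the first (source variables, function) pair whose source
    variables are all present, scanning keys in insertion order."""
    for src_vars, func in candidates.items():
        if all(name in available for name in src_vars):
            return src_vars, func
    return None


# Toy mapping (hypothetical): two ways to obtain a net shortwave flux.
derived = {
    ("FSNTOA",): lambda fsntoa: fsntoa,
    ("rsdt", "rsut"): lambda rsdt, rsut: rsdt - rsut,
}

# A dataset that provides all three variables: the first key wins.
match = first_matching_derivation(derived, {"FSNTOA", "rsdt", "rsut"})
print(match[0])

# A dataset with only the second set of names falls through to the second key.
match = first_matching_derivation(derived, {"rsdt", "rsut"})
print(match[0], match[1](340.0, 100.0))
```

With this kind of lookup, key order only affects the result when a dataset satisfies more than one key, which is the edge case mentioned above.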

Collaborator

Maybe e3sm_diags development started around Python 3.5/3.6? I learned that the regular dict will maintain order with Python >=3.6, so we probably don't need to use OrderedDict anymore.

Starting with CPython 3.6, dictionaries return items in the order you inserted them.
Source - https://stackoverflow.com/a/47849121

Since Python 3.7, all dictionaries are guaranteed to be ordered. The Python contributors determined that switching to making dict ordered would not have a negative performance impact. I don't know how the performance of OrderedDict compares to dict in Python >= 3.7, but I imagine they would be comparable since they are both ordered.

Note that there are still differences between the behaviour of OrderedDict and dict. See also: Will OrderedDict become redundant in Python 3.7?
Source - https://stackoverflow.com/a/53535866
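
As a small illustration of the points quoted above (a sketch with toy keys, where the string "rename" merely stands in for the real function), a plain dict keeps insertion order, while OrderedDict still differs in order-sensitive equality and in its reordering helpers:

```python
from collections import OrderedDict

plain = {}
plain[("rsdt",)] = "rename"
plain[("SW_flux_dn_at_model_top",)] = "rename"  # EAMxx name, inserted second

# Since Python 3.7 the language guarantees insertion order for dict.
print(list(plain))  # [('rsdt',), ('SW_flux_dn_at_model_top',)]

a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])

# Equality between two OrderedDicts is order-sensitive ...
print(a == b)  # False
# ... while equality between plain dicts ignores order.
print(dict(a) == dict(b))  # True

# OrderedDict also keeps reordering helpers that dict lacks.
a.move_to_end("x")
print(list(a))  # ['y', 'x']
print(hasattr({}, "move_to_end"))  # False
```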

Contributor Author

Yeah, I think the development started around 3.5 and 3.6. In this case we should go with a regular dictionary. Good call for the CDAT migration!

Collaborator

No problem!

),
"SOLIN": OrderedDict([(("rsdt",), rename)]),
"ALBEDO": OrderedDict(
[
(("ALBEDO",), rename),
@@ -705,6 +725,10 @@ def cosp_histogram_standardize(cld: "FileVariable"):
lambda fsntoa, fsntoac: swcf(fsntoa, fsntoac),
),
(("rsut", "rsutcs"), lambda rsutcs, rsut: swcf(rsut, rsutcs)),
(
("SW_flux_up_at_model_top", "SW_clrsky_flux_up_at_model_top"),
lambda rsutcs, rsut: swcf(rsut, rsutcs),
@kaizhangpnl kaizhangpnl Feb 7, 2024

I'm a beginner in using Python. Does the order of variable names and fields matter here (should "SW_flux_up_at_model_top" match rsut)?

Is it equivalent to the following?

("SW_flux_up_at_model_top", "SW_clrsky_flux_up_at_model_top"),
lambda rsut, rsutcs: swcf(rsut, rsutcs),

Contributor Author

Hi @kaizhangpnl, thanks for looking into it! I was also surprised that the final plot looks reasonable; still investigating...

Contributor Author

@chengzhuzhang chengzhuzhang Feb 7, 2024

Going through the code for derived variables and the documentation for lambda expressions, it does seem like the order of the parameters after lambda does not matter.

In the derived variable dictionary, it seems like we can simplify further by dropping the lambda expressions and just using the regular functions for most, if not all, cases. Tagging @tomvothecoder for confirmation.

Collaborator

Going through the code for derived variables and the documentation for lambda expressions, it does seem like the order of the parameters after lambda does not matter.

In the derived variable dictionary, it seems like we can simplify further by dropping the lambda expressions and just using the regular functions for most, if not all, cases. Tagging @tomvothecoder for confirmation.

That's correct, the lambda isn't needed because the required args will be passed directly to the function when it is called. The lambda is doing the same thing here.
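
One detail worth making explicit: the names after `lambda` are only local labels, and values are bound to them by position, presumably in the order of the variable names in the key tuple. The sketch below uses a hypothetical, order-sensitive helper `first_minus_second` (it is not the real `swcf`) and made-up numbers to compare three spellings: a lambda whose parameter names are swapped relative to the call, a pass-through lambda, and the function passed directly.

```python
def first_minus_second(x, y):
    """Hypothetical order-sensitive helper standing in for a derivation function."""
    return x - y


# Values in key-tuple order, e.g. (all-sky flux, clear-sky flux); numbers are made up.
values = (100.0, 50.0)

# Parameter names are listed in the opposite order to the call below,
# so the two positional values reach the helper reversed.
swapped = lambda rsutcs, rsut: first_minus_second(rsut, rsutcs)

# Pass-through lambda: same order in the parameter list and in the call.
straight = lambda rsut, rsutcs: first_minus_second(rsut, rsutcs)

print(swapped(*values))             # -50.0
print(straight(*values))            # 50.0
print(first_minus_second(*values))  # 50.0
```

Under these assumptions, a pass-through lambda can be replaced by the function itself, whereas a lambda that reorders its arguments can only be dropped if the key tuple is reordered to match.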

Contributor Author

Thank you @tomvothecoder !

), # EAMxx
]
),
"SWCFSRF": OrderedDict(
@@ -913,29 +937,55 @@ def cosp_histogram_standardize(cld: "FileVariable"):
),
]
),
"FLUT": OrderedDict([(("rlut",), rename)]),
"FLUT": OrderedDict(
[
(("rlut",), rename),
(("LW_flux_up_at_model_top",), rename), # EAMxx
]
),
Comment on lines +940 to +945
Collaborator

Same comment about not needing to use an OrderedDict.

"FSUTOA": OrderedDict([(("rsut",), rename)]),
"FSUTOAC": OrderedDict([(("rsutcs",), rename)]),
"FLNT": OrderedDict([(("FLNT",), rename)]),
"FLUTC": OrderedDict([(("rlutcs",), rename)]),
"FLUTC": OrderedDict(
[
(("rlutcs",), rename),
(("LW_clrsky_flux_up_at_model_top",), rename), # EAMxx
]
),
Comment on lines +949 to +954
Collaborator

Same comment about not needing to use an OrderedDict.

"FSNTOA": OrderedDict(
[
(("FSNTOA",), rename),
(("rsdt", "rsut"), lambda rsdt, rsut: rst(rsdt, rsut)),
(
("SW_flux_dn_at_model_top", "SW_flux_up_at_model_top"),
lambda rsdt, rsut: rst(rsdt, rsut),
), # EAMxx
]
),
"FSNTOAC": OrderedDict(
[
# Note: CERES_EBAF data in amwg obs sets misspells "units" as "lunits"
(("FSNTOAC",), rename),
(("rsdt", "rsutcs"), lambda rsdt, rsutcs: rstcs(rsdt, rsutcs)),
(
("SW_flux_dn_at_model_top", "SW_clrsky_flux_up_at_model_top"),
lambda rsdt, rsutcs: rstcs(rsdt, rsutcs),
), # EAMxx
]
),
"RESTOM": OrderedDict(
[
(("RESTOA",), rename),
(("toa_net_all_mon",), rename),
(("FSNT", "FLNT"), lambda fsnt, flnt: restom(fsnt, flnt)),
(
(
"SW_flux_dn_at_model_top",
"SW_flux_up_at_model_top",
"LW_flux_up_at_model_top",
),
lambda swdn, swup, lwup: restom3(swdn, swup, lwup),
), # EAMxx
(("rtmt",), rename),
]
),
@@ -944,6 +994,14 @@ def cosp_histogram_standardize(cld: "FileVariable"):
(("RESTOM",), rename),
(("toa_net_all_mon",), rename),
(("FSNT", "FLNT"), lambda fsnt, flnt: restoa(fsnt, flnt)),
(
(
"SW_flux_dn_at_model_top",
"SW_flux_up_at_model_top",
"LW_flux_up_at_model_top",
),
lambda swdn, swup, lwup: restom3(swdn, swup, lwup),
), # EAMxx
(("rtmt",), rename),
]
),
@@ -983,6 +1041,10 @@ def cosp_histogram_standardize(cld: "FileVariable"):
[
(("PSL",), lambda psl: convert_units(psl, target_units="mbar")),
(("psl",), lambda psl: convert_units(psl, target_units="mbar")),
(
("SeaLevelPressure",),
lambda psl: convert_units(psl, target_units="mbar"),
), # EAMxx
]
),
"T": OrderedDict(
@@ -1011,23 +1073,31 @@ def cosp_histogram_standardize(cld: "FileVariable"):
lambda t: convert_units(t, target_units="DegC"),
),
(("tas",), lambda t: convert_units(t, target_units="DegC")),
(("T_2m",), lambda t: convert_units(t, target_units="DegC")), # EAMxx
]
),
# Surface water flux: kg/((m^2)*s)
"QFLX": OrderedDict(
[
(("evspsbl",), rename),
(("QFLX",), lambda qflx: qflxconvert_units(qflx)),
(("surf_evap",), lambda qflx: qflxconvert_units(qflx)), # EAMxx
]
),
# Surface latent heat flux: W/(m^2)
"LHFLX": OrderedDict(
[
(("hfls",), rename),
(("QFLX",), lambda qflx: qflx_convert_to_lhflx_approxi(qflx)),
(("surface_upward_latent_heat_flux",), rename), # EAMxx "s^-3 kg"
]
),
"SHFLX": OrderedDict(
[
(("hfss",), rename),
(("surf_sens_flux",), rename), # EAMxx
]
Comment on lines +1095 to 1099
Collaborator

Same comment about not needing to use an OrderedDict.

),
"SHFLX": OrderedDict([(("hfss",), rename)]),
"TGCLDLWP_OCN": OrderedDict(
[
(
@@ -1454,7 +1524,12 @@ def cosp_histogram_standardize(cld: "FileVariable"):
),
# Surface temperature: Degrees C
# (Temperature of the surface (land/water) itself, not the air)
"TS": OrderedDict([(("ts",), rename)]),
"TS": OrderedDict(
[
(("ts",), rename),
(("surf_radiative_T",), rename), # EAMxx
]
),
Comment on lines +1527 to +1532
Collaborator

Same comment about not needing to use an OrderedDict.

"PS": OrderedDict([(("ps",), rename)]),
"U10": OrderedDict([(("sfcWind",), rename), (("si10",), rename)]),
"QREFHT": OrderedDict(
@@ -1471,8 +1546,18 @@ def cosp_histogram_standardize(cld: "FileVariable"):
]
),
"PRECC": OrderedDict([(("prc",), rename)]),
"TAUX": OrderedDict([(("tauu",), lambda tauu: -tauu)]),
"TAUY": OrderedDict([(("tauv",), lambda tauv: -tauv)]),
"TAUX": OrderedDict(
[
(("tauu",), lambda tauu: -tauu),
(("surf_mom_flux_U",), lambda tauu: -tauu), # EAMxx
]
),
"TAUY": OrderedDict(
[
(("tauv",), lambda tauv: -tauv),
(("surf_mom_flux_V",), lambda tauv: -tauv), # EAMxx
]
),
Comment on lines +1549 to +1560
Collaborator

Same comment about not needing to use an OrderedDict.

"CLDICE": OrderedDict([(("cli",), rename)]),
"TGCLDIWP": OrderedDict([(("clivi",), rename)]),
"CLDLIQ": OrderedDict([(("clw",), rename)]),
@@ -1,4 +1,4 @@
[#]
sets = ["aerosol_budget"]
variables = ["bc, dst, mom, ncl, pom, so4, soa"]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
seasons = ["ANN", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "DJF", "MAM", "JJA", "SON"]
@@ -1,4 +1,4 @@
[#]
sets = ["aerosol_budget"]
variables = ["bc, dst, mom, ncl, pom, so4, soa"]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
seasons = ["ANN", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "DJF", "MAM", "JJA", "SON"]
@@ -4,20 +4,20 @@ case_id = "model_vs_model"
variables = ["COSP_HISTOGRAM_MISR"]
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
seasons = ["ANN", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "DJF", "MAM", "JJA", "SON"]

[#]
sets = ["cosp_histogram"]
case_id = "model_vs_model"
variables = ["COSP_HISTOGRAM_MODIS"]
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
seasons = ["ANN", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "DJF", "MAM", "JJA", "SON"]

[#]
sets = ["cosp_histogram"]
case_id = "model_vs_model"
variables = ["COSP_HISTOGRAM_ISCCP"]
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
seasons = ["ANN", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "DJF", "MAM", "JJA", "SON"]
@@ -6,7 +6,7 @@ ref_name = "MISRCOSP"
reference_name = "MISR COSP"
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
seasons = ["ANN", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "DJF", "MAM", "JJA", "SON"]

[#]
sets = ["cosp_histogram"]
@@ -16,7 +16,7 @@ ref_name = "MODISCOSP"
reference_name = "MODIS COSP"
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
seasons = ["ANN", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "DJF", "MAM", "JJA", "SON"]

[#]
sets = ["cosp_histogram"]
@@ -26,4 +26,4 @@ ref_name = "ISCCPCOSP"
reference_name = "ISCCP COSP"
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
seasons = ["ANN", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "DJF", "MAM", "JJA", "SON"]
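
The extended `seasons` value repeated in these .cfg hunks follows a regular pattern: the annual mean, twelve zero-padded month strings, then the four climatological seasons. As a small sketch (the variable names are illustrative and not part of e3sm_diags), the same list can be generated in Python:

```python
# Zero-padded month labels "01" through "12".
months = [f"{m:02d}" for m in range(1, 13)]

# Annual mean first, then the months, then the four seasons,
# matching the order used in the .cfg files.
seasons = ["ANN", *months, "DJF", "MAM", "JJA", "SON"]
print(seasons)
```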