Diurnal climatology hardwired to eam.h4 #439

Closed
golaz opened this issue Jun 19, 2023 · 12 comments · Fixed by #457

Comments

@golaz
Collaborator

golaz commented Jun 19, 2023

The source for the e3sm_diags atmos diurnal climatology files is currently hardwired to eam.h4:

cp -s ${climo_diurnal_dir_source}/${nc_prefix}.eam.h4_*_${begin_year}??_${end_year}??_climo.nc .

This should not be the case, as a user might decide to store them somewhere else.
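
To make the limitation concrete, here is a small Python sketch (not taken from zppy) that applies the same glob pattern to two illustrative file names. The prefix `mycase`, the `01` label, and the years are hypothetical placeholders chosen only to fit the pattern; the real names depend on what ncclimo produces.

```python
from fnmatch import fnmatchcase


def hardwired_pattern(nc_prefix: str, begin_year: str, end_year: str) -> str:
    """Glob equivalent to the hardwired pattern in the cp -s line."""
    return f"{nc_prefix}.eam.h4_*_{begin_year}??_{end_year}??_climo.nc"


# Hypothetical file names: identical except for the history tape (h4 vs. h3).
candidates = [
    "mycase.eam.h4_01_198501_198912_climo.nc",
    "mycase.eam.h3_01_198501_198912_climo.nc",
]

pattern = hardwired_pattern("mycase", "1985", "1989")
linked = [name for name in candidates if fnmatchcase(name, pattern)]
print(linked)  # only the eam.h4 file matches
```

A climatology built from any tape other than h4 is not picked up by this pattern.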

@golaz
Collaborator Author

golaz commented Jun 19, 2023

Maybe this information should be read from the climo subsection since we already have to specify which one it is:

climo_diurnal_subsection = "atm_monthly_diurnal_8xdaily_180x360_aave"
climo_diurnal_frequency = "diurnal_8xdaily"

In that case, climo_diurnal_frequency could be read from there as well.

@forsyth2
Collaborator

forsyth2 commented Jul 6, 2023

cp -s ${climo_diurnal_dir_source}/${nc_prefix}.eam.h4_*_${begin_year}??_${end_year}??_climo.nc .

This line is part of the create_links_climo_diurnal() function, where climo_diurnal_dir_source is the first parameter.

It is called in two places.

For all diurnal runs:

climo_diurnal_dir_source={{ output }}/post/atm/{{ grid }}/clim_{{ climo_diurnal_frequency }}/{{ '%dyr' % (year2-year1+1) }}
create_links_climo_diurnal ${climo_diurnal_dir_source} ${climo_diurnal_dir_primary} ${case} ${Y1} ${Y2} 3

For model-vs-model diurnal runs:

climo_diurnal_dir_source={{ reference_data_path_climo_diurnal }}/{{ '%dyr' % (ref_year2-ref_year1+1) }}
climo_diurnal_dir_ref=climo_diurnal_ref
create_links_climo_diurnal ${climo_diurnal_dir_source} ${climo_diurnal_dir_ref} {{ ref_name }} ${ref_Y1} ${ref_Y2} 4

@golaz Is the suggestion here to have a new parameter where the user can themselves define climo_diurnal_dir_source? (And I'm assuming keep the original definition if the user provides no such parameter)
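
As a rough illustration of how the first call site resolves its source directory, the Jinja expression can be mirrored in Python. The function name and the example values below are assumptions for illustration only; zppy itself renders this path through its bash template.

```python
def climo_diurnal_source_dir(output: str, grid: str, frequency: str,
                             year1: int, year2: int) -> str:
    """Mirror of {{ output }}/post/atm/{{ grid }}/clim_{{ climo_diurnal_frequency }}/<N>yr."""
    n_years = year2 - year1 + 1  # same arithmetic as '%dyr' % (year2-year1+1)
    return f"{output}/post/atm/{grid}/clim_{frequency}/{n_years}yr"


# Example values are hypothetical.
path = climo_diurnal_source_dir("/path/to/output", "180x360_aave",
                                "diurnal_8xdaily", 1985, 1989)
print(path)
```

Only the directory varies between the two call sites; the eam.h4 part of the file name is fixed inside create_links_climo_diurnal().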

@mahf708

mahf708 commented Jul 13, 2023

This came up in an email chain. I think what would likely be most helpful is to split the input_files parameter into two: input_component (for eam, elm, mosart, etc.) and input_tape (for h0, h1, h6, etc.) and have that apply to all components (e3sm_diags, climo, etc.)

  input_subdir = "archive/atm/hist"
  input_files = "eam.h0"
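
A minimal sketch of what that split could look like, assuming values always follow the `<component>.<tape>` form shown above. The helper name is hypothetical, and `input_component` and `input_tape` are the parameter names proposed in this comment, not existing zppy options.

```python
from typing import Tuple


def split_input_files(input_files: str) -> Tuple[str, str]:
    """Split a value such as "eam.h0" into (input_component, input_tape)."""
    component, sep, tape = input_files.partition(".")
    if not sep or not component or not tape:
        raise ValueError(f"expected '<component>.<tape>', got {input_files!r}")
    return component, tape


input_component, input_tape = split_input_files("eam.h0")
print(input_component, input_tape)  # eam h0
```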

@forsyth2
Collaborator

@mahf708 Sorry, I completely focused in on "might decide to store the[m] somewhere else" rather than the more important "currently hardwired to eam.h4". Yes, your solution seems sufficient, assuming that's the only thing that a user would want to change about their output path.

@forsyth2
Collaborator

> Maybe this information should be read from the climo subsection since we already have to specify which one it is

I'm not overly familiar with the h numbers. Is it the 8xdaily in atm_monthly_diurnal_8xdaily_180x360_aave that tells us we need h4? (I assume atm tells us we need eam).

> a user might decide to store the[m] somewhere else.

@golaz Can I get a little clarification on what exactly we want the user to be able to change? Is it only the eam.h4 part or do we want complete customization of ${climo_diurnal_dir_source}/${nc_prefix}.eam.h4_*_${begin_year}??_${end_year}??_climo.nc?

@forsyth2
Collaborator

forsyth2 commented Jul 13, 2023

> split the input_files parameter into two

@mahf708 One other issue I foresee: currently, input_files is not used in https://github.com/E3SM-Project/zppy/blob/main/zppy/templates/e3sm_diags.bash, so we could easily use that (or split it into two as you suggest) to customize the currently hardwired eam.h4. However, I don't think it would be intuitive at all to users that this parameter is specifically for the diurnal atmosphere data, when the whole task in E3SM Diagnostics in general...

@chengzhuzhang
Collaborator

> This came up in an email chain. I think what would likely be most helpful is to split the input_files parameter into two: input_component (for eam, elm, mosart, etc.) and input_tape (for h0, h1, h6, etc.) and have that apply to all components (e3sm_diags, climo, etc.)
>
>   input_subdir = "archive/atm/hist"
>   input_files = "eam.h0"

I think it is a good solution.

@mahf708

mahf708 commented Jul 13, 2023

With the exception of the h0 tape, the problem is that these tape numbers can be chosen at runtime 😜 (along with the output frequency, say 8xdaily or 1-hourly, and the "density," essentially the number of times a variable is saved before a new file is produced)

If I understood what Chris was after, the coupled group reduced the output significantly recently for v3 and the tapes got shuffled a little. So, I believe Chris simply wants the user to be able to specify the tape number for specific tasks. It just so happens that's only a problem now for diurnal_8xdaily. So, to answer your question directly, I think it's just eam.h4 (but wait to hear from Chris). Everything else currently is pretty organized and no need to add more work for the e3sm_diags team 😄

> However, I don't think it would be intuitive at all to users that this parameter is specifically for the diurnal atmosphere data, when the whole task in E3SM Diagnostics in general...

Totally agree. I didn't think through this carefully initially. How easy is it to cross-reference user specifications between [climo] and [ts] and later tasks such as [e3sm_diags]? That would be the best approach and it builds nicely on existing features with the double-bracket specs [[ ... ]].

Btw, I just looked at some older configs. I see [[ atm_monthly_180x360_aave ]] and [[ atm_monthly_diurnal_8xdaily_180x360_aave ]] under [climo], but only [[ atm_monthly_180x360_aave ]] under [e3sm_diags]. Would one need to start a new section under [e3sm_diags] for [[ atm_monthly_diurnal_8xdaily_180x360_aave ]]?

I have never used anything other than h0 in e3sm_diags so far, so I am not familiar enough... but I'd like to add a nifty little piece of diagnostics to e3sm_diags that we developed recently for AWG. It relies on 3-hourly data; I need to think of a clean way to implement it, since it is computationally intensive.

@rljacob
Member

rljacob commented Jul 13, 2023

Yes, EAM allows the user to specify at runtime how many hX tapes there are, what is in them, and the frequency. It's just been the convention for v1 and v2 that h4 has daily. (The h0 frequency can also be changed to something besides monthly).

@golaz
Collaborator Author

golaz commented Jul 13, 2023

Let me see if I can clarify.

When I implemented the reduced output, I ended up reshuffling file content, and the 8xdaily output is now in eam.h3 rather than eam.h4 as before.

The [climo] task can handle this because we specify

  input_subdir = "archive/atm/hist"
  input_files = "eam.h3"

The [e3sm_diags] task also knows where the post-processed files are. The only problem is that the diurnal climo filename is now different (because ncclimo includes eam.h? in the filename).

I see three options, two easy, one more difficult.

  1. We assume that the user is never going to create multiple diurnal climatologies from different input files for the same frequency (i.e. 8xdaily from eam.h3 and eam.h4). If that's the case, we can simply replace the line in question with

cp -s ${climo_diurnal_dir_source}/${nc_prefix}.eam.h*_*_${begin_year}??_${end_year}??_climo.nc .

  2. We add a new argument to [e3sm_diags], something like

climo_diurnal_input_files = "eam.h3"

  3. We cross-reference and grab the content of input_files from the [climo] task and subtask defined in climo_diurnal_subsection of [e3sm_diags].

(1) is cleaner, but there is a (very) small risk of unexpected behavior if a user does something odd. (2) is safer, but it adds one more parameter with redundant information. (3) is most elegant, but more complicated to implement. (1) and (3) would work out of the box (no modifications to cfg files), (2) would not.

My preference would be (3), (1), (2). But given the time constraint, maybe we should go with (1).

Am I missing something?
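
To illustrate the small risk mentioned for option (1), here is a hedged Python sketch of the wildcard match with an added guard against mixed tapes. The guard is not part of the proposal above, and the function name and file names are hypothetical; it only shows where the "unexpected behavior" would come from.

```python
import re
from fnmatch import fnmatchcase
from typing import List


def match_any_tape(filenames: List[str], nc_prefix: str,
                   begin_year: str, end_year: str) -> List[str]:
    """Match climo files from any eam.h* tape; fail if tapes are mixed."""
    pattern = f"{nc_prefix}.eam.h*_*_{begin_year}??_{end_year}??_climo.nc"
    matched = [name for name in filenames if fnmatchcase(name, pattern)]

    # Collect the distinct tape labels (h3, h4, ...) among the matches.
    tape_re = re.compile(re.escape(nc_prefix) + r"\.eam\.(h\d+)_")
    tapes = set()
    for name in matched:
        found = tape_re.match(name)
        if found:
            tapes.add(found.group(1))

    if len(tapes) > 1:
        raise ValueError(f"climatologies from several tapes found: {sorted(tapes)}")
    return matched


# Hypothetical file name for a run whose 8xdaily output lives in eam.h3.
files = ["mycase.eam.h3_01_198501_198912_climo.nc"]
print(match_any_tape(files, "mycase", "1985", "1989"))
```

With a single tape in the directory the wildcard behaves exactly like the hardwired pattern; the ambiguity only arises if climatologies from two tapes share the same directory and years.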

@forsyth2
Collaborator

@golaz Thanks for the explanation. I agree with your preference ordering. I'll try to scope out what would be required for (3) and if it's looking to be too much, we can just get (1) in by the release deadline. I imagine (3) will indeed be difficult -- I think we've historically had trouble getting information between tasks/subtasks.

@forsyth2
Collaborator

forsyth2 commented Jul 14, 2023

@golaz Thank you for suggesting (3). That was remarkably simpler to implement than I thought it would be -- just a few iterations of determining how to properly access the configuration & where to place the code block. Merged in #457.

The key piece that makes it work is the fact that we already have the climo_diurnal_subsection parameter. Without that, we'd have no idea how to make a path through the configuration structure.
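
For readers curious what such a cross-reference might look like, here is a hedged sketch that uses plain nested dictionaries in place of the parsed cfg. It is not the code merged in #457; the function name, the fallback to the section-level value, and the example configuration are assumptions.

```python
def diurnal_input_files(config: dict) -> str:
    """Look up input_files for the [climo] subtask named in [e3sm_diags]."""
    subsection_name = config["e3sm_diags"]["climo_diurnal_subsection"]
    climo = config["climo"]
    subsection = climo[subsection_name]
    # Assumed fallback: use the [climo]-level value if the subtask has none.
    if "input_files" in subsection:
        return subsection["input_files"]
    return climo["input_files"]


# Hypothetical configuration, modeled on the cfg fragments in this thread.
config = {
    "climo": {
        "atm_monthly_diurnal_8xdaily_180x360_aave": {
            "input_subdir": "archive/atm/hist",
            "input_files": "eam.h3",
        },
    },
    "e3sm_diags": {
        "climo_diurnal_subsection": "atm_monthly_diurnal_8xdaily_180x360_aave",
    },
}

print(diurnal_input_files(config))  # eam.h3
```

The value of climo_diurnal_subsection is what selects the subtask, which is why the lookup needs no new user-facing parameter.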


5 participants