gdasfcst task failure with warnings during canopy conductance step, abnormal 2MT, HDF errors #652

Closed
BrettHoover-NOAA opened this issue Feb 11, 2022 · 6 comments

Comments

@BrettHoover-NOAA

Description
The gdasfcst task fails to complete. The log file reports a NetCDF HDF error after a series of "ABNORMAL 2MT" warnings during the canopy conductance step.

starting computing canopy conductance
 ABNORMAL 2MT           71         736   221.5673       69877.80    
 ABNORMAL 2MT           72         736   221.3769       69857.98    
 ABNORMAL 2MT           73         736   221.1685       69832.61    
 ABNORMAL 2MT           74         736   220.9999       69806.18    
 ABNORMAL 2MT           75         736   220.9465       69777.02    
 ABNORMAL 2MT           76         736   220.8869       69748.76    
 ABNORMAL 2MT           77         736   220.8211       69721.45    
 ABNORMAL 2MT           78         736   220.7488       69695.11    
 ABNORMAL 2MT           79         736   220.6703       69669.68    
 ABNORMAL 2MT           80         736   220.5854       69645.26    
 ABNORMAL 2MT           81         736   220.4940       69621.80    
 ABNORMAL 2MT           82         736   220.3961       69599.29    
 ABNORMAL 2MT           83         736   220.2917       69577.77    
 ABNORMAL 2MT           84         736   220.7831       69537.96    
 ABNORMAL 2MT           85         736   221.2929       69498.41    
 ABNORMAL 2MT           86         736   221.7730       69460.70    
 ABNORMAL 2MT           69         737   221.5153       69838.55    
 ABNORMAL 2MT           70         737   221.2965       69814.84    
 ABNORMAL 2MT           71         737   221.0716       69791.30    
 ABNORMAL 2MT           72         737   220.8403       69767.98    
 ABNORMAL 2MT           73         737   220.6028       69744.85    
 ABNORMAL 2MT           74         737   220.3588       69721.93    
 ABNORMAL 2MT           75         737   220.1084       69699.23    
 ABNORMAL 2MT           76         737   219.9512       69672.94    
 ABNORMAL 2MT           77         737   219.8492       69644.45    
 ABNORMAL 2MT           78         737   219.7957       69615.77    
 ABNORMAL 2MT           79         737   219.8442       69587.48    
 ABNORMAL 2MT           80         737   219.8817       69558.89    
 ABNORMAL 2MT           81         737   219.9080       69530.07    
 ABNORMAL 2MT           82         737   219.9230       69501.01    
 ABNORMAL 2MT           83         737   219.9268       69471.67    
 ABNORMAL 2MT           84         737   219.9192       69442.05    
 ABNORMAL 2MT           85         737   219.9001       69412.21    
 ABNORMAL 2MT           86         737   219.8693       69382.09    
 ABNORMAL 2MT           87         737   219.8268       69351.72    
 ABNORMAL 2MT           88         737   220.0542       69313.51    
 ABNORMAL 2MT           89         737   220.4973       69269.26    
 ABNORMAL 2MT           90         737   220.9185       69225.46    
 ABNORMAL 2MT           91         737   221.3177       69182.15    
 ABNORMAL 2MT           92         737   221.6946       69139.30    
 ABNORMAL 2MT           93         737   222.1422       69093.52    
 ABNORMAL 2MT           73         738   221.8228       69594.29    
 ABNORMAL 2MT           74         738   221.7346       69566.14    
 ABNORMAL 2MT           75         738   221.6519       69537.63    
 ABNORMAL 2MT           76         738   221.5747       69508.74    
 ABNORMAL 2MT           77         738   221.5032       69479.52    
 ABNORMAL 2MT           78         738   221.4373       69449.88    
 ABNORMAL 2MT           79         738   221.3820       69419.81    
 ABNORMAL 2MT           80         738   221.4508       69387.46    
 ABNORMAL 2MT           81         738   221.5357       69354.02    
 ABNORMAL 2MT           82         738   221.6367       69319.47    
 ABNORMAL 2MT           83         738   221.7805       69285.31    
 ABNORMAL 2MT           84         738   221.9454       69249.78    
 ABNORMAL 2MT           85         738   222.1314       69212.84    
 end of MDL2THandpv
 nTLFLD=         113
G2: fall back to simple algorithm (glahn ier=714)
 ichunk2d,jchunk2d        1536          20
 ichunk3d,jchunk3d,kchunk3d        1536          20         127
 line          386 NetCDF: HDF error
application called MPI_Abort(comm=0x84000002, 1) - process 399
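
For triage, the warning lines in a log like the one above can be summarized with a short shell helper. This is only a minimal sketch: the function name `summarize_abnormal_2mt` is hypothetical, and it assumes the fifth whitespace-separated field on each warning line is the flagged 2 m temperature (the log itself does not label its columns).

```shell
# Hypothetical helper: count "ABNORMAL 2MT" lines in a log file and report
# the smallest and largest value found in the fifth field.
# Assumption: field 5 is the flagged 2 m temperature; the log does not say so.
summarize_abnormal_2mt() {
    # $1: path to the log file to scan
    awk '/ABNORMAL 2MT/ {
             n++
             t = $5 + 0                      # force numeric comparison
             if (n == 1 || t < min) min = t
             if (n == 1 || t > max) max = t
         }
         END {
             if (n > 0) printf "count=%d min=%.4f max=%.4f\n", n, min, max
             else       print  "count=0"
         }' "$1"
}
```

Calling `summarize_abnormal_2mt <logfile>` prints a single summary line, which makes it easier to compare the number and range of warnings between cycles.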

Requirements
I am using a v16x global workflow (fca3433) modified with updated obsproc packages in the config.base file:

export HOMEobsproc_prep="$BASE_GIT/obsproc/obsproc_prep.v5.5.0"
export HOMEobsproc_network="$BASE_GIT/obsproc/obsproc_global.v3.4.2"

experiment setup:
setup_expt.py --pslot v16x_sept_g17 --configdir /scratch1/NCEPDEV/da/Brett.Hoover/mit_g16_g17/global-workflow/develop.20201222.AMVQ/parm/config --idate 2020082200 --edate 2020110100 --comrot /scratch1/NCEPDEV/stmp2/Brett.Hoover/ROTDIRS/ --expdir /scratch1/NCEPDEV/da/Brett.Hoover/para --resdet 384 --resens 192 --nens 80 --gfs_cyc 1

workflow setup:
setup_workflow.py --expdir /scratch1/NCEPDEV/da/Brett.Hoover/para/v16x_sept_g17

Using cold-start C384/C192 ICs that were provided to me. The bug appears on the gfsfcst task initialized on 202008230000.

Acceptance Criteria (Definition of Done)
Successful completion of gdasfcst task, continuation of gdas cycling

@KateFriedman-NOAA, let me know if you have any suggestions

@DavidHuber-NOAA
Contributor

@BrettHoover-NOAA The abnormal 2MT warnings are not too uncommon in the Antarctic winter.

The HDF error usually signifies an error in creating the atm netCDF files in parallel. I thus often turn off the parallel writes. On Hera, you can accomplish this by setting OUTPUT_FILETYPES=" 'netcdf' 'netcdf' " in the experiment's config.fcst.
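
As a sketch of that change, the line in the experiment's config.fcst would look like the following. The value is copied from the suggestion above; that it replaces a default containing a `netcdf_parallel` entry is an assumption, so check the existing line in your own config.fcst before editing.

```shell
# In the experiment's config.fcst (Hera).
# Request serial netCDF writes for both output file types.
# Assumption: this overrides a default that enables parallel writes.
export OUTPUT_FILETYPES=" 'netcdf' 'netcdf' "
```

The leading and trailing spaces inside the double quotes are kept exactly as given in the suggestion.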

@BrettHoover-NOAA
Author

Thanks @DavidHuber-NOAA, I'm running a test with this change to config.fcst right now.

@KateFriedman-NOAA
Member

@BrettHoover-NOAA FYI, when you run serially (netcdf) the runtime will be up to twice as long, so adjust your walltime accordingly.
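
To pick a new walltime under that "up to twice as long" guidance, a small helper can double an existing `HH:MM:SS` value. This is a rough sketch: `double_walltime` is a hypothetical name, and where the forecast walltime is actually set depends on the workflow version, so it is not shown here.

```shell
# Hypothetical helper: double a walltime given as HH:MM:SS.
double_walltime() {
    # $1: current walltime, e.g. 00:40:00
    echo "$1" | awk -F: '{
        t = ($1 * 3600 + $2 * 60 + $3) * 2          # doubled time in seconds
        printf "%02d:%02d:%02d\n", int(t / 3600), int((t % 3600) / 60), t % 60
    }'
}
```

For example, `double_walltime 00:40:00` yields a doubled value that can be pasted into the forecast job's walltime setting.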

@WalterKolczynski-NOAA
Contributor

Note that the incoming PR #602 will change this: OUTPUT_FILETYPES is being split into two variables, which are combined only when the namelist is written. That PR also forces serial writes for the EnKF on Hera because of this type of issue. I didn't encounter issues with non-ensemble data. Or, at least, none that I noticed.

@WalterKolczynski-NOAA
Contributor

@BrettHoover-NOAA Is this still a problem when running develop, or can we close this issue?

@BrettHoover-NOAA
Author

Hi @WalterKolczynski-NOAA, I think it's safe to close this issue, thank you!
