cicecore: fix some use of uninitialized variables #560
Conversation
When CICE is compiled with the 'CICE_IN_NEMO' CPP macro and 'calc_Tsfc' is false, ice_step::coupling_prep calls CICE_RunMod::sfcflux_to_ocn to transfer heat fluxes on ice-free grid points to the ocean. The calculation in sfcflux_to_ocn uses the Icepack parameter 'Lsub', but without declaring this variable or querying it from Icepack, which results in a compile-time failure when using the standalone driver with 'CICE_IN_NEMO' defined. Fix that by correctly declaring 'Lsub' and querying it through the Icepack interface.
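The fix follows the standard query idiom used throughout CICE; a minimal sketch, assuming the usual surrounding declarations (subname, nu_diag) are in scope:

```fortran
      use icepack_intfc, only: icepack_query_parameters, &
          icepack_warnings_flush, icepack_warnings_aborted

      real (kind=dbl_kind) :: Lsub   ! latent heat of sublimation (J/kg)

      ! query the parameter from Icepack instead of assuming it is in scope
      call icepack_query_parameters(Lsub_out=Lsub)
      call icepack_warnings_flush(nu_diag)
      if (icepack_warnings_aborted(subname)) &
         call abort_ice(error_message=subname, file=__FILE__, line=__LINE__)
```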
In subroutine 'init_grid2', the array 'tarea' is initialized in a loop on indices {ilo, ihi} and {jlo, jhi}, but is then used in subroutine 'makemask' in a loop on indices {1, nx_block} and {1, ny_block}, potentially causing an uninitialized value to be used (at line 1693). Initialize the whole array to zero first, to avoid using uninitialized values. The initialization to zero is actually commented out, following 10c446a (Dummy and unused variables. (CICE-Consortium#180), 2018-09-22) [1], so uncomment it. [1] CICE-Consortium#180
In subroutine 'makemask', the array 'uvm' is initialized in a loop on indices {ilo, ihi} and {jlo, jhi}, but is then used in a loop on indices {1, nx_block} and {1, ny_block}, potentially causing an uninitialized value to be used (at line 1672). Initialize the whole array to zero first, to avoid using uninitialized values. The initialization to zero is actually commented out, following 10c446a (Dummy and unused variables. (CICE-Consortium#180), 2018-09-22) [1], so uncomment it. [1] CICE-Consortium#180
In subroutine 'init_coupler_flux', the initialization of the 'sst' array to 'Tf' is protected by a '#ifndef CICE_IN_NEMO' preprocessor directive. This has been the case since CICE-Consortium/CICE-svn-trunk@151b9af (cice-4.0~17, 2008-04-28), though that commit does not explain why. This is probably because in some conditions (depending on active CPPs at NEMO compilation time, as well as NEMO namelist parameters), the SST needs to be passed early to CICE, before calling CICE_Initialize (see the CICE coupling interface in NEMO [1]), and initializing it to 'Tf' in 'init_coupler_flux' would erase the values already passed from the ocean. If however the SST is *not* passed to CICE before CICE_Initialize is called, and 'CICE_IN_NEMO' is in use, then 'ice_init::set_state_var' reads from the uninitialized 'sst' array when placing ice where the ocean surface is cold. In ECCC's in-house version of the coupling interface (sbc_ice_cice), extensive changes have been made to work around bugs in the original version and limitations due to the sequence in which CICE and NEMO fields are initialized, such that we manually call 'ice_init::init_state' a second time at the end of subroutine sbc_ice_cice::cice_sbc_init. To avoid using an uninitialized array, remove the '#ifndef CICE_IN_NEMO' around the initialization of the 'sst' array to 'Tf', so that it is always initialized. These values will anyway be replaced by the "correct" ones when init_state is called a second time in our in-house version. [1] http://forge.ipsl.jussieu.fr/nemo/browser/NEMO/releases/release-3.6/NEMOGCM/NEMO/OPA_SRC/SBC/sbcice_cice.F90#L176
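The change itself amounts to dropping the preprocessor guard; a sketch (the exact declarations and comments in ice_flux.F90 may differ):

```fortran
! before: skipped entirely in CICE_IN_NEMO builds
#ifndef CICE_IN_NEMO
      sst(:,:,:) = Tf(:,:,:)   ! initialize SST to the freezing temperature
#endif

! after: always initialized; coupled runs overwrite these values later
      sst(:,:,:) = Tf(:,:,:)
```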
Generally fine. A few comments.
Thanks for taking a look Till!
I think I agree. I did not look into it further when I saw that the initialization was commented, I just uncommented it. What you suggest seems more sane.
You mean that the initialization should not be necessary? Well, at least on our "CREG025-Pacific extension" grid, some values were uninitialized in the condition at line 1673 if I did not initialize the array first.
Yes, I looked at the changes to sbc_ice_cice in your nemo4cice project and they were mostly the same as mine, although our version has a lot of in-house modifications that I don't yet fully understand. Sometimes stuff was implemented in a not-so-clean way because of lack of time, and it's one of the mid-term goals of my project to clean that up. We definitely will let you know.
```diff
@@ -384,7 +384,7 @@ subroutine init_grid2
       ! T-grid cell and U-grid cell quantities
       !-----------------------------------------------------------------

-      ! tarea(:,:,:) = c0
+      tarea(:,:,:) = c0
```
I think this is unnecessary as tarea is in a HaloUpdate shortly after it is computed. I don't think you want any zero values in tarea at all.
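For reference, the update referred to here looks roughly like this (a sketch from memory of ice_grid.F90; the exact argument list may differ):

```fortran
      ! fill ghost cells of tarea from neighboring blocks; cells with no
      ! neighbor are set to the optional fillValue
      call ice_HaloUpdate (tarea, halo_info, &
                           field_loc_center, field_type_scalar, &
                           fillValue=c1)
```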
Thanks for the review. I agree that it should not be necessary, but as I wrote, if I do not initialize here, an uninitialized value gets used... So that would mean that not all halo cells are initialized by the halo update. I'll have to investigate that further on the specific grid we are using.
It could be that your compiler is being very picky. What line does it abort on exactly with the uninitialized variable? I've done the check with NAG and Intel and it runs fine. As you say, it might be a particular feature of your grid. There was an issue awhile ago from one of the European groups that trapped this same error. I am fine with initializing uvm to zero, but tarea should never be zero.
It aborts at line 1694 in makemask
Ok. A cleaner initialization would be as follows:
```fortran
do j = 1, ny_block
do i = 1, nx_block
   im = max(i,ilo)
   jm = max(j,jlo)
   ip = min(i,ihi)
   jp = min(j,jhi)
```
Then replace all of the i,j indexes on the right with ip/jp. This will extrapolate the tarea values to the ghost cells including the four corners. Thoughts?
Sorry, these should all be im/jm or ip/jp.
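Combining the two comments, the intended loop is presumably something like the following sketch; it assumes tarea is computed as dxt*dyt as in init_grid2, and uses a single clamped index per dimension to cover both the im/jm and ip/jp cases:

```fortran
      do j = 1, ny_block
      do i = 1, nx_block
         im = min(max(i,ilo),ihi)   ! clamp i into the physical range
         jm = min(max(j,jlo),jhi)   ! clamp j into the physical range
         ! boundary values are extrapolated into ghost cells and corners
         tarea(i,j,iblk) = dxt(im,jm,iblk)*dyt(im,jm,iblk)
      enddo
      enddo
```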
If the maskhalo is turned on, it's possible that it is also masking some part of the initialization. Just speculation on my part; we'd have to look into it further.
Actually, the maskhalo is not being used for the halo update of tarea, which is correct, so forget I said that. I assume tarea is not initialized on the entire halo because of the land mask? We could also fix the loop in makemask, but that's likely to affect other loops in other parts of the code in the same way. This is almost certainly a case of variables not being initialized in places they are not needed, but where they are referenced at some point later. The proper fix, of consistently implementing all loops, is probably not worth the effort, but if we want to update a loop here or there, that's probably OK too. I'm not opposed to this type of fix in this case.
```diff
@@ -1636,7 +1636,7 @@ subroutine makemask
       !-----------------------------------------------------------------

       bm = c0
-      ! uvm = c0
+      uvm = c0
```
Similarly here. The HaloUpdate should take care of uvm values in the ghost cells.
It aborts at line 1673 in makemask
No, I meant the Lsub initialization. This is done in ice_forcing.F90.
The European group may have been DMI. I should look at the report Jacob wrote some time ago.
It was Issue #61. I have reopened it and linked it here.
@phil-blain, is testing complete? Is this ready to merge?
Not ready to merge at all. I've yet to go back into the code and come up with a better solution for that.
Ok, I'm beginning to understand a little bit more. It seems it is due to the fact that the domain decomposition we are using has some padded blocks. I was able to plot the arrays in question (plots not shown here). So my guess at this point is that the halo updates are not correctly updating padded blocks? The darker line around the edge of the plots has value 1.0; I don't know yet where this value is coming from...
I think that the 1.0 originates from the ice halo update.
Good catch. Cell padding has always been a headache for me. I try to avoid it. I think that the cells in the padded area are not even treated as ghost cells. So, that means halo updates do not touch them. However, the model should never reach in there. In my opinion, the compiler is just being pedantic about uninitialized variables here.
If I understand the code correctly, then the block sizes (nx_block and ny_block) are constant, whereas jhi and ihi are adjusted to the global domain size. Then in the last block towards the east and north, the loops may run out of the global domain when they go from 1 to ny_block or 1 to nx_block (padding from e.g. ihi to nx_block).
I believe that is all correct. Each block has its own ilo, ihi, jlo, jhi defined. That's what this is all about.
The padded areas should not be used in general. If they are, we might want to try to fix that. I assume the halos do work correctly on the padded blocks. We do have some bit-for-bit checks for different decomps, some padded. Is the padded cell halo index ihi+1 or is it still nx+1? I actually don't know.
Not looking at the code, so someone needs to confirm this, but what I recall is that the padded cells start outside of the halo cells, and they should be treated as 'missing'. Entire missing (land) blocks are never accessed, but it's possible that 'missing' padded cells still get noticed by compilers, even if they are never accessed. If they are being accessed, then there's definitely a bug.
Very good points. Don't we use nghost as we might have more than one ghost cell in each direction? The ilo, ihi, jlo, jhi and nx_block, ny_block are unique for each block, including the padded areas. There could be loops that only go to +/- nghost relative to the lo/hi indices. The padded areas should still be "interior" cells, but just masked out, right? This has always confused me.
The padded cells are assigned a global index of zero, which can never occur for physical cells. This is how they are identified and avoided for boundary conditions, parallelization, etc. ilo:ihi, jlo:jhi index only the physical domain, and nx_block, ny_block are those physical cells plus ONE row or column of ghost cells [EDIT: on each side of the physical domain]. The code was written in a general way to allow more ghost cells, but this has never been tried, and there are comments in the code saying that some routines will need to be updated to use more than one. Padded cells lie beyond nx_block, ny_block for the eastern- and northern-most blocks, and they should never be accessed in the main code once the decomposition has been done. It's possible that the haloes along those edges aren't right (especially the two tripole forms), but the code has been exercised enough that I trust those pretty well.
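To make the 'global index of zero' point concrete, a hypothetical guard could look like this (i_glob/j_glob are the block-structure fields from ice_blocks.F90; this exact check is not claimed to exist in the code):

```fortran
      this_block = get_block(blocks_ice(iblk), iblk)
      do j = 1, ny_block
      do i = 1, nx_block
         ! padded cells carry a global index of zero; skip them
         if (this_block%i_glob(i) == 0 .or. &
             this_block%j_glob(j) == 0) cycle
         ! ... only physical and ghost cells are touched here ...
      enddo
      enddo
```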
I agree. There are a lot of problems or inconsistencies in the code that we are gradually fixing as we come across them. Let's definitely fix the loops that are causing the crashes in this case, and initialize stuff better if that's what's needed. Otherwise we can keep our eyes open for similar issues as new code modifications go in. As an aside, it would be interesting to know how much of the code has 'aged' versus getting recoded or modified somehow. Probably a lot of it turns over, so to speak, every 10 years or so. Although the 'blame' button gives some info, it doesn't necessarily capture white space changes... I'm not proposing we figure this out! But I've also wondered how many lines of code we actually have (as opposed to comments and blank lines). Just curious.
[I thought I had hit the "Comment" button on Friday, but it seems not!] Yeah, after thinking about this since I posted my response above, I also think that fixing the loop indices where they need to be fixed is a better approach. This way, we should easily find other places in the code that need to be fixed, whereas initializing the padded cells would "hide" possible loop index "errors" in those other places. I will also have a closer look at the halo update code to understand what's going on with the padded cells.
I just ran the following test on cheyenne with three compilers, smoke_gx3_8x2x8x10x20_debug_diag1_droundrobin_run2day, and all pass. There is a long-standing issue to update the debug flags to check more things and initialize data to NaNs. @phil-blain, can you tell me what compiler flags you're using for your testing? This is with an Intel compiler, correct? It might be worthwhile to expand this PR to extend the debug flags, add a debug padded test into the base suite, test with multiple compilers, and fix loops. I think that requires some comprehensive effort, and I'm willing to take that on. Maybe we can include it by the end-of-the-month release? I'm also happy to just merge this PR as a starting point to fix the issues that @phil-blain has run into. We can then continue addressing the issue moving forward as I've suggested.
[@apcraig I was writing this as I received your answer, so I'll read it and answer separately] OK, some updates now: I ran the test suite (results table omitted), so 5 tests are failing. Apart from the first one ("dspacecurve" - see below), they all abort at the same place. Next, I tried to change the loop indices for that loop in makemask:

```diff
diff --git i/cicecore/cicedynB/infrastructure/ice_grid.F90 w/cicecore/cicedynB/infrastructure/ice_grid.F90
index 5308236..33cae7f 100644
--- i/cicecore/cicedynB/infrastructure/ice_grid.F90
+++ w/cicecore/cicedynB/infrastructure/ice_grid.F90
@@ -1684,8 +1684,14 @@ subroutine makemask
       tarean(:,:,iblk) = c0
       tareas(:,:,iblk) = c0

-      do j = 1, ny_block
-      do i = 1, nx_block
+      this_block = get_block(blocks_ice(iblk),iblk)
+      ilo = this_block%ilo
+      ihi = this_block%ihi
+      jlo = this_block%jlo
+      jhi = this_block%jhi
+
+      do j = jlo, jhi
+      do i = ilo, ihi
       if (ULAT(i,j,iblk) >= -puny) lmask_n(i,j,iblk) = .true. ! N. Hem.
       if (ULAT(i,j,iblk) < -puny)  lmask_s(i,j,iblk) = .true. ! S. Hem.
```

Here I used the same block indices as in the rest of the file. I then reran the tests; they now abort in to_ugrid (cicecore/cicedynB/infrastructure/ice_grid.F90, lines 1922 to 1941 at c136267), so uninitialized values are still being accessed there.

One last thing, about the "dspacecurve" test: it's failing in a different way, so this might be a different bug, but just to document it here (we might want to open a separate issue for that): [error output omitted]
@apcraig yes, this is with Intel. The flags that I use include -init=snan,arrays (full list omitted).
GCC/GFortran should have equivalent flags that I do not remember off the top of my head. I think we should fix the loops (or the halo updates, if that is the problem - although, as @JFLemieux73 and I were discussing offline last Friday, if the decomp suite is bit-for-bit, I have trouble understanding how they could be wrong... meaning that if we are accessing uninitialized locations and still getting bit-for-bit results, even between padded and non-padded decompositions, then maybe it does not influence the outputs in the end somehow...). Anyway, I think it's better not to do a band-aid fix.
I think we're converging on a plan. Let's try not to duplicate efforts; do you want me to look into the halo update? Or how can I help best? The spacecurve thing has always been a little dicey in its implementation. It might be better as a separate issue; I think it depends what tests we plan to add. I also agree that the fact that we're getting bit-for-bit answers with different decompositions strongly suggests these problems are not influencing model results. I think that's allowed us to get away with some of this.
spacecurve failed the same way when I used machine freya and debug options. On topic: I think that it is dangerous to initialize to 0. If anything, then to a huge number, in order to make the simulation fail in case someone creates a loop that enters the padding zone or other areas that are not initialized. As you have agreed upon, I much prefer not to initialize, and then to fix the problem when issues are found.
I have duplicated @phil-blain's results. I have also looked at the halo implementation, at least initially. The halo points are always next to the ilo/ihi/jlo/jhi indices. One thing is that the fill in the halo seems to operate assuming no padding; the bounds used there (snippets omitted) should probably be adjusted. I'm looking at the error in the to_ugrid subroutine now.
I think that the 1:ny_block should also be changed to jlo-nghost:jhi+nghost, and the same in the other dimension.
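A hypothetical sketch of that change for one halo band (the array name, fill value, and exact bounds are invented for illustration; the real code in ice_boundary.F90 is organized differently):

```fortran
      ! current form (roughly): assumes the halo sits at the block edge
      work(1:nx_block, ny_block-nghost+1:ny_block) = fill

      ! suggested form: keep the fill adjacent to the physical domain,
      ! which also holds on padded blocks
      work(ilo-nghost:ihi+nghost, jhi+1:jhi+nghost) = fill
```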
@TillRasmussen, that's correct. The protocode above is wrong. What's even more difficult is that we don't know ilo/ihi/jlo/jhi in ice_boundary.F90, where the halo update is done and prefilled, and I'm not sure we can query it because we'll end up with circular logic. The halo does a bunch of precomputation based on neighbor info and global info, but I'm not sure there is an awareness of its local list of blocks. The problem is that you want to fill the entire halo with the fill value, not just the part of the halo that's going to be filled by the haloupdate.
So, I'm pretty sure that the bug in the fill setting in ice_boundary is causing the problems in to_ugrid. Unfortunately, I haven't found a way to fix that bug yet because we don't have access to the list of local blocks, so that will take a bit more work. I can create an issue. I have duplicated all of @phil-blain's issues and fixes. I think we should proceed as follows: [list omitted]
@phil-blain, I can do a bunch of this and then PR a change onto your branch. Or if you prefer to delete this PR, I can create a new one. We could also merge this PR with what you like, and then I can create a new PR with some additional stuff for cheyenne and maybe some new tests. I do want to test across the full test suite on cheyenne, and cheyenne is going to be down the next 2 days, so that will slow me down a bit. But I'll start to expand my testing and see how that goes.
It looks like some of these changes in makemask are changing answers. I need to dig a bit further.
I've extended this PR in a branch in my fork and added some new tests turning on debug with various decompositions, updated cheyenne debug to add signaling NaNs, and fixed the spacecurve bug. I also did a bunch of validation testing for bit-for-bit. There are several subtle issues. I will run a full test suite soon, and then maybe we can merge my changes to this branch or create a new PR. I am going to look at #572 again to see if I can get that implemented properly on the branch. If I can figure it out quickly, I'll include it. Otherwise, we can defer it a bit.
Thanks Tony. Sorry for not being very responsive - I'm only working half days and have a lot going on both at and outside work... Might be simpler in a new PR. Let's see what you come up with and then we can decide which commits we keep in which PR. If you update the machines files, could you add the signaling NaNs also for our machines? Nice to have the spacecurve bug also fixed!
@phil-blain, sounds good and will do.
I ran a full test suite with the "full" debug options and there are a bunch of other failing tests now. I have fixed a couple things, but can't take everything on at the moment. I will create a new PR and we need to continue to work on this over time. I am going to back off on the debug compiler options on cheyenne until all the tests can pass. I will update the issue related to extending the debug compiler options as well.
Closing for #579
PR checklist
Short description: cicecore: fix some use of uninitialized variables
Developer(s): P.Blain
Testing: in progress
This is the first one of a series of small PRs I'm preparing to upstream some changes I've had to make to CICE6 during my work on coupling with NEMO3.6. This one deals with use of uninitialized values, which I've detected by compiling with -init=snan,arrays (Intel). I will start the base_suite right away and update the description with the test results once they are finished running. See the commit messages for complete details.