-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Out of bounds write in w3prosfmd_pdlib.F90 #1013
Out of bounds write in w3prosfmd_pdlib.F90 #1013
Conversation
@ukmo-ccbunney I got no differences when I ran this and compared with develop, so I'm going to re-run to see if I messed something up on my end. |
It's the sort of memory bug that might behave differently on different compilers/systems... Interestingly for me, the only difference I saw was in |
Okay, I've run this again, and I get no differences. I'm not surprised we're getting different behavior on different systems and I agree this needs to be fixed. I think I was hoping we'd get a difference which would then resolve the other issues which is why I re-ran. This will get merged here shortly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While @ukmo-ccbunney got diffs, I did not see any:
**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_03/./work_PR1_MPI_e (1 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2 (16 files differ)
mww3_test_03/./work_PR1_MPI_d2 (13 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c (15 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c (16 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2 (16 files differ)
mww3_test_03/./work_PR2_UQ_MPI_d2 (16 files differ)
mww3_test_03/./work_PR3_UQ_MPI_e (1 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e_c (1 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2 (15 files differ)
ww3_tp2.10/./work_MPI_OMPH (6 files differ)
ww3_tp2.16/./work_MPI_OMPH (4 files differ)
ww3_ufs1.3/./work_a (3 files differ)
Hi @JessicaMeixner-NOAA, @ukmo-ccbunney, as for this syntax I doubt it is a bug and there it would be interesting to see the error and the compiler output that u have found @ukmo-ccbunney. Here is my opinion on this: When assigning an array of larger size to an array of smaller size, Fortran only assigns as many elements as fit into the smaller array. So, if you have: A = B And A has size NX, it is equivalent to: A = B(1:NX) In other words, the assignment operation in Fortran truncates the larger array to the size of the smaller one. This behavior can be useful but also can lead to unintentional data loss if you are not careful about array sizes. Here it was used intentionally so i would not say that this a bug. I do not see also that this is compiler/environment depended it should fit the FORTRAN standard given in e.g. International Organization for Standardization. Information technology -- Programming languages -- Fortran. ISO/IEC 1539-1:2018, 2018. To my best knowledge this is the latest and most complete reference. |
Hi @aronroland Thanks for your input on this. This was bought to my attention when I got the following runtime error (GNU compiler): To be honest, I was not aware of that the original syntax was considered valid Fortran, so apologies for raising a fix for valid code! Are you happy if the fix is left in? To me the explicit index range makes things a bit clearer. Interestingly, I did get small differences in the output when I made the fix, which made me think that some sort of memory corruption was occuring, but this could possibly have been a red herring... |
@ukmo-ccbunney I think since you got differences in answers, and the fact that this generated a runtime error, we should keep the update in the code. That being said, we should certainly follow fortran standards and I didn't quite understand @aronroland concern about data loss in this particular case, so perhaps I'm missing something obvious here? |
Hi @JessicaMeixner-NOAA, actually I am quite happy that this happened. Since GNU is saying that this is an "error" i am afraid that I am wrong with what I have said before. I will test that on my side and get back to u. |
* origin/develop: handle NaN air-sea temperatures from nearest land points (NOAA-EMC#869) Increase valid_max for f in ounp (NOAA-EMC#1014) Bugfix deallocation of invalid memory in ww3_prnc (NOAA-EMC#1016) Update to orion intel module path and two typo corrections. (NOAA-EMC#1011) Bugfix to out of bounds array write in w3profsmd_pdlib.f90 (NOAA-EMC#1013) Simple logic fix for time interpolation of boundary nodes at the end of W3XYPFSNIMP. (NOAA-EMC#1005) in w3iors use NSEA instead of NSEAL in serial write/read of VA (NOAA-EMC#954) Update documentation for UNST namelist (NOAA-EMC#986) In certain coupled configurations, the piece of code testing the coupling frequency to check if 'receive' coupling exchanges need to take place fail, resulting in an infinite loop causing the integration between time zero and the first time step to repeat indefinitely. This check needs to be rewritten, which fixes also issue NOAA-EMC#816 in a simpler way. (NOAA-EMC#999)
Pull Request Summary
Bugfix for out-of-bounds write to CCOSA and CSINA in w3profsmd_pdlib.F90
Description
There is an out-of-bounds array bug in the
w3profsmd_pdlib.F90
module:WW3/model/src/w3profsmd_pdlib.F90
Lines 3645 to 3646 in a09335e
CCOSA
andCSINA
are defined locally as sizeNTH
, butECOS
andESIN
are defined inw3gdatmd
with sizeMSPEC+MTH
(MSPEC = NK * NTH)This is causing a potential out of bounds error and clobbering memory adjacent to CCOSA/CSINA
I think this is what was intended:
which takes the direction based cosine/sine values from the first frequency band (values are identical for each freq band).
Some changes are expected to the PDLIB regtests due to the possibility of memory being overwritten.
-->
Issue(s) addressed
Commit Message
Bugfix to out of bounds array write in w3profsmd_pdlib.f90
Check list
Testing
Results of PDLIB regtests below. Differences in field outputs are only present in
ww3_tp2.21
.Most differences seem to be very small, e.g.:
All differences in other regtests are due to solely the netCDF in the bugfix branch not having the "forecast_period" variable...not sure why it is missing. I will investigate.