Input netCDFs #432
base: release/devel_v8.1.0
N1ckP3rsl3y
commented
Aug 26, 2024
- SOILWAT2 currently has functionality to output simulated data through the netCDF (.nc) format (Output netCDFs #390)
- Similarly, SOILWAT2 should allow the user to provide netCDF files with input data to use for the simulation which will replace the text (we use *.in) input file format
- Allow user to choose which input variables are provided by netCDFs while the others are input through text files
- See Capability to read inputs from netCDFs #389
- This commit prepares the structs for what will happen in the branch
- Renamed SW_NVARNC to SW_NVARDOM since the arrays in SW_NETCDF only hold netCDF information for constantly open files, which will only be the domain and progress files
- Rename SW_NETCDF to SW_NETCDF_OUT
  * Moved netCDF output information from SW_OUT_DOM to SW_NETCDF_OUT
  * Placed within SW_OUT_DOM so output functions have access to the information
- New struct `SW_NETCDF_IN` does/will contain input netCDF information
  * Moved input netCDF file information from SW_NETCDF_OUT to SW_NETCDF_IN
- Remove flag specifying that the domain template will be renamed automatically
  * `main()` gains a local variable doing the same thing, which is only passed to `SW_CTL_setup_domain()` after being set by `sw_init_args()`
- Rename the following structs to be more explicit in what information they hold
  * PATH_INFO -> SW_PATH_INPUTS
  * SW_FILE_STATUS -> SW_PATH_OUTPUTS
- Update tests to reference SW_PATH_INPUTS
- Motivation: With nc inputs coming with the branch, it is good to modularize the different functionalities - Rename current `SW_netCDF.c` to `SW_netCDF_General.c` - Create two new files including their own header file * SW_netCDF_Input.c * SW_netCDF_Output.c - SW_netCDF_General.c * Provides general functionality between itself and the two other files * This file includes - The writing/getting of attribute/values from netCDF variables/dimensions - The full creation of templates and variables - Getting dimension sizes - Reading input/output information * Most functions that were previously static in SW_netCDF.c have been converted to functions with global-level visibility or were moved to the other two files - SW_netCDF_Output.c * Contains functionality that pertains to netCDF outputs * Main functionalities that are contained within this are - Reading the output variable information into the program - Conversion from SOILWAT2 units to output units - Creation of and writing to output files - SW_netCDF_Input.c * Contains functionality that pertains to netCDF inputs * Main functionalities that are contained within this are - Creating/modifying domain/progress files - Reading user inputted values - Modified function `SW_NC_init_ptrs()` to call helper functions within Output and Input files * `SW_NCOUT_init_ptrs()` and `SW_NCIN_init_ptrs()` do mostly the same of what `SW_NC_init_ptrs()` did minus the point below - Removed commented chunk of code from `SW_OUTDOM_init_ptrs()` and placed it in `SW_NCIN_init_ptrs()` - Moved static local variables and defines to their respective file
- Can be seen when using the command `CPPFLAGS=-DSWNC make clean test_sanitizer` or binary equivalent
- When using the above command, the program would result in a segmentation fault due to the use of the general function `nc_put_vara()` for unsigned integers
Solution:
- Bring back an old function `fill_netCDF_var_uint()` which was removed in the previous commit (fbea26e)
- Replace call to `SW_NC_write_vals()` with this function in `fill_domain_netCDF_vals()`
- Appearing from commit 07afd21, nc outputs would not properly translate values to the specified units though they said they were - This was caused by a missing `!` in a check to see if the variable will be output in the function `SW_NCOUT_create_units_converters()` Solution: - Add the `!` back
- This commit is to prepare for upcoming changes that will read in input variable information
- New define within SW_Defines.h - `SW_NINKEYSNC` (7)
- New keys - `InKeys` which denote seven input keys that categorize the different input value purposes
  * Domain, spatial, topo, soil, vegetation, weather, and climate
SW_netCDF_Input.h
- New macro - `ForEachNCInKey` that loops over every input key (similar to the outputs' `ForEachOutKey`)
- New constant array - `numVarsInKey` which specifies the maximum number of input variables that can be input within a key
SW_PATH_INPUTS
- Gains new variables that do the following things:
  * A list of all input files for every variable within each input key (`inFileNames`)
  * List of all weather input files (`weathInFiles`)
  * List of the number of weather input files for each weather variable (`numInWeathFiles`)
  * A list of stride information, stride year and number of years in stride, for every weather variable (`inWeathStrideInfo`)
  * A list of start/end years for every input weather file (`weathInStartEnd`)
  * Renamed old variable 'ncFileIDs' to 'ncDomFileIDs', which only holds nc file IDs from domain input files
SW_NETCDF_IN
- Gains the following modifications
  * Removed `varNC` and `InFilesNC`
  * Rename `ncVarIDs` to `ncDomVarIDs`, which only holds variable IDs from domain input variables
  * New variable - `readInVars` which specifies whether or not to read in a certain variable; the first index of every input key denotes whether there is any variable to be read in, which reduces redundant searching
  * New variable - `weathCalOverride` which specifies any calendar override for weather input
  * New variable - `inVarInfo`, similar to the nc outputs' `outputVarInfo`, stores important attributes the user provided about input variables
  * New variables - `units_sw` and `uconv` which hold SW2 units for a variable and converters from user-provided units to SW2 units (exactly the same as on the nc output side)
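A minimal sketch of how such a key loop could be used. `SW_NINKEYSNC` (7), `ForEachNCInKey`, and the role of the first `readInVars` entry come from the commit message above; the enum member names, the macro body, and `countActiveKeys()` are assumptions made only for illustration and may differ from the actual SOILWAT2 code.

```c
#include <assert.h>
#include <stddef.h>

#define SW_NINKEYSNC 7 /* value stated in the commit message */

/* Hypothetical names for the seven input keys */
enum InKeys {
    inDomain, inSpatial, inTopo, inSoil, inVeg, inWeather, inClimate
};

/* Assumed macro body, modeled on a plain counting loop */
#define ForEachNCInKey(k) for ((k) = 0; (k) < SW_NINKEYSNC; (k)++)

/* Count input keys that have at least one variable to read.
   Assumption: readInVars[k][0] is the per-key "anything to read" flag
   and a NULL row means the key was never allocated. */
int countActiveKeys(int *readInVars[SW_NINKEYSNC]) {
    int k;
    int n = 0;

    assert(readInVars != NULL);

    ForEachNCInKey(k) {
        if (readInVars[k] != NULL && readInVars[k][0]) {
            n++;
        }
    }
    return n;
}
```

Checking only the first entry of each key avoids scanning every variable flag when a whole key is switched off.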
- This commit introduces changes that make it possible to read an input spreadsheet (not added yet) that the user can use to let the program know about input variables/files they have provided through netCDF files
SW_netCDF_Output.c/h
- Removed allocation functions
  * alloc_outvars, alloc_unitssw, alloc_uconv, and alloc_outReq
SW_netCDF_General.c/h
- Convert the removed functions from SW_netCDF_Output.c into generalized functions which nc inputs will use
- Fix misplacement of the call to `SW_NCOUT_deepCopy()`
SW_netCDF_Input.c/h
- New local constant static arrays that specify
  * possible input keys/variables
  * expected SW2 input units
- New defines that are to be used to access certain attributes about input variables
- New function `SW_NCIN_read_input_vars()` (replacing the function `SW_NCIN_read()`)
  * Reads inputs from a provided spreadsheet and stores various points of information about input nc variables
  * Checks that required columns for variables are provided via the new functions `check_variable_for_required()` and `alloc_netCDF_domain_vars()`
  * Generates weather input file names (if weather variables are input) using the helper function `generate_weather_filenames()`
- New function `SW_NCIN_alloc_inputkey_var_info()`
  * Allocates all dynamically-allocated information within SW_NETCDF_IN, this includes
    - Specifying if certain input variables are to be read-in
    - Variable array
    - Weather input files/stride information/start and end years
    - SW2 units and converters
- New function `get_2d_input_key()` that translates a read-in key and variable name into indices SW can understand
- New function `alloc_overrideCalendars()` that allocates weather calendar override information
- Rename `SW_NCIN_deconstruct()` to `SW_NCIN_dealloc_inputkey_var_info()`
- Fill initialization/deallocation/deep copy of SW_NETCDF_IN
SW_Files.c
- Add initialization, deallocation, and deep copy of the new variables within SW_PATH_INPUTS
Other general changes
- Rename `SW_FILESTATUS_deepCopy()` to `SW_PATHOUT_deepCopy()`
- Uncomment the deep copying of the nc output variables
- Previously, when creating a domain netCDF file, the latitude/longitude names were hard-coded to "latitude"/"longitude" or "y" and "x" - Now, the user-provided names from the provided spreadsheet (not yet added) of the "x" and "y" axis will be used for every generated netCDF file * This includes "*_bnds" variables * An example of this would be - The user provides the latitude name "lat" - The generated netCDF files will then adjust the "bnds" variable to be "lat_bnds"
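The naming rule in this commit (axis name "lat" gives bounds variable "lat_bnds") can be sketched as below. `makeBndsName()` is a hypothetical helper written for this illustration; it is not claimed to be the function SOILWAT2 uses.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: derive the bounds-variable name "<axis>_bnds"
   from a user-provided axis name.
   Returns 0 on success and -1 if the result did not fit into `out`. */
int makeBndsName(const char *axisName, char *out, size_t outSize) {
    int n;

    assert(axisName != NULL && out != NULL);

    n = snprintf(out, outSize, "%s_bnds", axisName);

    return (n >= 0 && (size_t) n < outSize) ? 0 : -1;
}
```

Reporting truncation instead of silently cutting the name keeps a long user-provided axis name from producing a wrong variable name in the generated file.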
- error: variable 'renameDomainTemplateNC' of type 'Bool' can be declared 'const' (sw_testhelpers.cc) - Rename `renameDomainFile` to `renameDomainTemplateNC` to match old name (SW_Main.c)
- Renamed `check_variables()` to `check_input_variables()` - Previously the function `check_variable_for_required()` would check all weather input variables every time the wrapper function called it instead of a single weather variable - `check_input_variables()` now keeps track of how many weather variables were turned on within the spreadsheet * If not all weather variables are turned on or off, the function throws an error
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## release/devel_v8.1.0 #432 +/- ##
========================================================
+ Coverage   73.53%   73.68%   +0.14%
========================================================
  Files          20       20
  Lines        6326     6452    +126
========================================================
+ Hits         4652     4754    +102
- Misses       1674     1698     +24
- The function `SW_NCIN_read_inputs()` was returning incorrect lat/lon values within the domain nc file
- This is due to always using the variable identifier of 0, resulting in only getting the value of the domain variable
Solution:
- Set the variable identifier to -1 for every variable the function tries to get so that `SW_NC_get_single_val()` gets the correct variable identifier
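The effect of the stale identifier can be reproduced with a toy model. The sketch below assumes a getter that looks up the variable by name only when the passed identifier is negative; `getSingleVal()`, the name table, and the values are hypothetical stand-ins and do not show the real signature of `SW_NC_get_single_val()`.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for a netCDF file: three variables with one value each */
static const char *const varNames[] = {"domain", "lat", "lon"};
static const double varVals[] = {1.0, 35.5, -110.25};

/* Hypothetical getter: resolves the ID by name only if *varID is negative,
   otherwise trusts the ID it was given. Returns 0.0 if the name is unknown. */
double getSingleVal(int *varID, const char *name) {
    int i;

    assert(varID != NULL && name != NULL);

    if (*varID < 0) {
        for (i = 0; i < 3; i++) {
            if (strcmp(varNames[i], name) == 0) {
                *varID = i;
                break;
            }
        }
    }

    return (*varID >= 0) ? varVals[*varID] : 0.0;
}
```

With an identifier left at 0, a request for "lat" returns the value of the first variable ("domain"); resetting the identifier to -1 before each request forces the lookup and returns the latitude.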
- The stride information, number of weather files per variable and start/end years of each of these weather files should be the same
- Removed a dimension from the variables `weathInStartEnd` and `numInWeathFiles` within SW_NETCDF_IN
- Removed the variable `inWeathStrideInfo` from the struct; it is now local to the function `SW_NCIN_read_input_vars()`
  * Update the file deep copying, deconstruction, and initialization to match these changes
  * Update functions/function headers within `SW_netCDFIn.c` to match these changes - removing allocation and initialization
- Within `SW_NCIN_read_input_vars()` and `generate_weather_filenames()`, only set stride information and number of weather files once
  * For stride information, if a following weather variable is read-in and does not have the same information, fail
- Make the function `alloc_weath_input_info()` global so that `SW_F_deepCopy()` may use the functionality
- In a previous commit (9ffe886), the function `fill_netCDF_var_uint()` was brought back because of a segmentation fault of unknown cause when creating the domain nc file
- Further investigation showed that the variable "site" was being written as a double instead of an unsigned integer
  * This made the underlying functions unable to properly copy the value of an unsigned integer into a variable of type double, resulting in a segmentation fault
Solution:
- Create the variable "site" within the domain file as type unsigned integer
- Remove the existing function `fill_netCDF_var_uint()` and go back to using `SW_NC_write_vals()`
- Remove the checking that all weather inputs were input * Any weather input information can be input by the user * This general checking will be conducted later in the program to make sure that the user input contains the necessary weather information, not *all* weather input variables as previously checked for - New function - `check_inputkey_columns()` * Checks to see that if an input key is turned on, will check that all activated input variables within said key has the same information for a certain range of the input attributes
- Follow the existing approach that happens in `SW_NCOUT_read_output_vars()` where * Check to see if the input SW2 unit is what SW2 expects, then if not, warn the user * Always copy the hard-coded SW2 unit to the variable `units_sw` within SW_NETCDF_IN - Update the warning message to use the hard-coded units and nc variable name instead of the SW2 variable name
- When reading in a row with a variable name following the general scheme of "<veg>.*", where "<veg>" is actually part of the name
  * Previously, when reading in this line, the copied variable name did not change when copying to all four vegetation types (trees, shrubs, grasses, and forbs)
  * So the copied names would be "trees.*", "trees.*", "trees.*", and "trees.*"
  * Instead of "trees.*", "shrubs.*", "grasses.*", and "forbs.*"
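The intended expansion can be sketched as follows. `expandVegName()`, `NVEGTYPES`, and the order of `vegNames` are assumptions for this illustration (the order simply follows the list in the commit message and may not match the internal SOILWAT2 indices).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NVEGTYPES 4 /* hypothetical constant for this sketch */

/* Order follows the commit message; the internal order is an assumption */
static const char *const vegNames[NVEGTYPES] = {
    "trees", "shrubs", "grasses", "forbs"
};

/* Replace a leading "<veg>" placeholder with the name of vegetation type
   `vegIndex`. Returns 0 on success and -1 on a bad index, a pattern without
   the placeholder, or an output buffer that is too small. */
int expandVegName(
    const char *pattern, int vegIndex, char *out, size_t outSize
) {
    static const char placeholder[] = "<veg>";
    const size_t plen = sizeof placeholder - 1;
    int n;

    assert(pattern != NULL && out != NULL);

    if (vegIndex < 0 || vegIndex >= NVEGTYPES ||
        strncmp(pattern, placeholder, plen) != 0) {
        return -1;
    }

    n = snprintf(out, outSize, "%s%s", vegNames[vegIndex], pattern + plen);

    return (n >= 0 && (size_t) n < outSize) ? 0 : -1;
}
```

Calling the helper once per vegetation index yields four distinct names, which is the behavior the fix restores.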
error: narrowing conversion from 'unsigned int' to signed type 'int' is implementation-defined
- New local array - `generalSoilNames` which currently holds one variable to specify the general input variable `<veg>.transp_coeff`
- Update `get_2d_input_key()` and `SW_NCIN_read_input_vars()` to expect this as a possibility
  * Update individual variables `transp_coeff[SW_TREES]`... to `Trees.transp_coeff` ...
- Update units to a new format, e.g., cm3/cm3 --> cm3 cm-3
- Update fail message for when column values do not match within `check_inputkey_columns()` to show the attributes being compared
- Rename variables related to text input to the format `txt*`
- Rename variables related to nc input to the format `nc*`
- Rename output_prefix to outputPrefix to stick with the camel case names
- In the future, it will be important to have a tolerance when comparing the domain within `domain.nc` and a given input file - The user can control this value by the new input so they can modify it depending on the input files' domain coordinates
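A tolerance-based coordinate comparison of the kind described here could look like this minimal sketch. The function name and the use of an absolute tolerance are assumptions; the commit message does not state how the tolerance is applied.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical check: does a coordinate from an input file match the
   coordinate in domain.nc within an absolute tolerance `tol`? */
int coordsMatch(double domCoord, double inCoord, double tol) {
    assert(tol >= 0.0);

    return fabs(domCoord - inCoord) <= tol;
}
```

For example, `coordsMatch(35.5, 35.5000001, 1e-6)` is true while `coordsMatch(35.5, 35.6, 1e-6)` is false, so a user with coarsely rounded input coordinates would choose a larger tolerance.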
- Make the classification variable for a general variable more readable by renaming it from `isGenVar` to `isAllVegVar` - Rename `genVegInc` to `allVegInc` within `SW_NCIN_read_input_vars()` - Within `SW_NCIN_read_input_vars()` check to see if the input V-axis is what we would expect * If the variable has all four types, expect the value to not be "NA" - fail if "NA" * Otherwise, do not expect a value that is not "NA" - fail if anything other than "NA" - Ignore the V-axis column within `check_inputkey_columns()` for any variables that may have been input as a general variable for all vegetation types
- In the future, this file will be replaced with a .tsv file that the program will use for input information * Update `files.in` to use this file - As of this commit, the file has not been committed but will be at a later time - Removing these files and updating `files.in` simplifies development, reducing the need to update `files.in` after every pull
- previously, inputsProvideSWRCp was not correctly set from the input text because intRes was not updated
- the new inputsProvideSWRCp indicates if SWRC parameters are obtained from input files or estimated via a pedotransfer function - the existing site_has_swrcpMineralSoil indicates if SWRC parameters are currently available or not
…oil depth - "siteparam.in": transpiration regions are now inputted as matrix of transpiration region id and that region's lower soil depth (previously soil layer) - derive_TranspRgnBounds() is renamed from derive_soilRegions() - derive_TranspRgnBounds() determines transpiration regions by soil layer from transpiration region soil depth - SW_SIT_init_run() is now calling derive_TranspRgnBounds() - SW_SIT_read() is now handling transpiration region depths as provided by the updated "siteparam.in"
…operties as inputs via netCDFs - expanded on existing checks previously located in get_invar_information() - new checkRequiredSoils() is called by new SW_NCIN_check_input_config() -- instead by SW_NCIN_read_input_vars() because this is called by SW_CTL_setup_domain(), i.e., before we read in configuration information including "hasConsistentSoilLayerDepths" and "inputsProvideSWRCp" - new checkRequiredSoils() checks that: Required soil properties if not constant soils: 1) one out of {width, depth} 2) soil density 3) gravel 4) two out of {sand, silt, clay} 5) som 6) evaporation coefficients 7) transpiration coefficients 8) SWRCp Soil properties that are not required (value of 0 will be assumed if missing): 9) impermeability 10) initial soil temperature
… among netCDF inputs - initialize soils and soil counts - derive number of soil layers - check consistency between depth and width if both provided - check consistency between sand, silt, and clay if all provided - if !hasConstSoilLyrs, then * calculate depth if width provided but not depth * calculate width if depth provided but not width * calculate sand if clay and silt provided but not sand * calculate clay if sand and silt provided but not clay * set impermeability to 0 if not provided * set avgLyrTempInit to 0 if not provided
src/SW_netCDF_Input.c:6236:9: runtime error: index -1 out of bounds for type 'size_t[4]' (aka 'unsigned long[4]') src/SW_netCDF_Input.c:6243:13: runtime error: index -1 out of bounds for type 'size_t[4]' (aka 'unsigned long[4]') -> only use timeIndex and pftIndex if they are not negative
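The guard described in the last line of this message can be sketched as below. The function and parameter names are hypothetical; only the idea (a negative index means the dimension is absent and must not be used to index the `size_t[4]` arrays) comes from the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fill the time and PFT entries of the start/count
   arrays used for a netCDF read. An index of -1 means the variable has no
   such dimension, so the arrays are left untouched for it. */
void setOptionalDims(
    size_t start[4],
    size_t count[4],
    int timeIndex,
    int pftIndex,
    size_t timeStart,
    size_t timeCount,
    size_t pftCount
) {
    assert(start != NULL && count != NULL);

    if (timeIndex >= 0 && timeIndex < 4) {
        start[timeIndex] = timeStart;
        count[timeIndex] = timeCount;
    }

    if (pftIndex >= 0 && pftIndex < 4) {
        start[pftIndex] = 0;
        count[pftIndex] = pftCount;
    }
}
```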
- Expanded documentation including hasConsistentSoilLayerDepths, nMaxSoilLayers, nMaxEvapLayers and depthsAllSoilLayers
- Clarification of how soil layer number is determined from nc-inputs:
  * txt-mode: layers provided in "soils.in" and "swrc_params.in"
  * nc-mode: the first n layers with depth and width/thickness that are neither zero nor missing (if provided as input)
- Clarification of the implications of `hasConsistentSoilLayerDepths` in nc-mode --> "siteparam.in" has now improved comments
  * if depth/thickness of soil layers vary among sites/gridcells, then soils are only created from nc-inputs; nc-outputs use soil layers as vertical axis
  * if depth/thickness of soil layers is constant among sites/gridcells (number of soil layers may vary), then nc-inputs complement the soil information from text inputs but text inputs must contain the maximum number of soil layers; nc-outputs use soil layer depths as vertical axis
- get_invar_information() no longer checks that size of the vertical dimension is consistent across soil nc-inputs if `hasConsistentSoilLayerDepths` -- this is now replaced with checks by SW_NCIN_soilProfile() and read_soil_inputs() with finer control
- SW_NCIN_soilProfile() gains arguments "numSoilVarLyrs", "default_n_layers", and "default_depths" and loses "ncInFiles"
  * No need anymore to read from nc files because relevant information, i.e., size of the vertical dimension, has already been read by an earlier call to get_invar_information()
  * Determines nMaxSoilLayers depending on `hasConsistentSoilLayerDepths` (from text input "soils.in" or from nc-inputs "soilLayerDepths" or "soilLayerWidths")
  * Checks that sufficient soil layers are provided by inputs
- get_nconsistent_soil_layers() and get_max_inconsistent_soil_layers() are no longer necessary because relevant information, i.e., size of the vertical dimension, is read by an earlier call to get_invar_information()
- read_soil_inputs() gains argument "depthsAllSoilLayers"
  * The number of soil layers is inferred from the first n layers with depth and width/thickness (if provided as input) that are neither zero nor missing
  * If `hasConsistentSoilLayerDepths`, then the function checks that depth values are consistent with the default soil depths (from text input "soils.in")
  * A maximum of `nMaxSoilLayers` are read along the vertical dimension of soil nc-inputs
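The rule for inferring the number of soil layers ("the first n layers with depth and width/thickness that are neither zero nor missing") can be sketched as follows. The function names, the use of a `NULL` pointer for "not provided as input", and treating both NaN and a fill value as missing are assumptions made for this illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical test for a missing value: NaN or equal to the fill value */
static int isMissingOrZero(double x, double fillValue) {
    return isnan(x) || x == fillValue || x == 0.0;
}

/* Count the leading layers for which every provided property (depth and/or
   width) is neither zero nor missing. A NULL array means that property was
   not provided as input and is therefore ignored. */
int countSoilLayers(
    const double *depth, const double *width, int nMax, double fillValue
) {
    int n = 0;

    assert(nMax >= 0);

    while (n < nMax) {
        if (depth != NULL && isMissingOrZero(depth[n], fillValue)) {
            break;
        }
        if (width != NULL && isMissingOrZero(width[n], fillValue)) {
            break;
        }
        n++;
    }

    return n;
}
```

Counting stops at the first invalid layer, so valid-looking values below a gap are not treated as additional layers.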
- one warning in total instead of a warning for each soil layer is now produced if evco or trco are normalized
- Running SOILWAT2 for the test example with txt-mode or nc-mode should result in identical output values (see `tools/check_outputModes.sh`) - "SW2_netCDF_input_variables.tsv": tab-separated spreadsheet for the user to provide necessary information to obtain inputs from netCDFs for each SOILWAT2 input variable * SW_NCIN_read_input_vars() is reading the spreadsheet * The values of column "SW2 input group" are matched against "expectedColNames" from SW_netCDF_input.c * The values of column "SW2 variable" are matched against "possVarNames" from SW_netCDF_input.c * The values of column "SW2 units" are matched against "swInVarUnits" from SW_netCDF_input.c - New netCDFs that provide inputs for the example (climate, weather, soils, vegetation, topography); they are designed to contain the same information as the text input files
- 40 test runs plus 4 based on external data sources (which need to be downloaded separately)
---
Create, execute, and check a set of nc-based SOILWAT2 test runs
  * Comprehensive set of spatial configurations of simulation domains and input netCDFs (see 'tests/ncTestRuns/data-raw/metadata_testRuns.csv')
  * (if available) Runs with daily inputs from external data sources including Daymet, gridMET, and MACAv2METDATA
  * One site/grid cell in the simulation domain is set up to correspond to the reference run (which, by default, is equivalent to tests/example); the output of that site/grid cell is compared against the reference output
Usage: tools/check_ncTestRuns.sh [ACTIONS] [OPTIONS]
Actions:
  all                          Combine options 'create' and 'check'; 'all' is the default if no actions are specified.
  create                       Create template for selected test run(s).
  check                        Execute SOILWAT2 and check selected test run(s).
  clean                        Remove selected test setup and run(s).
  cleanTestRuns                Remove selected test run(s).
  cleanReference               Remove the default reference run.
  downloadExternalWeatherData  Runs 'wget' scripts to download daily inputs from external data sources including Daymet, gridMET, and MACAv2METDATA
Options:
  -ref, --reference <path/to/reference/output>
      Path to reference run output. If the default is selected as reference and output is missing, then SOILWAT2 will be run to create reference output. Default is 'tests/ncTestRuns/results/referenceRun/Output'
  -t --testRun <test number>
      Perform action(s) on selected test run(s); default is '-1', i.e., all test runs.
nc-soil inputs - Clarify soil layers, depth, and width/thickness in nc-mode - Clarification of how soil layer number is determined from nc-inputs: - Clarification of the implications of hasConsistentSoilLayerDepths in nc-mode - Checks if user is providing required soil properties as inputs via netCDFs - Capability to derive missing soil properties among netCDF inputs - Complete example inputs for nc-mode - ncTestRuns: new framework to comprehensively test nc-based SOILWAT2 * Comprehensive set of spatial configurations of simulation domains and input netCDFs
…_sw() - SW_CTL_run_sw() now calls SW_CTL_init_run() after call to SW_NCIN_read_inputs() - previously, SW_CTL_init_run() was called by main() and then SW_NCIN_read_inputs() called individual SW_*_init_run() functions (again) - SW_LYR_read() now needs to determine n_evap_lyrs and soils.depths[] which may be used as template values by SW_DOM_soilProfile() if hasConsistentSoilLayerDepths -> this commit allows the user to provide values in text-input files that would fail checks (mainly by SW_SIT_init_run()) if the values are not correctly overwritten by nc-inputs
Resolved problem: repeated calls to `tools/check_ncTestRuns.sh -t 30` caused repeated addition of a 100-cm and a 120-cm deep soil layer to "soils.in" even if they already were present
`nc_get_var_string()`: - When checking for correct PFT input strings, create pointers for the resulting read strings - Initialize to NULL instead of having statically-sized memory * `nc_get_var_string()` seems to allocate memory - Free allocated memory at the end of the function `SW_F_deconstruct()` - Free memory for missed variable within SW_PATH_INPUTS
- Update `get_read_start()` to set latitude and longitude to indices 0 and 1, respectively - Initialize temporary silt to 0
- error: comparison of integer expressions of different signedness: 'size_t' {aka 'long unsigned int'} and 'int' - error: Value stored to 'latIndex' is never read - error: Value stored to 'lonIndex' is never read
- fix explicit link request to 'possVarNames' could not be resolved - fix explicit link request to 'numVarsInKey' could not be resolved
- SW_VPD_read() previously used sw_strtof() in a few cases instead of the correct sw_strtod() --> this slightly affects simulation output - SOILWAT2 no longer calls sw_strtof() -> removed no longer used sw_strtof() sw_strtof() was introduced on 2024-July-26 with commit 813c0b9 "Reorganize errno checks (again)" and we changed all floats to doubles on 2024-July-30 with commit 62237ae "Fix narrowing-conversions (part 2b - floats): slightly different outputs"; however, that commit failed to reflect the change by replacing calls to sw_strtof() with correct sw_strtod()
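Why the choice of parser affects output can be seen with the standard C functions directly. The two wrapper names below are hypothetical and only stand in for the behavior of `sw_strtof()` and `sw_strtod()`, whose actual signatures are not shown in this commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse as single precision, then widen: the value was already rounded
   to float and the lost digits do not come back */
double parseAsFloat(const char *s) {
    assert(s != NULL);
    return (double) strtof(s, NULL);
}

/* Parse directly as double precision */
double parseAsDouble(const char *s) {
    assert(s != NULL);
    return strtod(s, NULL);
}
```

For a decimal such as "0.1", which has no exact binary representation, the two results differ in roughly the eighth significant digit, whereas a value such as "0.5" is exact in both types and parses identically. Differences of this size are what "slightly affects simulation output" refers to.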
- Compare netCDFs with specified tolerances (c. 1.5e-8 for double and, equivalently, c. 8.5e-5 for float typed netCDF input test runs) - Round inputs to "climate.nc" -- as done for all other nc-inputs already (does not have impact on netCDF content) - Input values for SWRC parameters * Turn off SWRC parameter inputs except for dedicated tests (text-based version by default calculates parameters internally from soil properties using a pedotransfer function) -- even though they were activated in tsv-inputs, the input value for inputsProvideSWRCp (from siteparam.in) turned them off (in favor of calculating the values with a pedotransfer function) * New ncTestRuns that use netCDF inputs for SWRC parameters; the input values for SWRC parameters are now being calculated with the pedotransfer function that is being used internally (instead of copying stored values)
- Updated swrcp.in * Reflecting bugfix commit "Fix ksat estimation by SWRC_PTF_Cosby1984_for_Campbell1974()" 2351511 on Sep 6, 2024 * Calculate and store SWRCp at full internal precision - Updated veg2.nc and vegPFT.nc * Reflecting bugfix commit "SW_VPD_read() now uses sw_strtod() instead of sw_strtof()" f7a6f03 on Dec 20, 2024
Just about there!
… activated - clarify description of swrc-related inputs in "siteparam.in" - fix "swrcp" ncTestRuns
github review process has become confusing