HowTo
[[TOC]]
If you work on a Unix computer, you are probably already familiar with the shell (skip to Download LPJmL). As a Windows user connecting to the cluster, you might first need to familiarize yourself with some shell commands: Linux-Terminal-Commands
We currently host the LPJmL code on GitHub. For an introduction to GitHub, see [the git userguide](git userguide).
You should now have a copy of the LPJmL code. If you enter the folder (LPJROOT-folder) and list the content (ls), it should look somewhat like this:
$ ls
AUTHORS COPYRIGHT input_crumonthly.conf lpj.conf magic.mgc R
bin doc input_fms.conf lpjml.conf Makefile README
config html input_netcdf.conf lpjml_fms.conf man REFERENCES
configure.bat include INSTALL lpjml_image.conf par src
configure.sh input.conf LICENSE lpjml_netcdf.conf param.conf VERSION
Most of the code resides in the folder “src”. Parameters can be found in
“par”. The *.conf files configure the model. “bin” contains
the executable files that the compiler creates.
The code is written in the programming language C (files that end with
“.c”), which can be read and modified by humans. To run the program on
a computer, you need to translate it to machine code, which is done with
a compiler. The compiler translates the code specifically for the machine
you want to run it on. This has two implications: you need to
recompile every time you change something in the code (except for
parameters that are not hardcoded but read at runtime) and every time
you want to run it on a new machine.
There are several dependencies on standard libraries and compiler settings. Please consult configure.sh
and the Makefile templates in the folder config,
and adjust these to your local setup.
Next, on Linux-based systems, go to your LPJROOT directory. This is the
folder where your LPJmL code resides (Get the model
running).
Run configure.sh, which configures your Makefile.inc for your system.
Remember that all compiled executables are specific to the machine on
which they have been compiled.
./configure.sh
If the configure script exits with the message “Unsupported operating system”,
Makefile.$osname is created from Makefile.gcc and probably has to be
modified for your operating system/compiler.
If the configure script finds an MPI environment, a parallel version of
lpjml is built.
The configure script creates a copy of one of the following OS-specific
makefiles from the directory config:
Makefile.aix - IBM AIX settings (xlc compiler)
Makefile.aix_mpi - IBM AIX and MPI environment
Makefile.gcc - GNU C-compiler settings
Makefile.darwin_gcc - GNU C-compiler settings for MacOS X
Makefile.intel - Intel C-compiler settings
Makefile.intel_mpi - Intel C-compiler and Intel MPI settings
Makefile.cluster2015 - Intel C-compiler and Intel MPI on HLRS2015 cluster at PIK
Makefile.mpich - GNU C-Compiler and MPI Chameleon settings
Makefile.win32 - Windows settings (used by configure.bat)
Run
make
to compile just the LPJmL exe (will be stored in the bin subfolder) or
make all
to also compile all the utility programs (their executables are also stored in
the bin folder). Libraries from the individual sub-directories are placed in
the lib directory.
Run
make clean
to remove all object files and libraries if you want a fresh
start before running make or make all.
Compilation can be sped up by running several make jobs in parallel (but you should
avoid using more jobs than there are threads available on one node):
make -j16 all
This is the setting for the cluster (which has 16 threads per node); use -j2 for a 2-core local machine, and so on.
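As a small sketch of this advice, the following shell function caps the number of parallel make jobs at the number of CPUs the current machine reports. The function name `pick_jobs` and the default cap of 16 (taken from the cluster example) are illustrative assumptions and not part of LPJmL.

```shell
#!/bin/sh
# pick_jobs [MAX_JOBS]: print a -j value that does not exceed the CPUs
# reported for this machine. MAX_JOBS defaults to 16 (cluster example).
pick_jobs() {
    max_jobs="${1:-16}"
    # getconf _NPROCESSORS_ONLN is widely available; fall back to 1 otherwise
    cpus="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
    # guard against empty or non-numeric answers
    case "$cpus" in
        ''|*[!0-9]*) cpus=1 ;;
    esac
    if [ "$cpus" -lt "$max_jobs" ]; then
        echo "$cpus"
    else
        echo "$max_jobs"
    fi
}

# Print the command instead of running it, so it can be inspected first.
echo "make -j$(pick_jobs 16) all"
```

Run the printed command from your LPJROOT directory after `./configure.sh` has finished.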
For compiling on Windows, please see Compilation-on-Windows
- Error 2 during “make clean”
/bin/sh: 1: cd: can't cd to ../../lib
You receive this error if you run make clean after some folders have already been removed manually. To resolve it, simply recreate the missing folder by hand, e.g. “mkdir lib” for the lib folder.
- Error during make lpjliveview
  - edit “config/Makefile.intel_mpi”
  - remove “-lxcb-xlib” in line 36: X11LIB = -L/usr/X11R6/lib64 -lX11 -lxcb -lxcb-xlib -lXau
The model is now ready to run, but before you run it you should adapt its parameters to your needs. The three main files for this are lpjml.conf (general setup, e.g. settings or start and stop year), input.conf (setup of regional input files such as climate or landuse patterns) and param.conf (global model parameters).
Note that the model code does not include any input files like climate or
landuse patterns. Unless you run LPJmL on the PIK infrastructure, you will
have to create all required input files yourself and update the paths to
these files in the respective input.conf file.
* change lpjml.conf to specify
**** the location of parameter files and part of the input data
/*===================================================================*/
/* II. Input parameter section */
/*===================================================================*/
#include "param.conf" /* Input parameter file */
/*===================================================================*/
/* III. Input data section */
/*===================================================================*/
#include "input_crumonthly.conf" /* Input files of CRU dataset */
#if defined(WITH_WATERUSE) && defined(WITH_LANDUSE)
CLM2 /p/projects/lpjml/input/historical/input_VERSION2/wateruse_1900_2000.bin /* water consumption for industry,household and livestock */
#endif
**** the names and location of output files (be sure they match the number and order of outputfiles specified in conf.h, see below for further information on output files)
/*
ID Fmt filename
------------------- --- ----------------------------- */
GRID RAW output/grid.bin
FPC RAW output/fpc.bin
...
**** the spin-up period (number of years the model is run towards equilibrium with constant climate)
5010 /* spinup years */
**** the cells to be computed (“ALL”, “singlecellnumber”, or “startcellnumber endcellnumber”, where cellnumbers range from 0 to 67419)
ALL /* 27410 67208 60400 all grid cells */
**** the start/endyear of the simulation
1901 /* first year of simulation */
1901 /* last year of simulation */
/*===================================================================*/
/* V. Run settings section */
/*===================================================================*/
ALL /* 27410 67208 60400 all grid cells */
#ifndef FROM_RESTART
5000 /* spinup years */
/* exclude next line in case of 0 spinup years */
30 /* cycle length during spinup (yr) */
1901 /* first year of simulation */
1901 /* last year of simulation */
NO_RESTART /* do not start from restart file */
RESTART /* create restart file: the last year of simulation=restart-year */
restart/restart_1840_nv_stdfire.lpj /* filename of restart file */
1840 /* write restart at year; exclude line in case of no restart to be written */
#else
390 /* spinup years */
/* exclude next line in case of 0 spinup years */
30 /*cycle length during spinup (yr)*/
1901 /* first year of simulation */
2011 /* last year of simulation */
RESTART /* start from restart file */
restart/restart_1840_nv_stdfire.lpj /* filename of restart file */
RESTART /* create restart file */
restart/restart_1900_crop_stdfire.lpj /* filename of restart file */
1900 /* write restart at year; exclude line in case of no restart to be written */
#endif
**** …
Generally you need three different runs; in some cases, runs 2 and 3 can be combined.
- potential natural vegetation (pnv) spinup: ~5000 years to fill the
carbon pools (SPINUPYEARS=5000, STARTYEAR=1901, STOPYEAR=1901)
- no landuse, no wateruse, no irrigation, no reservoirs, riverrouting enabled, no read restart, write restart, no fixed sowing dates
- landuse spinup: 1700-simulation period start (1700-1999) - to have
realistic soil property changes through past agricultural use for
all landuse cells
- irrigation, landuse, riverrouting, wateruse, reservoirs, read restart, write restart
- if your climate input starts only in 1901 or later, you have to use the spinup years to cover the landuse input from earlier years (SPINUPYEARS=201, STARTYEAR=1901, STOPYEAR=1999)
- actual run: simulation period (e.g. 2000-2100)
- irrigation, landuse, riverrouting, wateruse, reservoirs, read restart, no write restart
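The SPINUPYEARS=201 value in the landuse spinup example is simply the gap between the first landuse year (1700) and the first climate year (1901). A minimal sketch of that arithmetic follows; the function name `spinup_years` is hypothetical and only illustrates the calculation.

```shell
#!/bin/sh
# spinup_years LANDUSE_START CLIMATE_START
# Number of spinup years needed so that landuse input from before the
# first climate year is still used during the spinup.
spinup_years() {
    landuse_start="$1"
    climate_start="$2"
    if [ "$climate_start" -le "$landuse_start" ]; then
        # climate input already reaches back to the landuse start:
        # no extra spinup years are needed for this purpose
        echo 0
    else
        echo $((climate_start - landuse_start))
    fi
}

spinup_years 1700 1901   # prints 201, as in the example above
```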
In Configuration_files you find an example of what needs to be changed in one specific case, but yours will most probably differ.
See these pages for more information: Input | Output | Parameter (these pages might be outdated: new parameters may have been added, or existing ones changed)
- Go to your LPJROOT folder.
- First check whether your lpjml configuration file is consistent and all files are present; lpjcheck does that for you: `./bin/lpjcheck lpjml.conf`
- If you did not create an output folder yet, you can do that via `mkdir`: `mkdir restart` for the spinup run and `mkdir output` for the transient (main) run. If you want specific folder names, e.g. for multiple runs, you can insert `#define output my_output_folder` in lpjml.conf in the definitions at the top.
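The check and the folder creation can be combined in one small helper, sketched below. The function name `prepare_run` is hypothetical, and the sketch assumes that lpjcheck signals a problem through a non-zero exit status; verify this on your installation before relying on it.

```shell
#!/bin/sh
# prepare_run CONFIG [DIR...]
# Check the configuration first, then create the folders for the run.
# Must be called from the LPJROOT folder.
prepare_run() {
    config="$1"
    shift
    # stop here if the check fails (assumes a non-zero exit status on error)
    ./bin/lpjcheck "$config" || return 1
    for dir in "$@"; do
        mkdir -p "$dir"   # -p: no error if the folder already exists
    done
}
```

For example, `prepare_run lpjml.conf restart output` checks lpjml.conf and then creates both folders.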
* On the cluster load the lpjml module (if not already loaded) - on other machines make sure the necessary tools are available.
module load lpjml
* Source the lpj_paths file:
. ./bin/lpj_paths.sh
(short for “source ./bin/lpj_paths.sh”) This makes the paths of your LPJ folder available in your current shell session. Don’t forget the first dot!
**** If lpjml is started from a different directory than the root directory, omit sourcing the lpj_paths file, and set environment variable LPJROOT manually:
export LPJROOT=<lpjml root directory>
- Submit the run to the cluster-slurm-queue:
./bin/lpjsubmit_slurm -group grpname -blocking 16 ntasks [-DFROM_RESTART] lpjml.conf
- grpname might be open/macmit/biodiv
- `-blocking 16` reserves whole nodes, which is more stable but takes longer to start the run; otherwise the tasks may be spread over many nodes.
- ntasks can be a multiple of 2: 64, 128, 256, … The more tasks you use, the faster the model runs (parallelization), but the longer the job waits to start, because you request more resources. 100 years with 256 tasks take approx. 15 min (Jan 2018).
- if you know the job will not take much time, it is useful to limit the processing time, for example by adding `-wtime 5:00:00`; this will sometimes shorten the waiting time :-)
- if you want to start from a restart file, include the `-DFROM_RESTART` part, otherwise omit it
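To avoid retyping the submit line for spinup and restart runs, the command can be assembled by a small helper. This is only a sketch: the function name `submit_cmd` and its argument order are assumptions made for illustration, and it prints the command rather than submitting anything. The options themselves are the ones shown in the submit line above.

```shell
#!/bin/sh
# submit_cmd GROUP NTASKS CONFIG [restart]
# Print the lpjsubmit_slurm command line; pass "restart" as the fourth
# argument to add -DFROM_RESTART.
submit_cmd() {
    group="$1"
    ntasks="$2"
    config="$3"
    mode="${4:-}"
    cmd="./bin/lpjsubmit_slurm -group $group -blocking 16 $ntasks"
    if [ "$mode" = "restart" ]; then
        cmd="$cmd -DFROM_RESTART"
    fi
    echo "$cmd $config"
}

submit_cmd open 256 lpjml.conf           # spinup run, no restart file read
submit_cmd open 256 lpjml.conf restart   # run that starts from a restart file
```

Copy the printed line into your shell once it looks right.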
More information on how to run the model can be found here: running_lpjml
The slurm management system calculates a priority rating for your submission. How long you have to wait until your run starts depends on the current workload of the cluster, the number and rating of jobs already submitted, and the rating of your run. The better your rating, the sooner your run starts. But how can you influence your priority rating?
- You can reduce the waiting time by disabling blocking: remove `-blocking 16`. Your tasks may then be spread over many nodes, which allows them to start sooner but can make the run take longer. The variation in runtime also increases.
- Setting `ntasks` to a lower value lets the run start sooner but run longer; find a setup that suits you.
- Find a `-wtime` that lets the simulation finish within the time limit (mind the variation due to non-blocking) but is as short as possible, so that the run starts earlier. Caution: if you are too greedy/optimistic, your run is cancelled and you have to queue again, which probably won’t be worth it!
- Choose the right partition! Sometimes one partition (e.g. “ram_gpu”) has a lower workload than “standard” or “broadwell”. You can check the load of a partition with the “sinfo” command ([[Linux-Terminal-Commands#Slurm-management-system|Slurm Management System]]). If a partition has more idle nodes than you need, go! Otherwise (if there are mixed nodes) you might consider deactivating blocking.
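The partition check can be scripted with standard sinfo options (`-h` suppresses the header, `-p` selects the partition, `-t idle` filters by state, `-o "%D"` prints node counts). The sketch below sums the idle nodes of one partition; the function name `idle_nodes` is hypothetical, and the partition names are the ones mentioned in the list above.

```shell
#!/bin/sh
# idle_nodes PARTITION: print the number of idle nodes in a partition.
# sinfo may print several lines (one per node group), so sum them up.
idle_nodes() {
    sinfo -h -p "$1" -t idle -o "%D" | awk '{ n += $1 } END { print n + 0 }'
}
```

For example, `idle_nodes standard` prints 0 when no node of that partition is idle.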
The slurm submission script is configured to send you an email when the
simulation finishes. The email shows the status of the run
when it stopped (COMPLETED, CANCELLED, FAILED, OUT_OF_MEMORY).
For more information, check the error and output files lpjml.%i.err and
lpjml.%i.out for hints on what went wrong.
Some ideas:
- Do the input files cover the period between your start and end year?
- balanceW/C Error:
  - The water/carbon balance is not correct -> the threshold in src/lpj/check_fluxes.c:138 can be adjusted.
  - It is better to solve the underlying problem than to suppress the error by commenting out the line or increasing the threshold!
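When several runs have finished, it helps to see at a glance which error files contain anything at all. The following sketch lists the non-empty lpjml.*.err files in a folder; the function name `err_files_with_content` is hypothetical. A non-empty file does not necessarily mean the run failed, so treat the result only as a pointer to where to look first.

```shell
#!/bin/sh
# err_files_with_content [DIR]: list lpjml.*.err files that are not empty.
# DIR defaults to the current folder.
err_files_with_content() {
    dir="${1:-.}"
    for f in "$dir"/lpjml.*.err; do
        # -s is true only for existing files larger than zero bytes;
        # this also skips the unexpanded pattern when nothing matches
        if [ -s "$f" ]; then
            echo "$f"
        fi
    done
}
```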
By default, the output files are saved to the “output” directory in your lpj folder. They are saved as plain binary data, which means that you cannot look at them directly but need a tool to extract the values. The output files do not contain any header; the order of the output variables (for some output files) can be found in Output.
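For a first look at the raw numbers without a dedicated tool, the standard `od` utility can decode a few values. The sketch below assumes that the file holds 4-byte floating point values in the byte order of your machine; this is an assumption and does not hold for every output file, so check the Output page for the actual layout. The function name `peek_floats` is hypothetical.

```shell
#!/bin/sh
# peek_floats FILE [N]: print the first N values of a headerless binary
# file, interpreting them as 4-byte floats (assumption, see text above).
peek_floats() {
    file="$1"
    count="${2:-10}"
    # -A n: no address column, -t f4: 4-byte floats, -N: bytes to read
    od -A n -t f4 -N $((4 * count)) "$file"
}
```

For example, `peek_floats output/fpc.bin 5` prints the first five values under that assumption.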
As a rough first check of whether everything worked properly, you can have a
look at the *.out file that is created during the run. If you have not
changed the settings, it resides in the LPJROOT directory.
It starts with a long list of parameters and settings, and then
displays important global carbon and water
fluxes for every modelled year.
First year: 2000
Last year: 2109
Number of grid cells: 67420
==============================================================================
Simulation begins...
Carbon flux (GtC) Water (km3)
--------------------------------------- ---------------------------------------------
Year NEP estab fire harvest total transp evap interc wd discharge
------ ------- ------- ------- ------- ------- ---------- ------- ------- ------- ----------
2000 18.798 0.217 3.206 12.473 3.336 43448.8 13646.3 7618.8 981.8 58945.5
2001 16.248 0.218 3.487 11.756 1.223 42669.1 13180.4 7334.2 989.8 55353.3
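To follow one flux over time, the yearly rows can be pulled out of the *.out file with awk. The sketch below keeps only lines with eleven columns whose first column is a number, which matches the layout of the excerpt above, and prints the year and NEP. The function name `yearly_nep` is hypothetical, and the filter may need adjusting if other lines of your file happen to have the same shape.

```shell
#!/bin/sh
# yearly_nep FILE: print "year NEP" for each yearly row of a *.out file.
yearly_nep() {
    # yearly rows have 11 columns and start with the year;
    # the header and separator lines are skipped by the numeric test
    awk 'NF == 11 && $1 ~ /^[0-9]+$/ { print $1, $2 }' "$1"
}
```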
Switch flags on by calling lpjml -DFLAG, e.g. lpjml -DISRANDOM
- FROM_RESTART: A restart file can be generated at the end of an LPJ run (see [Setup the parameters for the model](HowTo#Setup the parameters for the model)). Set this flag to restart from this point (which might be useful e.g. to start multiple runs from the same spin-up run).
Compilation of LPJmL is customized by defining macros in the LPJFLAGS
section of “Makefile.inc”. Afterwards you have to call “make clean” and “make all”
to recompile.
LPJFLAGS= -Dflag1 ...
Flag Description
------------------- ------------------------------------------------------------
COUPLING_WITH_FMS enable coupling to FMS
DAILY_ESTABLISHMENT Enable daily establishment
DEBUG diagnostic output is generated for debugging purposes
DOUBLE_HARVEST adding correct sequencing of harvest events
IMAGE include coupler to IMAGE model
LINEAR_DECAY use linearized functions for litter decay
MICRO_HEATING Enable microbial heating
SAFE code is compiled with additional checks
STORECLIMATE store climate data in memory for spin up phase
USE_MPI compile parallel version of LPJmL
USE_NETCDF enable NetCDF input/output
USE_NETCDF4 enable NetCDF version 4 input/output
USE_RAND48 use drand48() random number generator
USE_UDUNITS enable unit conversion in NetCDF files
WITH_FPE floating point exceptions are enabled for debugging purposes
LANDUSE includes all land-use specific routines, especially the memory allocation to the cropdates and the landuse structure (only needed in the LPJ_SPEEDY branch)
NEW_GRASS implementation of grasses (e.g. for bioenergy-grasses) - only for LPJmL 3.X - for LPJmL 4.0 and later it is part of the standard config
------------------- ------------------------------------------------------------
To run with EFRs, …
See Debugging
See model documentation for navigating through the model structure.
Check out the LPJmL4 model documentation publications for more
details:
http://dx.doi.org/10.5194/gmd-11-1343-2018
http://dx.doi.org/10.5194/gmd-11-1377-2018