## What is Datarmor?
Datarmor is the supercomputer of the Pôle de Calcul et de Données Marines (PCDM), hosted by Ifremer. It provides storage, data access, computational resources, visualization and support services for marine science.
## Applications
- Parallel computing (MPI, Fortran, C/C++)
- Data Analysis (Python, R, Matlab, Julia)
- JupyterHub (Notebooks, Jupyterlab)
- Machine Learning (TensorFlow)
- Data visualisation (VisIt, VMD)
## Creating an account
To create an account, send me an email (nicolas.barrier\@ird.fr) with:
- An **institutional** email address
- A phone number
- The names of your supervisors (if any)
- A short description of why you want to use Datarmor (projects, tools, etc.)
Support questions about Datarmor should be sent to [assistance\@ifremer.fr](mailto:assistance@ifremer.fr)
## Connection: Pulse Secure
Outside the Ifremer network, Pulse Secure is required. It can be downloaded [here](https://domicile.ifremer.fr/index.php/s/WC9GArY8Eo51yZE/,DanaInfo=cloud.ifremer.fr,SSL+download?path=%2F&files).
Install the version matching your OS:
- Windows 64: `PulseSecure.x86.msi`
- Mac Os X 64: `PulseSecure.dmg`
- Ubuntu 20.04: `pulsesecure_9.1.R12_amd64.deb`
- Ubuntu 18.04: `pulse-9.1R8.x86_64.deb`
## Connection: Pulse Secure
Now set up a new connection as follows:

Connect to the Datarmor VPN using your **extranet** credentials and leave the connection open until you have finished.
## Connection: Terminal (Linux / Mac Os X)
For Linux/Mac Os X users, open a terminal and type:
```
ssh -X [email protected]
```
replacing `nbarrier` with your **intranet** login. The `-X` option enables X11 forwarding for graphical display (useful for graphical text editors, for instance). If `-X` does not work, use `-Y`.
> For Mac Os X users, I recommend installing and using the [iTerm2](https://iterm2.com/) terminal application, which is more user-friendly than the default one.
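To avoid retyping the full address every time, you can also define a shortcut in your local `~/.ssh/config` file (the alias `datarmor` and the login `nbarrier` below are placeholders to adapt):

```
Host datarmor
    HostName datarmor.ifremer.fr
    User nbarrier
    ForwardX11 yes
```

You can then connect with just `ssh datarmor`.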
## Connection: Putty or MobaXTerm (Windows)
For Windows users, it is recommended to use [Putty](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html) or [MobaXTerm](https://mobaxterm.mobatek.net/).

## Connection: Putty (Windows)
To allow display, you need to enable X11 forwarding in the `Connection > SSH` menu:

## SSH keys (Linux / Mac Os X)
To connect to Datarmor without typing your password, you need an SSH key (here, an ED25519 key). First, check whether one already exists on your local computer:
```
ls $HOME/.ssh/id_ed25519.pub
```
If no such file exists, generate one using
```
ssh-keygen -t ed25519
```
and follow the instructions. Then send the public key to Datarmor (your intranet password is needed):
```
ssh-copy-id [email protected]
```
Use your login instead of `nbarrier`. Now you should be able to connect to Datarmor without typing your password.
## Navigating on Datarmor
Datarmor is a Unix computer. You need some Linux background.
- Change directory: `cd my/new/directory`
- Find current directory: `pwd`
- List directory content: `ls`
- Go to parent directory: `cd ..`
- Create new folder: `mkdir -p folder_name`
- Create an empty file: `touch file.txt`
- Copy a file: `cp file.txt save_file.txt`
- Remove a file/folder: `rm -r file.txt`
- Rename/Move a file: `mv file.txt my/dest/renamed.txt`
Visit [linux-commands-cheat-sheet](https://linoxide.com/linux-commands-cheat-sheet/) for a summary of essential Linux commands.
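For example, the commands above can be chained into a small session (the `sandbox` folder and file names are arbitrary):

```shell
mkdir -p sandbox                    # create a folder
cd sandbox
touch file.txt                      # create an empty file
cp file.txt save_file.txt           # copy it
mkdir -p dest
mv save_file.txt dest/renamed.txt   # move and rename
ls dest                             # lists: renamed.txt
cd ..
rm -r sandbox                       # remove everything
```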
## Datarmor: important folders
Important folders are:
- `$HOME`: main folder (50 GB, backed up). For code and important files.
- `$DATAWORK`: data folder (1 TB, **no back-up**). For data.
- `$SCRATCH`: temporary folder (10 TB, files older than 10 days are automatically removed). For running computations.
- `/dataref`: folder containing some reference data (Copernicus data, atmospheric forcings, etc.)
- `/home/datawork-marbec-pmod`: Marbec DEN folder (limited access)
> To recover deleted files from `$HOME`, send an email to [assistance\@ifremer.fr](mailto:assistance@ifremer.fr)
## Modules (1/2)
To work with external tools, you first need to load them into your environment with the `module` command. This is done as follows:
```
module load R ## load one module
module load java NETCDF ## load 2 modules
module load vacumm/3.4.0-intel ## load a specific version
```
To list all the available modules:
```
module avail
```
## Modules (2/2)
To list the modules that are loaded:
```
module list
```
To unload a module:
```
module unload R
```
To unload all the modules at once:
```
module purge
```
## Default settings (1/2)
**By default, the `rm` command does not ask for confirmation. This is error-prone, and we may want to change it.**
To change some default behaviours, you need to create/edit a Linux configuration file.
```
gedit ${HOME}/.cshrc &
```
> The `&` character lets you keep access to your terminal while the editor is open. Without it, the terminal only comes back once the text editor is closed.
## Default settings (2/2)
In the `.cshrc` file, you can overwrite existing commands and create new ones:
``` {.csh language="csh"}
alias x 'exit'
alias rm 'rm -i -v'
alias cp 'cp -i -v'
alias mv 'mv -i -v'
```
You can also create environment variables (accessible via `$`):
``` {.csh language="csh"}
setenv R_LIBS_USER $HOME/libs/R/lib
```
You can also load your favorite modules automatically by adding the `module load` command in your `.cshrc` file:
```
module load R
```
## Running a calculation: warning!
When you connect to Datarmor, you end up on the **login node**. It is meant for navigation, small file manipulations, text editing and code compilation, **but nothing more!**
**Absolutely no computation or heavy file manipulation should be done there!**
Heavy work should be done on a compute node, accessible by submitting a PBS job with the `qsub` command.
## Running a job: interactive mode.
To run an interactive job, type the following command:
```
qsub -I -l walltime=01:00:00 -l mem=50M
```
The `-l mem` option specifies the requested memory; `-l walltime` specifies the requested computation time.
The job is ended by typing `exit` in the terminal.
> **Running an interactive job implies leaving your connection open until the job is finished.**
## Running a job: PBS script
To run a job in a non-interactive way, you need to create a `.pbs` file, which contains the instructions for running your job.
When done, run the calculation as follows:
```
qsub run_script.pbs
```
Job output files will be provided in a `run_script.pbs.oXXXX` file, with `XXXX` the job ID.
Some examples are provided in Datarmor's `/appli/services/exemples/` folder (see the `R` and `pbs` sub-folders).
## Running a job: PBS script (sequential)
``` bash
#!/bin/csh
#PBS -l mem=100M
#PBS -l walltime=01:00:00
## Load the modules that will be used to do the job
source /usr/share/Modules/3.2.10/init/csh
module load R
## go to the directory where the job has been launched
cd $PBS_O_WORKDIR
## Run R
Rscript script.R >& output.log ## redirects outputs into log
```
## Running a job: PBS script (parallel)
Parallel jobs are run in the same way, except that a queue (`-q`) parameter is added. It specifies the resources that you will use.
``` {.bash language="csh"}
#!/bin/csh
#PBS -l mem=100M
#PBS -q mpi_2
#PBS -l walltime=01:00:00
cd $PBS_O_WORKDIR
source /usr/share/Modules/3.2.10/init/csh
module load NETCDF/4.3.3.1-mpt-intel2016
$MPI_LAUNCH program.exe >& out
```
In the above, the `mpi_2` queue requests 2 nodes of 28 cores each, i.e. 56 cores in total.
## Running a job: queues
The full description of Datarmor queues is provided [here](https://domicile.ifremer.fr/intraric/Mon-IntraRIC/Calcul-et-donnees-scientifiques/Datarmor-Calcul-et-Donnees/Datarmor-calcul-et-programmes/,DanaInfo=w3z.ifremer.fr,SSL+Queues-d-execution-PBS-Configuration-detaillee). Most important ones are:
- `sequentiel`: the default one (single core)
- `omp`: shared-memory queue (several cores with access to the same memory).
- `mpi_N`: distributed memory queue (several nodes with independent memories), with `N` ranging from 1 (28 cores) to 18 (504 cores)
- `big`: distributed memory with 1008 cores.
- `ftp`: queue used to upload/download data to/from remote FTP servers
- `gpuq`: GPU queue.
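As an illustration, the header of a shared-memory job on the `omp` queue might look like this (a sketch only; `ncpus=8` is an assumed resource request, check the queue documentation linked above for the exact syntax):

```
#!/bin/csh
#PBS -q omp
#PBS -l ncpus=8
#PBS -l mem=1g
#PBS -l walltime=01:00:00
```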
## Running a job: good practice
A good practice is to copy everything you need (code + data) to `$SCRATCH` and run your calculation from there:
``` bash
#!/bin/csh
#PBS -l mem=100M
#PBS -q mpi_2
#PBS -l walltime=01:00:00
source /usr/share/Modules/3.2.10/init/csh
module load NETCDF/4.3.3.1-mpt-intel2016
cp -r $HOME/code.exe $SCRATCH
cp -r $DATAWORK/data $SCRATCH
cd $SCRATCH
$MPI_LAUNCH code.exe >& out
cp -r output $DATAWORK
```
## Running a job: array
To repeat a job a certain number of times (for instance, if your model is stochastic), you can use job arrays:
```
#!/bin/csh
#PBS -l mem=10M
#PBS -l walltime=00:01:00
cd $PBS_O_WORKDIR
mkdir -p output_${PBS_ARRAY_INDEX}
touch output_${PBS_ARRAY_INDEX}/toto.txt
```
To run the job:
```
qsub -J 0-10 seq.array
```
It will run the job 11 times, with `PBS_ARRAY_INDEX` ranging from `0` to `10`.
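A common use of the array index is to select a different parameter per run. Below is a minimal sketch in plain `sh` (the parameter values are made up; outside PBS, you can try it by setting `PBS_ARRAY_INDEX` manually):

```shell
#!/bin/sh
# PBS sets PBS_ARRAY_INDEX for each sub-job; default to 0 for local testing
PBS_ARRAY_INDEX=${PBS_ARRAY_INDEX:-0}

# hypothetical parameter list: one value per array index (qsub -J 0-2)
set -- 0.1 0.5 1.0
shift "$PBS_ARRAY_INDEX"
param=$1

# one output folder per sub-job, as in the example above
mkdir -p output_${PBS_ARRAY_INDEX}
echo "param=$param" > output_${PBS_ARRAY_INDEX}/params.txt
```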
## Running a job: chained jobs (Advanced users)
To run jobs in a chained mode (i.e. `job2` depends on `job1`), first submit a job with the `-h` option (which holds it):
```
qsub -h -N Job1 script1.pbs
```
Now, run a second job depending on the result of the first job:
```
qsub -N Job2 -W depend=afterany:`qselect -N Job1 -u $USER` script2.pbs
```
Finally, release the first job:
```
qrls `qselect -N Job1 -u $USER`
```
> Note: replace `afterany` with `afterok` (run only if `Job1` succeeded) or `afternotok` (run only if it failed)
## Running a job: follow-up
To follow the status of your job:
```
qstat -u nbarrier
```
| Status | Description |
|--------|------------------------------------------|
| **C** | Job is completed after having run |
| **E** | Job is exiting after having run |
| **H** | Job is held |
| **Q** | Job is queued, eligible to run or routed |
| **R** | Job is running |
: Job status provided by `qstat`
## Running a job: follow-up
To kill a job:
```
qdel 9255575.datarmor0 ## replace by the ID of the job to kill
```
At the end of the job, check the email you receive and look for the following lines:
```
resources_used.mem=12336kb
resources_used.walltime=00:00:24
```
If you requested more memory/walltime than you actually used, adjust your requests next time (cf. [here](https://domicile.ifremer.fr/intraric/Mon-IntraRIC/Calcul-et-donnees-scientifiques/Datarmor-Calcul-et-Donnees/Datarmor-calcul-et-programmes/,DanaInfo=w3z.ifremer.fr,SSL+Placer-dimensionner-et-surveiller-ses-jobs-PBS#exemple) for more details)
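If you save the end-of-job email to a file, the relevant lines can be pulled out directly (a sketch; `job_mail.txt` is a made-up file name, and the heredoc only stands in for the saved email):

```shell
# job_mail.txt is a hypothetical copy of the end-of-job email
cat > job_mail.txt <<'EOF'
resources_used.mem=12336kb
resources_used.walltime=00:00:24
EOF

# keep only the resource-usage lines
grep 'resources_used' job_mail.txt
```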
## Exchange between Datarmor and local computer
Data exchange between your local computer and Datarmor should not be done on the compute nodes, especially for heavy files (**no use of `scp`**).
To exchange data, use the `datacopy.ifremer.fr` server, to which you can connect using FTP. Your **intranet** credentials are required.
> **Note:** you need to be on the Ifremer network. If not, the VPN should be on.
It is advised to use FileZilla for this.
## Exchange between Datarmor and local computer

## Exchange between Datarmor and remote server
To recover data from a remote FTP server, submit a job on the `ftp` queue. An example is provided below (inspired by `/appli/services/exemples/pbs/ftp.pbs`):
```
#!/bin/csh
#PBS -q ftp
#PBS -l walltime=02:15:00
cd $DATAWORK
time rsync -av login@server:/source/folder /destination/folder/ >& output
```
This will need some adaptation depending on the remote server.
## Conda
Sometimes, you might need external tools that are not available on modules. One way to use these tools is to create your own [Conda](https://docs.conda.io/en/latest/) environments, which is possible on Datarmor (cf. [Conda sur Datarmor](https://domicile.ifremer.fr/intraric/Mon-IntraRIC/Calcul-et-donnees-scientifiques/Datarmor-Calcul-et-Donnees/Datarmor-calcul-et-programmes/Pour-aller-plus-loin/,DanaInfo=w3z.ifremer.fr,SSL+Conda-sur-Datarmor)).
First, edit your `.cshrc` file (using `gedit $HOME/.cshrc &`) and add:
```
source /appli/anaconda/latest/etc/profile.d/conda.csh
```
Then close and reopen the Datarmor connection, and type
```
which conda
```
to see if `conda` commands are accessible.
## Conda: settings
Now, create a `.condarc` using `gedit $HOME/.condarc &` and write:
```
envs_dirs:
- /my/env/folder
- /home1/datahome/nbarrier/softwares/anaconda3-envs
- /appli/conda-env
- /appli/conda-env/2.7
- /appli/conda-env/3.6
channels:
- conda-forge
- defaults
```
Replace the first entry with a folder of your choice: it will contain your own environments.
## Conda: using existing environments
To list the environments:
```
conda env list
```
To activate an environment:
```
conda activate pyngl
```
To deactivate an environment:
```
conda deactivate
```
## Conda: creating environments
To create a new environment:
```
conda create --name new-env
```
To install packages in the activated environment:
```
conda install package_name
```
For an R environment:
```
conda create --name r-env
conda activate r-env
conda install r r-base
```
To remove an environment:
```
conda env remove --name r-env
```
## Conda: running jobs
Conda can be used to run PBS jobs as follows:
```
#!/bin/csh
#PBS -l mem=4g
#PBS -l walltime=00:10:00
## for CSH
source /appli/anaconda/latest/etc/profile.d/conda.csh
## for BASH
#. /appli/anaconda/latest/etc/profile.d/conda.sh
conda activate myconda
python toto.py >& output
```
## Jupyterhub
In order to process data in a fancy way, you can use [Jupyter](https://jupyter.org/). To do so, connect on [https://datarmor-jupyterhub.ifremer.fr/](https://datarmor-jupyterhub.ifremer.fr/) with your Intranet login.

> Note: you also need to be on Ifremer Network or with the Datarmor VPN activated
## Jupyterhub
Now, select the resources that you want (cores + memory)

> **Warning: do not specify an optional environment**
## Jupyterhub
When the server is on, click on the `New` button and choose the Conda environment of your choice.

## Jupyterhub: personal Conda environments
To use Jupyter with your own R, Python or Matlab environments, you need to install the following additional packages in your Conda environments:
```
conda install r-irkernel ## for R
conda install ipykernel ## for Python
conda install matlab_kernel ## for matlab
```
To use Jupyter with Julia, run Julia (installed with `conda`) from your environment and type:
```
using Pkg
Pkg.add("IJulia")
```
## Jupyterhub: manual installation of kernel
If all your R libraries have been installed with an R version provided by Datarmor, you must manually install the R kernel. To do so:
- Activate the JupyterHub environment of Datarmor: `conda activate jupyterhub`
- Load the R module of your choice, for instance `module load R/3.6.3-intel-cc-17.0.2.174`
- Open an R console by typing `R`. Then type:
``` R
install.packages('IRkernel')
IRkernel::installspec(name = 'my-r', displayname = 'My R')
```
- Leave the R console (`quit()`). You should have a new folder in `$HOME/.local/share/jupyter/kernels/my-r`.
- Type `echo $LD_LIBRARY_PATH` and copy the output printed in the terminal.
- In the `my-r` folder, edit the `kernel.json` file by adding an `env` entry that contains the output of `echo $LD_LIBRARY_PATH`. It should look like this:
```json
{
"argv": ["/appli/R/3.6.3-intel-cc-17.0.2.174/lib64/R/bin/R", "--slave", "-e", "IRkernel::main()", "--args", "{connection_file}"],
"display_name": "R/3.6.3-intel-cc-17.0.2.174",
"language": "R",
"env": {"LD_LIBRARY_PATH": "..."}
}
```
- Restart your JupyterHub server. You should see your environment.
## Datarmor acknowledgements
To acknowledge Datarmor on your publications, add the following lines to your paper:
*The authors acknowledge the Pôle de Calcul et de Données Marines (PCDM, http://www.ifremer.fr/pcdm) for providing DATARMOR storage, data access, computational resources, visualization and support services.*