optimize document and error report based on FAQs (#967)
Make some recently reported problems clear in code and documentation.
HuangJiameng authored Sep 25, 2022
1 parent 403d42d commit 3319a61
Showing 8 changed files with 24 additions and 9 deletions.
2 changes: 1 addition & 1 deletion doc/run/overview-of-the-run-process.md
@@ -62,4 +62,4 @@ DP-GEN identifies the stage of the run process by a record file, record.dpgen, w

0, 1, and 2 correspond to make_train, run_train, and post_train. DP-GEN writes scripts in make_train, runs the tasks on the specified machine in run_train, and collects the results in post_train. The records for the model_devi and fp stages follow similar rules.

-If the process of DP-GEN stops for some reasons, DP-GEN will automatically recover the main process by record.dpgen. You may also change it manually for your purpose, such as removing the last iterations and recovering from one checkpoint.
+If DP-GEN stops for some reason, it will automatically recover the main process from record.dpgen. You may also edit the file manually for your purpose, such as removing the last iterations and recovering from one checkpoint. When dpgen is re-run, the process will start from the stage recorded in the last line.
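
As a minimal sketch of how the last line of `record.dpgen` decides where a run resumes, the snippet below reads a record file and returns the last recorded iteration and stage. The helper name `last_record` and the reading of each line as `<iteration> <stage>` are assumptions drawn from the text above and from the loop in `run_iter`; this is not DP-GEN's own code. Only the training stages 0 to 2 are named, because those are the only ones the documentation spells out.

```python
from pathlib import Path
from typing import Optional, Tuple

# Stage names for the training step, as listed in the documentation above.
# The model_devi and fp stages are said to follow similar rules and are
# deliberately left unnamed here.
TRAIN_STAGES = {0: "make_train", 1: "run_train", 2: "post_train"}


def last_record(record_path: str) -> Optional[Tuple[int, int]]:
    """Return (iteration, stage) from the last line of a record file.

    Hypothetical helper; returns None if the file contains no records.
    """
    last = None
    for line in Path(record_path).read_text().splitlines():
        fields = [int(x) for x in line.split()]
        if not fields:
            # A blank line would otherwise fail later with an IndexError.
            raise ValueError("There should not be blank lines in record.dpgen.")
        last = (fields[0], fields[1])
    return last
```

For a record file whose last line is `0 2`, `last_record` returns `(0, 2)`, and `TRAIN_STAGES[2]` is `post_train`.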
15 changes: 12 additions & 3 deletions doc/user-guide/common-errors.md
@@ -1,18 +1,24 @@
# Common Errors
(Errors are sorted alphabetically)

## Command not found: xxx.
There is no such software in the environment, or it is unavailable. It may be because 1. It is not installed; 2. The Conda environment is not activated; 3. You have chosen the wrong image in machine.json.
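
One way to narrow down a "command not found" error is to check which executables are visible on `PATH` in the environment where the job actually runs. The sketch below uses the standard-library `shutil.which`; the helper name `find_commands` and the command names in the example are illustrative assumptions, not names taken from DP-GEN.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_commands(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # "lmp" and "vasp_std" are example names only; replace them with the
    # commands that your own machine.json calls.
    for name, path in find_commands(["lmp", "vasp_std"]).items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Run it after activating the same Conda environment (or inside the same image) that the job uses; a result from a different shell says little about the job's environment.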

## dargs.dargs.ArgumentKeyError: [at location `xxx`] undefined key xxx is not allowed in strict mode.
A strict format check has been applied since version 0.10.7. To avoid misleading users, some keys from older versions that are already ignored or absorbed into default settings are no longer allowed. The expected structure of the dictionary in param.json also differs from that used before version 0.10.7. This error occurs when the format check finds old-style keys in the json file. Please try deleting or annotating these keys, or modify the json file accordingly. Example files in the newest format can be found in [examples](https://github.com/deepmodeling/dpgen/tree/master/examples).

## dargs.dargs.ArgumentTypeError: [at root location] key `xxx` gets wrong value type, requires <xxx> but gets <xx>
Please check your parameters against the [DP-GEN documentation](https://docs.deepmodeling.com/projects/dpgen/en/latest/). You may have superfluous parentheses in your parameter file.

## Dargs: xxx is not allowed in strict mode.
A strict format check has been applied since version 0.10.7. To avoid misleading users, some keys from older versions that are already ignored or absorbed into default settings are no longer allowed. The expected structure of the dictionary in param.json also differs from that used before version 0.10.7. This error occurs when the format check finds old-style keys in the json file. Please try deleting or annotating these keys, or modify the json file accordingly. Example files in the newest format can be found in [examples](https://github.com/deepmodeling/dpgen/tree/master/examples).

## FileNotFoundError: [Errno 2] No such file or directory: '.../01.model_devi/graph.xxx.pb'
If this error occurs, please check your initial data. The model will not be generated if the initial data is incorrect.

## json.decoder.JSONDecodeError
Your `.json` file is incorrect. It may be a mistake in syntax or a missing comma.
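
To locate the syntax error, a short script can report the line and column where parsing fails. The sketch below relies on the `lineno`, `colno`, and `msg` attributes of the standard-library `json.JSONDecodeError`; the helper name `describe_json_error` is hypothetical.

```python
import json
from typing import Optional


def describe_json_error(text: str) -> Optional[str]:
    """Return a message locating the first syntax error, or None if the text is valid JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

Pass it the contents of the parameter or machine file, for example `describe_json_error(open("param.json").read())`. The parser stops at the first problem, so the file may need several passes.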

## OSError: [Error cannot find valid a data system] Please check your setting for data systems
Check whether the path to the dataset in the parameter file is set correctly. Note that `init_data_sys` is a list, while `sys_configs` should be a two-dimensional list. The first dimension corresponds to `sys_idx`, and the second dimension lists the POSCAR files in each group. Refer to the [sample file](https://github.com/deepmodeling/dpgen/blob/master/examples/run/dp2.x-lammps-vasp/param_CH4_deepmd-kit-2.0.1.json).
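
The difference between the two shapes can be checked before starting a run. This is a rough sketch that assumes both keys hold path strings; the helper name `check_data_lists` and all paths are hypothetical, and the check does not replace DP-GEN's own argument validation.

```python
from typing import Any, List


def check_data_lists(init_data_sys: Any, sys_configs: Any) -> List[str]:
    """Return a list of shape problems; an empty list means the shapes look right."""
    problems = []
    if not (isinstance(init_data_sys, list)
            and all(isinstance(p, str) for p in init_data_sys)):
        problems.append("init_data_sys should be a list of paths")
    if not (isinstance(sys_configs, list)
            and all(isinstance(group, list)
                    and all(isinstance(p, str) for p in group)
                    for group in sys_configs)):
        problems.append("sys_configs should be a list of lists of paths")
    return problems


# Hypothetical paths, for illustration only.
good = check_data_lists(["data/init"],
                        [["confs/group0/POSCAR"], ["confs/group1/POSCAR"]])
# One nesting level too few: each group must itself be a list.
bad = check_data_lists(["data/init"], ["confs/group0/POSCAR"])
```

Here `good` is empty, while `bad` reports the missing nesting level in `sys_configs`.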

## RuntimeError: job:xxxxxxx failed 3 times
```
RuntimeError: job:xxxxxxx failed 3 times
@@ -38,3 +44,6 @@ The ratio of failed jobs is larger than ratio_failure. You can set a high value

## ValueError: Cannot load file containing picked data when allow_picked=False
Please ensure that you write the correct path of the dataset with no excess files.

## warnings.warn("Some Gromacs commands were NOT found; "
You can ignore this warning if you don't need Gromacs. It only indicates that Gromacs is not installed in your environment.
6 changes: 5 additions & 1 deletion doc/user-guide/troubleshooting.md
@@ -6,7 +6,11 @@
- Size of `sel_a` and actual types of atoms in your system.
- Index of `sys_configs` and `sys_idx`.

-2. Please verify the directories of `sys_configs`. If there isn't any POSCAR for `01.model_devi` in one iteration, it may happen that you write the false path of `sys_configs`.
+2. Please verify the directories of `sys_configs`. If there isn't any POSCAR for `01.model_devi` in one iteration, you may have written a wrong path in `sys_configs`. Note that `init_data_sys` is a list, while `sys_configs` should be a two-dimensional list. The first dimension corresponds to `sys_idx`, and the second dimension lists the POSCAR files in each group. Refer to the [sample file](https://github.com/deepmodeling/dpgen/blob/master/examples/run/dp2.x-lammps-vasp/param_CH4_deepmd-kit-2.0.1.json).

3. Correct format of JSON file.

4. The number of frames in one system should be larger than `batch_size` and `numb_test` in `default_training_param`. Sometimes one iteration adds only a few structures, which causes an error in the next iteration's training. In that case, you may set `fp_task_min` larger than `numb_test`.

5. If the same version of dpgen behaves differently on two machines, you may have modified the code on one of them.
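
The frame-count rule in item 4 can be checked with a few lines of code. The sketch below assumes `batch_size` and `numb_test` are plain integers and that frame counts per system are already known; the helper name `systems_too_small` and the system names are hypothetical.

```python
from typing import Dict, List


def systems_too_small(frames: Dict[str, int], batch_size: int, numb_test: int) -> List[str]:
    """Return the systems whose frame count is not larger than both limits."""
    limit = max(batch_size, numb_test)
    return sorted(name for name, n_frames in frames.items() if n_frames <= limit)


# Hypothetical frame counts, for illustration only.
flagged = systems_too_small({"sys.000": 120, "sys.001": 3}, batch_size=1, numb_test=4)
```

Here `flagged` contains only `sys.001`, which has 3 frames against a `numb_test` of 4.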

2 changes: 1 addition & 1 deletion dpgen/dispatcher/LazyLocalContext.py
@@ -61,7 +61,7 @@ def download(self,
                         else:
                             pass
                     else:
-                        raise RuntimeError('do not find download file ' + fname)
+                        raise OSError('do not find download file ' + fname)


     def block_checkcall(self,
2 changes: 1 addition & 1 deletion dpgen/dispatcher/LocalContext.py
@@ -77,7 +77,7 @@ def upload(self,
             for jj in local_up_files :
                 if not os.path.exists(os.path.join(local_job, jj)):
                     os.chdir(cwd)
-                    raise RuntimeError('cannot find upload file ' + os.path.join(local_job, jj))
+                    raise OSError('cannot find upload file ' + os.path.join(local_job, jj))
                 if os.path.exists(os.path.join(remote_job, jj)) :
                     os.remove(os.path.join(remote_job, jj))
                 _check_file_path(jj)
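
A practical consequence of switching from `RuntimeError` to `OSError` is that callers catching only `RuntimeError` no longer see a missing-file failure, while handlers for `OSError` now do. The sketch below illustrates this with a hypothetical stand-in function, `require_file`; it does not use the real `LocalContext` class.

```python
import os


def require_file(path: str) -> str:
    """Hypothetical stand-in for the existence check performed in upload()."""
    if not os.path.exists(path):
        raise OSError("cannot find upload file " + path)
    return path


def try_upload(path: str) -> str:
    """Report a missing file instead of letting the exception propagate."""
    try:
        require_file(path)
    except OSError as err:
        # OSError is also the base class of FileNotFoundError and
        # PermissionError, so one handler covers all file-system failures.
        return f"missing: {err}"
    return "ok"
```

Because `OSError` is not a subclass of `RuntimeError`, any existing `except RuntimeError` around these calls would need to be updated, as the test change in this commit does.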
2 changes: 1 addition & 1 deletion dpgen/generator/arginfo.py
@@ -16,7 +16,7 @@ def run_mdata_arginfo() -> Argument:

 # basics
 def basic_args() -> List[Argument]:
-    doc_type_map = 'Atom types.'
+    doc_type_map = 'Atom types. Reminder: The elements in param.json, type.raw and data.lmp(when using lammps) should be in the same order.'
     doc_mass_map = 'Standard atomic weights (default: "auto"). if one want to use isotopes, or non-standard element names, chemical symbols, or atomic number in the type_map list, please customize the mass_map list instead of using "auto".'
     doc_use_ele_temp = 'Currently only support fp_style vasp. \n\n\
 - 0: no electron temperature. \n\n\
2 changes: 2 additions & 0 deletions dpgen/generator/run.py
@@ -3752,6 +3752,8 @@ def run_iter (param_file, machine_file) :
         with open (record) as frec :
             for line in frec :
                 iter_rec = [int(x) for x in line.split()]
+                if len(iter_rec) == 0:
+                    raise ValueError("There should not be blank lines in record.dpgen.")
         dlog.info ("continue from iter %03d task %02d" % (iter_rec[0], iter_rec[1]))

     cont = True
2 changes: 1 addition & 1 deletion tests/dispatcher/test_local_context.py
@@ -55,7 +55,7 @@ def test_upload_non_exist(self) :
         self.job = LocalContext('loc', work_profile)
         tasks = ['task0', 'task1']
         # test uploading non-existing file
-        with self.assertRaises(RuntimeError):
+        with self.assertRaises(OSError):
             self.job.upload(tasks, ['foo'])

     def test_upload(self) :
