Implement a framework-independent DP model format.
Detailed Description
Background
Currently, the DP model file depends on the deep learning framework. The TensorFlow model is in ProtoBuf format (.pb), while the PyTorch model under development is in .pt format. The two formats are hard to convert into each other. ONNX attempts the conversion at the OP level, but its coverage is limited: many TensorFlow and PyTorch OPs are unsupported, and DP models may use customized OPs.
The DeePMD-kit needs to implement a framework-independent DP model format to have multiple backend support, as described below. Different frameworks are expected to behave similarly for the same model data.
Data structure
The model data is based on the current input parameters, which keeps it aligned across frameworks. Parameters that a framework has not implemented should still be aligned; that framework raises a NotImplementedError at runtime when it encounters one.
Add a @variables key to each layer's dictionary, of type dict[str, np.ndarray], to store the network parameters, i.e. what the current init_frz_model needs to restore (which currently ensures complete restoration). Because of its special character @, "@variables" is a reserved name and should not be used for anything else in the future. The keys of @variables should be aligned across all frameworks. Type embedding should be written explicitly, not hidden.
Add the following meta-information at the top level: (1) Software, version, and module used to generate the model file. (2) Generation time. (3) A unified model definition version for all frameworks.
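A minimal sketch of what such a model dictionary could look like. Every key name except "@variables" is an assumption made for illustration, nested lists stand in for np.ndarray so the sketch stays dependency-free, and check_supported is a hypothetical helper, not an existing DeePMD-kit function.

```python
from datetime import datetime, timezone

# One layer: input parameters plus the reserved "@variables" key.
layer = {
    "neuron": [2, 2],                  # input parameter, same for every framework
    "activation_function": "tanh",     # input parameter
    "@variables": {                    # framework-independent parameter names
        "w": [[0.1, 0.2], [0.3, 0.4]],  # stand-in for an np.ndarray
        "b": [0.0, 0.0],
    },
}

# Top level: meta-information (1)-(3) next to the model data.
model = {
    "software": "deepmd-kit",          # (1) software that wrote the file
    "software_version": "x.y.z",       # (1) placeholder, not a real version
    "module": "tf",                    # (1) hypothetical module/backend tag
    "time": datetime.now(timezone.utc).isoformat(),  # (2) generation time
    "model_def_version": "1.0",        # (3) hypothetical unified definition version
    "model": {"layers": [layer]},
}


def check_supported(params: dict, supported: set) -> None:
    """Raise NotImplementedError if a backend meets a parameter it lacks."""
    unknown = sorted(set(params) - supported)
    if unknown:
        raise NotImplementedError(f"parameters not implemented: {unknown}")


# Parameters a hypothetical backend has implemented so far.
SUPPORTED_BY_BACKEND = {"neuron", "activation_function", "@variables"}
check_supported(layer, SUPPORTED_BY_BACKEND)  # passes silently
```

The point of the check is that a parameter stays in the shared data even when one backend cannot honor it; the failure is raised when that backend loads the model, not when the file is written.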
Data storage
An HDF5 file is used to store the data. h5py is a dependency of TensorFlow, PyTorch, and the existing DeePMD-kit, so this doesn't bring extra dependencies.
Each variable is stored in the HDF5 file under a unique path. The json path is reserved and should not be used for variables.
The JSON data is stored at the json path; there, the type of @variables is dict[str, str]. Each value of the @variables dict is the path to the corresponding variable, which could be different among different platforms.
Convert dict[str, np.ndarray] to dict[str, str] when saving the model and convert it back when restoring it.
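The conversion described above can be sketched as follows. This is an illustration under stated assumptions: a plain dict stands in for the HDF5 file (with h5py the store would be an open h5py.File), nested lists stand in for np.ndarray, and save_model, load_model and the path scheme are hypothetical names chosen here.

```python
import json


def save_variables(model: dict, store: dict, prefix: str = "") -> dict:
    """Copy `model`, moving each array into `store` and leaving its path behind."""
    out = {}
    for key, value in model.items():
        path = f"{prefix}/{key}"
        if key == "@variables":
            out[key] = {}
            for name, array in value.items():
                # A variable path always contains "@variables",
                # so it can never collide with the reserved "json" path.
                store[f"{path}/{name}"] = array
                out[key][name] = f"{path}/{name}"
        elif isinstance(value, dict):
            out[key] = save_variables(value, store, path)
        elif isinstance(value, list):
            out[key] = [
                save_variables(v, store, f"{path}/{i}") if isinstance(v, dict) else v
                for i, v in enumerate(value)
            ]
        else:
            out[key] = value
    return out


def load_variables(meta: dict, store: dict) -> dict:
    """Inverse of save_variables: replace each stored path by its array."""
    out = {}
    for key, value in meta.items():
        if key == "@variables":
            out[key] = {name: store[path] for name, path in value.items()}
        elif isinstance(value, dict):
            out[key] = load_variables(value, store)
        elif isinstance(value, list):
            out[key] = [
                load_variables(v, store) if isinstance(v, dict) else v for v in value
            ]
        else:
            out[key] = value
    return out


def save_model(model: dict, store: dict) -> None:
    store["json"] = json.dumps(save_variables(model, store))


def load_model(store: dict) -> dict:
    return load_variables(json.loads(store["json"]), store)


model = {
    "descriptor": {
        "layers": [
            {"neuron": [2, 2], "@variables": {"w": [[0.1, 0.2], [0.3, 0.4]], "b": [0.0, 0.0]}}
        ]
    }
}
store = {}
save_model(model, store)
restored = load_model(store)
```

Deriving each path from the position of the layer in the dictionary is one simple way to keep paths unique; the proposal itself only requires uniqueness and leaves the exact scheme open.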
Binding with class
Add deserialize (a classmethod) and serialize methods to each class. The parent class should call the method of the subclass. The implementation should follow the one in dpdispatcher.
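A minimal sketch of this binding, not the dpdispatcher code itself. The "type" key, the subclass registry and the constructor arguments are assumptions made for illustration; the class names are borrowed from the progress list only to make the example concrete.

```python
class Descriptor:
    """Parent class: deserialize() dispatches to the subclass named in the data."""

    _registry = {}  # hypothetical registry: class name -> subclass

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Descriptor._registry[cls.__name__] = cls

    def serialize(self) -> dict:
        raise NotImplementedError

    @classmethod
    def deserialize(cls, data: dict) -> "Descriptor":
        if cls is Descriptor:
            # The parent does not rebuild anything itself; it calls the subclass.
            return Descriptor._registry[data["type"]].deserialize(data)
        raise NotImplementedError


class DescrptSeA(Descriptor):
    def __init__(self, rcut: float, variables: dict = None):
        self.rcut = rcut
        self.variables = variables or {}

    def serialize(self) -> dict:
        return {
            "type": type(self).__name__,          # assumed dispatch key
            "rcut": self.rcut,                    # input parameter
            "@variables": dict(self.variables),   # network parameters
        }

    @classmethod
    def deserialize(cls, data: dict) -> "DescrptSeA":
        return cls(rcut=data["rcut"], variables=data["@variables"])


original = DescrptSeA(6.0, {"w": [1.0, 2.0]})
data = original.serialize()
restored = Descriptor.deserialize(data)  # called on the parent class
```

External code only needs to know the parent class: it calls Descriptor.deserialize on the stored dictionary and receives an instance of whichever subclass wrote it.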
Reference implementation in dpdispatcher: https://github.com/deepmodeling/dpdispatcher/blob/065731a60be3b58979b54f1d33562ef189800158/dpdispatcher/submission.py#L97-L166
The deserialize (classmethod) and serialize methods of the top-level class can be called by external modules.
Progress
Model (support DP native model format #2987)
DOSModel
EnergyModel (support DP native model format #2987)
FrozenModel
LinearModel
MultiModel
PairwiseDPRc
TensorModel
Descriptor (support DP native model format #2987)
DescrptHybrid
DescrptLocFrame
DescrptSeAEbdV2
DescrptSeAEbd
DescrptSeAEf
DescrptSeAMask
DescrptSeA (support DP native model format #2987)
DescrptSeAttenV2
DescrptSeAtten
DescrptSeR
DescrptSeT
Fitting (support DP native model format #2987)
DipoleFittingSeA
DOSFitting
EnerFitting
PolarFittingSeA
TypeEmbedNet
aparam and fparam placeholders in convert_dp_to_pb
Further Information, Files, and Links
No response