
pd: add CPP inference with LAMMPS #4467

Open
wants to merge 56 commits into base: devel

Conversation

Contributor
@HydrogenSulfate HydrogenSulfate commented Dec 10, 2024

  1. Support LAMMPS inference with the Paddle backend on CUDA and DCU devices.
  2. Fix deepmd/pd/utils/serialization.py so that Paddle inference files can be deserialized using dp convert-backend deeppot_sea.yaml deeppot_sea.json.

related PR:

  1. 【pir】update If_grad and while_grad op's stop gradient by yield_op PaddlePaddle/Paddle#70545

Summary by CodeRabbit

Based on the comprehensive changes, here are the updated release notes:

  • New Features

    • Added support for PaddlePaddle deep learning framework.
    • Introduced new Paddle-based deep potential model computations.
    • Expanded backend support for model inference.
  • Configuration

    • New CMake option ENABLE_PADDLE to toggle Paddle support.
    • Added configuration parameters for Paddle version and inference directory.
  • Testing

    • Comprehensive test suites added for Paddle backend.
    • Enhanced LAMMPS integration tests with Paddle support.
  • Documentation

    • Updated version header and configuration files to reflect Paddle integration.
  • Performance

    • Added JIT compilation for Paddle model methods.
    • Optimized model serialization and deserialization.

@github-actions github-actions bot added the C++ label Dec 10, 2024

coderabbitai bot commented Dec 10, 2024

📝 Walkthrough

This pull request introduces comprehensive support for the PaddlePaddle (Paddle) deep learning framework in the DeePMD project. The changes span multiple files across the project, adding a new backend option ENABLE_PADDLE to the CMake configuration. The implementation includes a new DeepPotPD class for Paddle-based potential energy calculations, updates to build configurations, test suites, and integration with existing project infrastructure. The changes enable users to use Paddle models alongside existing TensorFlow and PyTorch backends.

Changes

File Change Summary
source/CMakeLists.txt Added ENABLE_PADDLE configuration option with logic to download Paddle inference library.
source/api_cc/CMakeLists.txt Added conditional linking for Paddle libraries and compilation definitions.
source/api_cc/include/DeepPotPD.h New header defining DeepPotPD class with methods for potential energy calculations.
source/api_cc/src/DeepPotPD.cc Implementation of DeepPotPD class methods for Paddle inference.
source/config/run_config.ini Added Paddle version and inference directory configuration parameters.
.pre-commit-config.yaml Updated file exclusion patterns for hooks.
.github/workflows/test_cc.yml Updated library paths and environment variables for Paddle testing.
source/api_cc/src/common.cc Enhanced backend support for Paddle model files in get_backend function.
source/install/test_cc_local.sh Updated script to include Paddle configuration in build process.
source/lmp/plugin/CMakeLists.txt Improved LAMMPS plugin configuration logic.
source/lmp/tests/test_lammps_pd.py Introduced comprehensive tests for LAMMPS integration with DeepMD.
deepmd/pd/utils/serialization.py Updated model serialization and JIT compilation logic for Paddle.
source/api_cc/tests/test_deeppot_pd.cc Added extensive test cases for the DeepPot class.

Possibly related PRs

Suggested labels

breaking change, Docs

Suggested reviewers

  • iProzd

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 05759cc and 2a2fd1b.

📒 Files selected for processing (1)
  • .github/workflows/test_cc.yml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/test_cc.yml
⏰ Context from checks skipped due to timeout of 90000ms (17)
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Test Python (6, 3.12)
  • GitHub Check: Test Python (5, 3.12)
  • GitHub Check: Test Python (4, 3.12)
  • GitHub Check: Test Python (4, 3.9)
  • GitHub Check: Test Python (3, 3.12)
  • GitHub Check: Test Python (3, 3.9)
  • GitHub Check: Test Python (2, 3.12)
  • GitHub Check: Test Python (2, 3.9)
  • GitHub Check: Test Python (1, 3.12)
  • GitHub Check: Test Python (1, 3.9)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Test C++ (false)
  • GitHub Check: Test C++ (true)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (8)
source/api_cc/include/DeepPotPD.h (3)

238-239: Typo in documentation: Correct "arraay" to "array"

There is a typographical error in the documentation comment. The word "arraay" should be corrected to "array" for clarity.

Apply this diff to fix the typo:

- * @brief Get the buffer arraay of this model.
+ * @brief Get the buffer array of this model.

251-251: Typo in documentation: Missing period at the end of the sentence

The documentation comment is missing a period at the end of the sentence. For consistency and readability, please add a period.

Apply this diff:

- * @param[out] buffer_scalar Buffer scalar
+ * @param[out] buffer_scalar Buffer scalar.

342-342: Variable naming: Rename inited to initialized

The private member variable inited is a non-standard abbreviation. Renaming it to initialized enhances readability and adheres to common naming conventions.

Apply this diff to rename the variable:

-int inited;
+bool initialized;

Also, update all occurrences of inited throughout the class to initialized.

source/api_cc/src/DeepPotPD.cc (2)

323-324: Incorrect exception message: Replace "fparam" with "aparam"

In the exception messages, "fparam is not supported as input yet." should be "aparam is not supported as input yet." when handling aparam. This correction avoids confusion and provides accurate information.

Apply these diffs:

-throw deepmd::deepmd_exception("fparam is not supported as input yet.");
+throw deepmd::deepmd_exception("aparam is not supported as input yet.");

Repeat this correction in both occurrences where aparam is involved.

Also applies to: 330-331


47-52: Hardcoded GPU settings: Make GPU memory and ID configurable

The GPU memory size and device ID are hardcoded, which may not be suitable for all environments. Consider making these parameters configurable to enhance flexibility and usability.

Suggest modifying the code to accept GPU settings from configuration or input parameters.

-int gpu_num = 1;
-if (gpu_num > 0) {
-  gpu_id = gpu_rank % gpu_num;
-} else {
-  gpu_id = 0;
-}
+gpu_id = gpu_rank;

Similarly, allow the GPU memory size to be specified:

-config->EnableUseGpu(
-    4096, 0);  // annotate it if use cpu, default use gpu with 4G mem
+config->EnableUseGpu(memory_size_in_mb, gpu_id);

Ensure that memory_size_in_mb and gpu_id are properly defined and passed to the init method.

source/api_cc/include/version.h.in (1)

13-13: Update comments to reflect new variable

A new variable global_pd_lib has been added. Consider updating any documentation or comments to include this variable, ensuring that all global library variables are documented consistently.

Add a comment describing global_pd_lib, similar to the existing variables.

+// Path to the Paddle libraries
 const std::string global_pd_lib="@PADDLE_LIBRARIES@";
source/api_cc/CMakeLists.txt (1)

30-37: Consider making the ROCm library path configurable.

While the Paddle integration looks good, the hardcoded path to libgalaxyhip.so might cause issues across different systems. Consider making this path configurable through a CMake variable.

 if(ENABLE_PADDLE AND NOT BUILD_PY_IF)
   target_link_libraries(${libname} PUBLIC "${PADDLE_LIBRARIES}")
   target_compile_definitions(${libname} PUBLIC BUILD_PADDLE)
   if(DP_VARIANT STREQUAL "rocm")
+    set(GALAXYHIP_LIBRARY "${hip_LIB_INSTALL_DIR}/libgalaxyhip.so" CACHE PATH "Path to libgalaxyhip.so")
     target_link_libraries(${libname}
-                         PUBLIC "${hip_LIB_INSTALL_DIR}/libgalaxyhip.so")
+                         PUBLIC "${GALAXYHIP_LIBRARY}")
   endif()
 endif()
source/api_cc/src/DeepPot.cc (1)

59-63: Improve error message for better user guidance.

While the initialization logic is correct, consider making the error message more informative by including build instructions.

-    throw deepmd::deepmd_exception("PaddlePaddle backend is not supported yet");
+    throw deepmd::deepmd_exception("PaddlePaddle backend is not built. Please rebuild with -DENABLE_PADDLE=ON");
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ec3b83f and bea3123.

📒 Files selected for processing (9)
  • source/CMakeLists.txt (3 hunks)
  • source/api_cc/CMakeLists.txt (1 hunks)
  • source/api_cc/include/DeepPotPD.h (1 hunks)
  • source/api_cc/include/version.h.in (1 hunks)
  • source/api_cc/src/DeepPot.cc (2 hunks)
  • source/api_cc/src/DeepPotPD.cc (1 hunks)
  • source/api_cc/src/common.cc (2 hunks)
  • source/config/CMakeLists.txt (1 hunks)
  • source/config/run_config.ini (1 hunks)
🔇 Additional comments (8)
source/api_cc/src/DeepPotPD.cc (1)

185-186: ⚠️ Potential issue

Ownership mismatch: Potential issue with unique_ptr assignment

Assigning the result of GetInputHandle to a std::unique_ptr may cause ownership issues if GetInputHandle does not return a unique_ptr. Verify the return type of GetInputHandle and ensure proper ownership semantics.

Please check if GetInputHandle returns a std::unique_ptr. If it returns a raw pointer or shared pointer, adjust the declaration of firstneigh_tensor accordingly.

-std::unique_ptr<paddle_infer::Tensor> firstneigh_tensor;
+auto firstneigh_tensor = predictor->GetInputHandle("nlist");

Or, if a shared_ptr is appropriate:

-std::unique_ptr<paddle_infer::Tensor> firstneigh_tensor;
+std::shared_ptr<paddle_infer::Tensor> firstneigh_tensor;
source/config/CMakeLists.txt (1)

17-21: Consistency in handling boolean options

The addition of the ENABLE_PADDLE option follows the existing pattern, but ensure that all boolean variables are consistently handled throughout the configuration. Verify that ENABLE_PADDLE is properly set and used in other CMake scripts.

source/config/run_config.ini (1)

17-18: LGTM! Configuration parameters for Paddle integration.

The new configuration parameters follow the established naming convention and structure.

source/CMakeLists.txt (2)

12-12: LGTM! Option declaration follows CMake conventions.

The ENABLE_PADDLE option is properly declared with a safe default value of OFF.


335-337: LGTM! Backend messaging is properly updated.

The backend messaging section is correctly updated to include Paddle in the list of enabled backends.

Also applies to: 341-341

source/api_cc/src/DeepPot.cc (1)

18-20: LGTM! Header inclusion follows existing backend patterns.

The Paddle backend header inclusion is properly guarded with BUILD_PADDLE and follows the same pattern as other backend includes.

source/api_cc/src/common.cc (2)

1395-1397: LGTM! Summary printing follows existing patterns.

The addition of Paddle library path to the summary output is consistent with how other backends are handled.


1415-1417: ⚠️ Potential issue

Consider using a more specific file extension for Paddle models.

Using .json as the model file extension could lead to conflicts since it's a very common format used for various purposes. Consider using a more specific extension like .pdmodel or .pd.json.

/**
* @brief DP constructor without initialization.
**/
DeepPotPD();

🛠️ Refactor suggestion

Inefficient passing of small types by const reference

Passing small built-in types like int and bool by const reference is less efficient than passing them by value. It is recommended to pass these types by value to improve performance and code clarity.

Apply the following diffs to pass int and bool parameters by value:


-DeepPotPD(const std::string& model,
-          const int& gpu_rank = 0,
-          const std::string& file_content = "");
+DeepPotPD(const std::string& model,
+          int gpu_rank = 0,
+          const std::string& file_content = "");

-void init(const std::string& model,
-          const int& gpu_rank = 0,
-          const std::string& file_content = "");
+void init(const std::string& model,
+          int gpu_rank = 0,
+          const std::string& file_content = "");

-template <typename VALUETYPE, typename ENERGYVTYPE>
-void compute(ENERGYVTYPE& ener,
+template <typename VALUETYPE, typename ENERGYVTYPE>
+void compute(ENERGYVTYPE& ener,

...

Repeat this change for all instances where const int& or const bool& is used for parameter passing.

Also applies to: 26-28, 36-38, 145-145, 191-191, 198-198

Comment on lines +238 to +239
force.resize(static_cast<size_t>(nframes) * fwd_map.size() * 3);
select_map<VALUETYPE>(force, dforce, bkw_map, 3, nframes, fwd_map.size(),

⚠️ Potential issue

Potential exception safety issue: Throwing string literals

Throwing string literals is not recommended and may lead to undefined behavior. Instead, throw exceptions derived from std::exception or use existing exception classes.

Apply this diff to throw a proper exception:

-throw "atomic virial is not supported as output yet.";
+throw deepmd::deepmd_exception("Atomic virial is not supported as output yet.");

This change ensures proper exception handling and consistency with other exception uses in the code.

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 30 to 80
if(ENABLE_PADDLE)
if(NOT DEFINED PADDLE_INFERENCE_DIR)
# message( FATAL_ERROR "Make sure PADDLE_INFERENCE_DIR is set when
# ENABLE_PADDLE=ON")
message(STATUS "PADDLE_INFERENCE_DIR is not defined. Downloading...")
set(DOWNLOAD_URL
"https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda118_cudnn860_Trt8531_D1/ce51e82e84fc97e0a55a162037f1554746159cad/paddle_inference.tgz"
)
set(TGZ_FILE "${CMAKE_BINARY_DIR}/paddle_inference.tgz")
set(EXTRACTED_DIR "${CMAKE_BINARY_DIR}/paddle_inference_install_dir")
file(DOWNLOAD ${DOWNLOAD_URL} ${TGZ_FILE})
message(STATUS "Downloading finished, extracting...")
execute_process(COMMAND ${CMAKE_COMMAND} -E tar -xzvf ${TGZ_FILE}
OUTPUT_QUIET)
file(REMOVE ${TGZ_FILE})
set(PADDLE_INFERENCE_DIR
${EXTRACTED_DIR}
CACHE PATH
"Path to 'paddle_inference_install_dir' or 'paddle_inference'")
else()
message(
STATUS "PADDLE_INFERENCE_DIR is already defined: ${PADDLE_INFERENCE_DIR}")
endif()

message(STATUS "Final PADDLE_INFERENCE_DIR is set to ${PADDLE_INFERENCE_DIR}")

set(PADDLE_INFERENCE_DIR
${PADDLE_INFERENCE_DIR}
CACHE PATH "Path to 'paddle_inference_install_dir' or 'paddle_inference'")

# used in api_cc
set(PADDLE_LIBRARIES
"${PADDLE_INFERENCE_DIR}/paddle/lib/libpaddle_inference.so"
CACHE PATH "Path to libpaddle_inference.so")

include_directories("${PADDLE_INFERENCE_DIR}/")
set(PADDLE_LIB_THIRD_PARTY_PATH
"${PADDLE_INFERENCE_DIR}/third_party/install/")

include_directories("${PADDLE_LIB_THIRD_PARTY_PATH}protobuf/include")
include_directories("${PADDLE_LIB_THIRD_PARTY_PATH}glog/include")
include_directories("${PADDLE_LIB_THIRD_PARTY_PATH}gflags/include")
include_directories("${PADDLE_LIB_THIRD_PARTY_PATH}xxhash/include")

link_directories("${PADDLE_LIB_THIRD_PARTY_PATH}protobuf/lib")
link_directories("${PADDLE_LIB_THIRD_PARTY_PATH}glog/lib")
link_directories("${PADDLE_LIB_THIRD_PARTY_PATH}gflags/lib")
link_directories("${PADDLE_LIB_THIRD_PARTY_PATH}xxhash/lib")
link_directories("${PADDLE_INFERENCE_DIR}/paddle/lib")
# if (USE_ROCM_TOOLKIT) add_definitions(-D_GLIBCXX_USE_CXX11_ABI=1) endif()
endif(ENABLE_PADDLE)

⚠️ Potential issue

Improve robustness of Paddle library download and configuration.

Several improvements could make the Paddle integration more robust:

  1. The download URL is hardcoded and might become outdated
  2. No verification of downloaded file integrity
  3. No fallback mechanism if download fails
  4. Third-party library paths are hardcoded

Consider implementing these improvements:

 if(ENABLE_PADDLE)
   if(NOT DEFINED PADDLE_INFERENCE_DIR)
-    message(STATUS "PADDLE_INFERENCE_DIR is not defined. Downloading...")
-    set(DOWNLOAD_URL
-        "https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda118_cudnn860_Trt8531_D1/ce51e82e84fc97e0a55a162037f1554746159cad/paddle_inference.tgz"
-    )
+    set(PADDLE_DOWNLOAD_URL "https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda118_cudnn860_Trt8531_D1/ce51e82e84fc97e0a55a162037f1554746159cad/paddle_inference.tgz" 
+        CACHE STRING "URL to download Paddle inference library")
+    set(PADDLE_DOWNLOAD_HASH "expected_hash_here" 
+        CACHE STRING "Expected hash of downloaded Paddle inference library")
+    
+    message(STATUS "PADDLE_INFERENCE_DIR is not defined. Downloading from ${PADDLE_DOWNLOAD_URL}...")
     set(TGZ_FILE "${CMAKE_BINARY_DIR}/paddle_inference.tgz")
     set(EXTRACTED_DIR "${CMAKE_BINARY_DIR}/paddle_inference_install_dir")
-    file(DOWNLOAD ${DOWNLOAD_URL} ${TGZ_FILE})
+    
+    file(DOWNLOAD ${PADDLE_DOWNLOAD_URL} ${TGZ_FILE}
+         EXPECTED_HASH SHA256=${PADDLE_DOWNLOAD_HASH}
+         STATUS DOWNLOAD_STATUS
+         SHOW_PROGRESS)
+         
+    list(GET DOWNLOAD_STATUS 0 STATUS_CODE)
+    if(NOT STATUS_CODE EQUAL 0)
+      message(FATAL_ERROR "Failed to download Paddle inference library. Please set PADDLE_INFERENCE_DIR manually.")
+    endif()

Also, consider making third-party library paths configurable:

-  set(PADDLE_LIB_THIRD_PARTY_PATH
-      "${PADDLE_INFERENCE_DIR}/third_party/install/")
+  set(PADDLE_THIRD_PARTY_PATH "${PADDLE_INFERENCE_DIR}/third_party/install/"
+      CACHE PATH "Path to Paddle third-party libraries")

Committable suggestion skipped: line range outside the PR's diff.

source/api_cc/src/common.cc: an earlier review comment on this file is outdated and has been resolved.
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
source/api_cc/include/DeepPotPD.h (2)

259-262: Inconsistent data type: declare aparam_nall as bool

The member variable aparam_nall is declared as an int but is used as a boolean in the is_aparam_nall() method. To improve code clarity and correctness, consider changing its type to bool.

Apply the following change:

-      int aparam_nall;
+      bool aparam_nall;

Ensure that all assignments and usages of aparam_nall are updated accordingly.

Also applies to: 347-347


17-18: Delete copy constructor and assignment operator to prevent unintended copying

The DeepPotPD class manages resources using std::shared_ptr. To prevent unintended copying, which could lead to resource management issues, consider deleting the copy constructor and copy assignment operator.

Add the following declarations to the class:

DeepPotPD(const DeepPotPD&) = delete;
DeepPotPD& operator=(const DeepPotPD&) = delete;
source/lmp/plugin/CMakeLists.txt (1)

11-11: Add space in the message for clarity

There is no space between "TO" and ${LAMMPS_SOURCE_ROOT} in the message, causing the output to be concatenated without a space. Please add a space for improved readability.

Apply the following change:

-message(STATUS "STARTING DOWNLOAD LAMMPS TO" ${LAMMPS_SOURCE_ROOT})
+message(STATUS "STARTING DOWNLOAD LAMMPS TO " ${LAMMPS_SOURCE_ROOT})
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between bea3123 and 2e6cae4.

📒 Files selected for processing (3)
  • source/api_cc/include/DeepPotPD.h (1 hunks)
  • source/api_cc/src/common.cc (2 hunks)
  • source/lmp/plugin/CMakeLists.txt (1 hunks)
🔇 Additional comments (3)
source/api_cc/include/DeepPotPD.h (1)

27-27: Avoid passing small built-in types by const reference

Passing small built-in types like int by const reference (const int&) is less efficient than passing them by value. It is recommended to pass them by value to improve performance and code clarity.

Apply the following changes to update the parameter declarations:

At line 27:

-                const int& gpu_rank = 0,
+                int gpu_rank = 0,

At line 38:

-                const int& gpu_rank = 0,
+                int gpu_rank = 0,

At line 114:

-                   const int& ago,
+                   int ago,

At line 145:

-                              const int& nframes,
+                              int nframes,

At line 183:

-                              const int& nframes,
+                              int nframes,

Also applies to: 38-38, 114-114, 145-145, 183-183

source/api_cc/src/common.cc (2)

1395-1397: Paddle library integration is correctly implemented

The addition of the Paddle library path in the print_summary function is properly included and follows the existing pattern for other backends.


1415-1417: Backend detection for Paddle models is appropriate

The modification to the get_backend function to detect .json files for the Paddle backend is correctly implemented and ensures proper backend selection.

source/api_cc/src/DeepPotPD.cc: six previously flagged issues marked as fixed.

codecov bot commented Dec 10, 2024

Codecov Report

Attention: Patch coverage is 0% with 9 lines in your changes missing coverage. Please review.

Project coverage is 82.05%. Comparing base (dbdb9b9) to head (26f8010).
Report is 1 commit behind head on devel.

Files with missing lines Patch % Lines
deepmd/pd/utils/serialization.py 0.00% 9 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel    #4467      +/-   ##
==========================================
- Coverage   84.55%   82.05%   -2.51%     
==========================================
  Files         677      557     -120     
  Lines       63903    46934   -16969     
  Branches     3488      967    -2521     
==========================================
- Hits        54035    38512   -15523     
+ Misses       8743     7789     -954     
+ Partials     1125      633     -492     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@HydrogenSulfate HydrogenSulfate changed the title pd: add CPP inference with LAMMPS [WIP] pd: add CPP inference with LAMMPS Dec 11, 2024
Member

@njzjz njzjz left a comment


Tests should be added and enabled, and then we could see the C++ coverage.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (7)
.pre-commit-config.yaml (1)

66-66: Consider narrowing the JSON exclusion pattern.

The current pattern excludes all JSON files from clang-format checks. Consider narrowing this to specific JSON files or directories if only certain JSON files need to be excluded.

-        exclude: ^(source/3rdparty|source/lib/src/gpu/cudart/.+\.inc|.+\.ipynb$|.+\.json$)
+        exclude: ^(source/3rdparty|source/lib/src/gpu/cudart/.+\.inc|.+\.ipynb$|source/.*deeppot.*\.json$)
deepmd/pd/utils/serialization.py (2)

40-40: Preserve or remove commented-out serialization code.
If the model_def_script is no longer relevant, consider removing this line entirely to keep the codebase clean. If it remains relevant, please clarify why it's commented out.


65-66: Clarify the saved model file extension.
Using model_file.split(".json")[0] strips the “.json” extension; ensure downstream tools support this naming convention for Paddle models (usually “.pdmodel/.pdiparams”).

source/api_cc/tests/test_deeppot_pd.cc (4)

56-75: Consider adjusting EPSILON for different data types.
You’re comparing floating-point results using the same EPSILON for all typed tests. If TypeParam can be float or double, you might want to use different tolerances to reduce false positives in tests.


122-136: Possible duplication in model testing.
You might benefit from factoring out repeated test logic into a shared utility or fixture (like the MyModel approach). This would improve maintainability and reduce repetitive code blocks.


418-427: Avoid modifying the global test input in-place.
Here, you prepend “dummy” virtual atoms to the main coordinate/atype arrays, which can complicate test isolation. Consider copying or leveraging a fixture to keep the base arrays unmodified.


529-532: Add coverage for error scenarios or corner cases.
A “print_summary” test is helpful for demonstration purposes, but it might be beneficial to include negative or boundary-case tests, e.g., missing file, invalid JSON, etc.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e6cae4 and a8145fa.

📒 Files selected for processing (3)
  • .pre-commit-config.yaml (2 hunks)
  • deepmd/pd/utils/serialization.py (2 hunks)
  • source/api_cc/tests/test_deeppot_pd.cc (1 hunks)
🔇 Additional comments (4)
.pre-commit-config.yaml (1)

10-10: Verify the existence of deeppot_sea.json file.

The exclude pattern has been updated to ignore deeppot_sea.json. Let's verify this file is part of the PR changes.

✅ Verification successful

The deeppot_sea.json file exists and is correctly excluded

The file deeppot_sea.json is present in the PR at source/tests/infer/deeppot_sea.json, confirming that the exclude pattern in the pre-commit config is valid and necessary.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if deeppot_sea.json exists in the PR
gh pr view 4467 --json files -q '.files[].path' | grep 'deeppot_sea.json'

# Find any potential locations of this file
fd 'deeppot_sea.json'

Length of output: 167

deepmd/pd/utils/serialization.py (2)

51-54: Import statements look good.
Importing InputSpec directly from paddle.static is correct for specifying dynamic model inputs.


55-63: Handle dynamic shapes cautiously.
You are explicitly specifying [-1, -1, 3], [-1, -1], and [-1, -1, -1] in the input spec. Consider whether you need more flexible dimensions (e.g., for variable numbers of atoms or neighbors in other scenarios).

source/api_cc/tests/test_deeppot_pd.cc (1)

1-15: General test structure is well-organized.
Using Google Test for multiple data types is a solid approach. The test scaffolding is thorough and sets the stage for verifying all relevant aspects of the DeepPot API.

@HydrogenSulfate
Contributor Author

Tests should be added and enabled, and then we could see the C++ coverage.

Thanks for the comments. I have added inference model files and a unit test file, source/api_cc/tests/test_deeppot_pd.cc, but how can I run test_deeppot_pd.cc in my environment? Is there a document for C++ test guidance?

@HydrogenSulfate HydrogenSulfate changed the title [WIP] pd: add CPP inference with LAMMPS pd: add CPP inference with LAMMPS Dec 27, 2024
@njzjz
Member

njzjz commented Dec 27, 2024

Tests should be added and enabled, and then we could see the C++ coverage.

Thanks for the comments. I have added inference model files and a unit test file, source/api_cc/tests/test_deeppot_pd.cc, but how can I run test_deeppot_pd.cc in my environment? Is there a document with C++ test guidance?

Please check the .devcontainer directory.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (7)
source/api_cc/src/DeepPotPD.cc (6)

14-20: Consider using a more generalized approach or direct insertion for building the flattened list.
Currently, createNlistTensor flattens a nested vector via insertion into ret. This approach works fine for small to medium data. However, for very large sizes, consider reserving capacity or using bulk insertion to reduce repeated memory allocations and copying.

 std::vector<int> createNlistTensor(const std::vector<std::vector<int>>& data) {
   std::vector<int> ret;
+  // Optionally reserve capacity to avoid repeated reallocations
+  size_t total_size = 0;
+  for (const auto& row : data) {
+    total_size += row.size();
+  }
+  ret.reserve(total_size);

   for (const auto& row : data) {
     ret.insert(ret.end(), row.begin(), row.end());
   }
   return ret;
 }

22-33: Graceful error handling is properly implemented, but consider logging context details.
The constructor uses a try/catch block to ensure correct rethrow if initialization fails. This is good for preventing partial initialization. However, capturing and logging the actual exception content (e.g., std::current_exception()) before rethrow may provide additional debugging context.
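A standalone sketch of that idea (`risky_init`, `construct_with_context`, and the log format are hypothetical illustrations, not the actual DeepPotPD code):

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the predictor setup that may throw.
void risky_init(const std::string& model) {
  throw std::runtime_error("failed to load model: " + model);
}

std::string last_error;  // captured so the effect is observable

void construct_with_context(const std::string& model) {
  try {
    risky_init(model);
  } catch (const std::exception& e) {
    // Log the underlying cause plus context (which model) before
    // rethrowing, so callers still receive the original exception.
    last_error = std::string("DeepPotPD init failed for '") + model +
                 "': " + e.what();
    std::cerr << last_error << std::endl;
    throw;  // preserve the original exception type for callers
  }
}
```

The `throw;` rethrows the original exception unchanged, so the extra logging does not alter what callers observe.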


190-203: Remove or convert commented-out code sections if not needed.
It seems that handling for fparam and aparam is commented out. Consider removing this code if no longer necessary or converting it into a clearly marked TODO/FIXME if future support is planned.


291-359: Reduce duplication between the two compute methods.
The second compute method in lines 291-359 is structurally similar to the one in lines 113-290, differing mostly in parameters like nghost and InputNlist. Consider refactoring into a shared internal function to avoid code duplication.


387-400: Memory-safety consideration for string building in get_type_map.
Appending each character via std::string(1, char_c) is correct but might be less efficient. You could push back into a std::vector<char> and convert at the end. Also, ensure that all returned char_c values are valid printable characters if used in logs.
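A generic sketch of the idea (the helper name and buffer contents are illustrative); reserving a `std::string` up front is equivalent to collecting into a `std::vector<char>` and converting at the end:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build a string from a buffer of character codes in one pass,
// avoiding a temporary std::string(1, c) per character.
std::string codes_to_string(const std::vector<int>& codes) {
  std::string out;
  out.reserve(codes.size());  // single allocation instead of many appends
  for (int c : codes) {
    out.push_back(static_cast<char>(c));
  }
  return out;
}
```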


422-514: Consider unifying the computew and computew_mixed_type logic.
The many overloads in lines 422-484 and lines 485-514 could be simplified using templates or bridging functions to minimize duplication. If you eventually implement mixed-type logic for Paddle, maintaining a single interface would be more manageable.

source/CMakeLists.txt (1)

12-12: Provide clarity on ENABLE_PADDLE usage.
You introduce ENABLE_PADDLE with a default OFF, which is fine. It would be helpful to clarify in documentation or comments how users can set PADDLE_INFERENCE_DIR if they enable this option manually (since you provide an auto-download fallback).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e6172d and 5a048c4.

📒 Files selected for processing (5)
  • .devcontainer/build_cxx.sh (1 hunks)
  • deepmd/pd/utils/serialization.py (1 hunks)
  • source/CMakeLists.txt (3 hunks)
  • source/api_cc/src/DeepPotPD.cc (1 hunks)
  • source/install/test_cc_local.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/pd/utils/serialization.py
🔇 Additional comments (6)
source/api_cc/src/DeepPotPD.cc (2)

34-42: Avoid repeated initialization in highly concurrent calls.
The code checks if (inited) ... return; to skip re-initialization. If users call init from multiple threads, more robust concurrency safeguards (such as a mutex or an atomic check) might be required to prevent partial re-initialization.
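One way to make such one-shot initialization thread-safe is `std::call_once`; the sketch below is a generic illustration under that assumption, not the actual DeepPotPD code:

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

struct Model {
  std::once_flag init_flag;
  int init_count = 0;

  void init() {
    // call_once guarantees the body runs exactly once even if many
    // threads race into init() simultaneously, and it synchronizes
    // the result with all callers.
    std::call_once(init_flag, [this] { ++init_count; });
  }
};
```

After joining any number of racing threads, `init_count` remains 1, which a plain `if (inited) return;` check does not guarantee.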


242-256: Consistent with past feedback: atomic virial storage is still unsupported.
The code rightly throws an exception for unimplemented features. You might mark this code path with a clear TODO if you plan to implement atomic virial output later, or remove it if it's not intended to be supported.

.devcontainer/build_cxx.sh (1)

14-14: Confirm whether Paddle should always be enabled in container builds.
You've set -D ENABLE_PADDLE=ON unconditionally. If Paddle is optional or if some environments don’t require it, consider a runtime switch or environment variable to toggle Paddle support.

source/install/test_cc_local.sh (1)

24-24: Include tests for Paddle-based C++ inference.
Enabling Paddle in the build is a great step. Ensure that ctest and your new test_deeppot_pd.cc are properly invoked to validate Paddle-based functionality, especially since the PR aims to increase coverage.

source/CMakeLists.txt (2)

339-345: Consistent logging for enabled backends.
You’ve added a check for ENABLE_PADDLE to log “- Paddle” among other backends, ensuring consistent feedback to users. This looks good.


30-84: 🛠️ Refactor suggestion

Security and maintainability improvements for third-party downloads.
Downloading Paddle from a hardcoded URL raises the risk of broken links or malicious tampering. Consider verifying a checksum or providing a fallback if the download fails. Also, check whether your license policy requires disclaimers for included third-party code.

 if(ENABLE_PADDLE)
   if(NOT DEFINED PADDLE_INFERENCE_DIR)
     message(STATUS "PADDLE_INFERENCE_DIR is not defined. Downloading...")
+    # Consider verifying a known hash or fallback if the URL is unavailable
     set(DOWNLOAD_URL
         "https://paddle-qa.bj.bcebos.com/paddle-pipeline/..."
     )

Likely invalid or redundant comment.

Comment on lines 210 to 224
auto energy_ = predictor->GetOutputHandle(output_names[1]);
auto force_ = predictor->GetOutputHandle(output_names[2]);
auto virial_ = predictor->GetOutputHandle(output_names[3]);
std::vector<int> output_energy_shape = energy_->shape();
int output_energy_size =
std::accumulate(output_energy_shape.begin(), output_energy_shape.end(), 1,
std::multiplies<int>());
std::vector<int> output_force_shape = force_->shape();
int output_force_size =
std::accumulate(output_force_shape.begin(), output_force_shape.end(), 1,
std::multiplies<int>());
std::vector<int> output_virial_shape = virial_->shape();
int output_virial_size =
std::accumulate(output_virial_shape.begin(), output_virial_shape.end(), 1,
std::multiplies<int>());
Contributor


🛠️ Refactor suggestion

Use output names carefully to avoid index-based assumptions.
Fetching the output handles by fixed indices [1], [2], [3] can be brittle if the model order changes. Ideally, rely on known output names or validate the length of output_names to prevent out-of-range access.

Comment on lines 64 to 76
pdmodel_path = model;
pdiparams_path = model;
pdiparams_path.replace(pdiparams_path.find(".json"), 5,
std::string(".pdiparams"));
} else if (model.find(".pdmodel") != std::string::npos) {
pdmodel_path = model;
pdiparams_path = model;
pdiparams_path.replace(pdiparams_path.find(".pdmodel"), 8,
std::string(".pdiparams"));
} else {
throw deepmd::deepmd_exception("Given inference model: " + model +
" do not exist, please check it.");
}
Contributor


⚠️ Potential issue

Handle edge cases when searching for file extensions.
The code replaces “.json” or “.pdmodel” with “.pdiparams”, assuming the extension is present. Consider handling the case when model.find(...) yields std::string::npos or if the file doesn't match either extension.

} else {
-  throw deepmd::deepmd_exception("Given inference model: " + model +
-                                 " do not exist, please check it.");
+  throw deepmd::deepmd_exception("Unsupported model file extension for: " + model +
+                                 ". Expected .json or .pdmodel extension.");
}

Committable suggestion skipped: line range outside the PR's diff.
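A self-contained sketch of extension-aware path derivation (the helper name is illustrative, and a plain `std::runtime_error` stands in for `deepmd::deepmd_exception`); matching the suffix instead of using `find` also avoids false matches on `.json` appearing mid-path:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Derive the .pdiparams path from a .json or .pdmodel model path,
// throwing a clear error for unsupported extensions instead of
// relying on find() never returning npos.
std::string derive_params_path(const std::string& model) {
  auto ends_with = [&](const std::string& suffix) {
    return model.size() >= suffix.size() &&
           model.compare(model.size() - suffix.size(), suffix.size(),
                         suffix) == 0;
  };
  for (const std::string ext : {".json", ".pdmodel"}) {
    if (ends_with(ext)) {
      return model.substr(0, model.size() - ext.size()) + ".pdiparams";
    }
  }
  throw std::runtime_error("Unsupported model file extension for: " + model +
                           ". Expected .json or .pdmodel.");
}
```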

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
source/api_cc/src/DeepPotPD.cc (1)

509-511: Document unimplemented features with TODO comments.

Add TODO comments to indicate future implementation plans for mixed-type compute methods.

+  // TODO(#4467): Implement mixed-type compute methods for paddle backend
   throw deepmd::deepmd_exception(
       "computew_mixed_type is not implemented in paddle backend yet");

Also applies to: 524-526

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a048c4 and 269e96e.

📒 Files selected for processing (2)
  • source/CMakeLists.txt (3 hunks)
  • source/api_cc/src/DeepPotPD.cc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • source/CMakeLists.txt
🔇 Additional comments (1)
source/api_cc/src/DeepPotPD.cc (1)

219-221: 🛠️ Refactor suggestion

Use output names instead of hard-coded indices.

Accessing output tensors by fixed indices is fragile. Consider using named outputs or validating the tensor names.

-  auto energy_ = predictor->GetOutputHandle(output_names[1]);
-  auto force_ = predictor->GetOutputHandle(output_names[2]);
-  auto virial_ = predictor->GetOutputHandle(output_names[3]);
+  const char* ENERGY_OUTPUT = "energy";
+  const char* FORCE_OUTPUT = "force";
+  const char* VIRIAL_OUTPUT = "virial";
+  
+  auto energy_ = predictor->GetOutputHandle(ENERGY_OUTPUT);
+  auto force_ = predictor->GetOutputHandle(FORCE_OUTPUT);
+  auto virial_ = predictor->GetOutputHandle(VIRIAL_OUTPUT);

Likely invalid or redundant comment.

Comment on lines 352 to 354
for (int i = 0; i < output_names.size(); ++i) {
printf("output_names[%d] = %s\n", i, output_names[i].c_str());
}
Contributor


🛠️ Refactor suggestion

Remove debug print statements.

Debug print statements should not be present in production code.

-  for (int i = 0; i < output_names.size(); ++i) {
-    printf("output_names[%d] = %s\n", i, output_names[i].c_str());
-  }

Comment on lines 80 to 83
config->EnableUseGpu(
4096, 0); // annotate it if use cpu, default use gpu with 4G mem
gpu_enabled = config->use_gpu();
gpu_enabled = false;
Contributor


⚠️ Potential issue

Fix inconsistent GPU configuration.

The code enables GPU with 4GB memory but immediately disables it by setting gpu_enabled = false. This creates confusion about the intended execution environment.

-  config->EnableUseGpu(
-      4096, 0);  // annotate it if use cpu, default use gpu with 4G mem
-  gpu_enabled = config->use_gpu();
-  gpu_enabled = false;
+  if (gpu_num > 0) {
+    config->EnableUseGpu(4096, gpu_id);
+    gpu_enabled = true;
+  } else {
+    gpu_enabled = false;
+  }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 112 to 118
do_message_passing = 0;
rcut = 6.0;
ntypes = 2;
dfparam = 0;
daparam = 0;
aparam_nall = 0;
inited = true;
Contributor


⚠️ Potential issue

Replace hard-coded model parameters with values from model buffers.

The code uses hard-coded values instead of retrieving them from model buffers. This could lead to inconsistencies if the model's parameters differ.

-  do_message_passing = 0;
-  rcut = 6.0;
-  ntypes = 2;
-  dfparam = 0;
-  daparam = 0;
-  aparam_nall = 0;
+  DeepPotPD::get_buffer<int>("buffer_has_message_passing", do_message_passing);
+  DeepPotPD::get_buffer<double>("buffer_rcut", rcut);
+  DeepPotPD::get_buffer<int>("buffer_ntypes", ntypes);
+  DeepPotPD::get_buffer<int>("buffer_dfparam", dfparam);
+  DeepPotPD::get_buffer<int>("buffer_daparam", daparam);
+  DeepPotPD::get_buffer<int>("buffer_aparam_nall", aparam_nall);

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines 65 to 78
if (model.find(".json") != std::string::npos) {
pdmodel_path = model;
pdiparams_path = model;
pdiparams_path.replace(pdiparams_path.find(".json"), 5,
std::string(".pdiparams"));
} else if (model.find(".pdmodel") != std::string::npos) {
pdmodel_path = model;
pdiparams_path = model;
pdiparams_path.replace(pdiparams_path.find(".pdmodel"), 8,
std::string(".pdiparams"));
} else {
throw deepmd::deepmd_exception("Given inference model: " + model +
" do not exist, please check it.");
}
Contributor


🛠️ Refactor suggestion

Improve model path validation and error handling.

The error message is misleading when the file exists but has an unsupported extension. Additionally, the code assumes the extensions are in lowercase.

-  if (model.find(".json") != std::string::npos) {
+  std::string model_lower = model;
+  std::transform(model_lower.begin(), model_lower.end(), model_lower.begin(), ::tolower);
+  if (model_lower.find(".json") != std::string::npos) {
     pdmodel_path = model;
     pdiparams_path = model;
     pdiparams_path.replace(pdiparams_path.find(".json"), 5,
                            std::string(".pdiparams"));
-  } else if (model.find(".pdmodel") != std::string::npos) {
+  } else if (model_lower.find(".pdmodel") != std::string::npos) {
     pdmodel_path = model;
     pdiparams_path = model;
     pdiparams_path.replace(pdiparams_path.find(".pdmodel"), 8,
                            std::string(".pdiparams"));
   } else {
     throw deepmd::deepmd_exception("Given inference model: " + model +
-                                   " do not exist, please check it.");
+                                   " has unsupported extension. Expected .json or .pdmodel");
   }

Committable suggestion skipped: line range outside the PR's diff.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (5)
source/api_cc/include/DeepPotPD.h (3)

64-75: Improve extension handling for model files
When checking model.find(".json") or model.find(".pdmodel"), any mismatch throws a generic “do not exist, please check it.” It may be clearer to throw an exception about an “unsupported file extension,” especially if the file does exist but with an unexpected extension.


110-117: Consider retrieving hyperparameters from model buffers
Currently, rcut, ntypes, dfparam, etc., are hard-coded. This might cause desynchronization if the model defines different parameters. Consider using get_buffer<int> or get_buffer<double> calls to keep these values in sync with the model.


240-244: Clarify naming convention for methods
Methods such as compute, compute_mixed_type, and computew are similarly named but vary in both parameters and behavior. Consider consolidating or clarifying method names to reduce confusion.

source/api_cc/src/DeepPotPD.cc (2)

247-265: Atomic virial not supported yet
You throw an exception for atomic virial outputs. If this is a planned feature, document it clearly or remove the partial code until fully supported.


332-337: Remove or finalize commented-out sections
Large blocks of commented-out code for fparam_tensor and aparam_tensor remain. It may be best to remove them or wrap them in #ifdef DEBUG to maintain clarity.

Also applies to: 339-344

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 269e96e and 3f7e5af.

📒 Files selected for processing (3)
  • source/CMakeLists.txt (3 hunks)
  • source/api_cc/include/DeepPotPD.h (1 hunks)
  • source/api_cc/src/DeepPotPD.cc (1 hunks)
🔇 Additional comments (12)
source/api_cc/include/DeepPotPD.h (5)

12-12: Class Overview Looks Good

Overall, this newly introduced DeepPotPD class provides a clear and well-documented interface for the Paddle-based backend, indicating a solid design for managing model inference within deepmd.


1-2: License Header Acknowledgment

The SPDX identifier is properly included, ensuring license clarity.


26-28: Avoid passing small built-in types by const reference
This repeats a prior recommendation: passing small built-in types by const reference can be less efficient than passing by value.

- DeepPotPD(const std::string& model,
-           const int& gpu_rank = 0,
-           const std::string& file_content = "");
+ DeepPotPD(const std::string& model,
+           int gpu_rank = 0,
+           const std::string& file_content = "");

36-38: Avoid passing small built-in types by const reference
Same issue for the init method’s gpu_rank parameter.

- void init(const std::string& model,
-           const int& gpu_rank = 0,
-           const std::string& file_content = "");
+ void init(const std::string& model,
+           int gpu_rank = 0,
+           const std::string& file_content = "");

140-145: Check multi-atom parameters for consistency
In templated compute methods, the function references nall, aparam_nall, ntypes, .... Ensure all array-like parameters have the correct dimensions and that the code properly handles corner cases (e.g., zero atoms in a frame).

source/CMakeLists.txt (4)

12-12: New Option for Paddle Support
Introducing ENABLE_PADDLE is clear and maintains the project’s modular approach.


30-51: Consider verifying and hashing downloaded Paddle inference archives
This process downloads Paddle libraries but does not verify checksums. Corrupt or tampered files could compromise the build.

From prior feedback: adding a fallback or verification mechanism (e.g., EXPECTED_HASH SHA256=<hash>) could improve security.


67-70: Review path-based linking
When referencing library directories (${PADDLE_INFERENCE_DIR}/paddle/lib, etc.), please ensure these paths remain valid across different systems and do not overwrite user-provided settings.

Also applies to: 71-71


352-354: Clear backend listing
Listing Paddle among enabled backends is consistent. This helps users confirm the project’s build configuration.

source/api_cc/src/DeepPotPD.cc (3)

25-25: Use int instead of const int& for small integral parameters
Same best practice: avoid const int& for gpu_rank.

- DeepPotPD::DeepPotPD(const std::string& model,
-                      const int& gpu_rank,
-                      ...
+ DeepPotPD::DeepPotPD(const std::string& model,
+                      int gpu_rank,
+                      ...

65-78: More descriptive exception message for unsupported file extension
Currently, the code says "do not exist, please check it." For clarity, mention unsupported extension if .json or .pdmodel is missing.


219-223: Validate output handle indices
You’re fetching output_names[1], [2], [3], etc. Without verifying output_names.size(), out-of-range access is a risk if the model’s output changes.

Also applies to: 226-228, 230-233

std::vector<int> firstneigh = createNlistTensorPD(nlist_data.jlist);
firstneigh_tensor = predictor->GetInputHandle("nlist");
firstneigh_tensor->Reshape({1, nloc, (int)firstneigh.size() / (int)nloc});
firstneigh_tensor->CopyFromCpu(firstneigh.data());
Contributor


⚠️ Potential issue

Potential divide-by-zero
firstneigh_tensor->Reshape({1, nloc, (int)firstneigh.size() / (int)nloc}); could trigger a divide-by-zero if nloc == 0. Consider handling empty frames or local atoms carefully.

+ if (nloc == 0) {
+   throw deepmd::deepmd_exception("No local atoms found. Avoid dividing by zero.");
+ }
  firstneigh_tensor->Reshape({1, nloc, (int)firstneigh.size() / (int)nloc});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
firstneigh_tensor->Reshape({1, nloc, (int)firstneigh.size() / (int)nloc});
if (nloc == 0) {
  throw deepmd::deepmd_exception("No local atoms found. Avoid dividing by zero.");
}
firstneigh_tensor->Reshape({1, nloc, (int)firstneigh.size() / (int)nloc});

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
deepmd/pd/utils/serialization.py (2)

60-79: Consider enhancing the documentation and debugging support.

The output shape documentation is helpful, but could be improved:

  1. Consider using a more structured format (e.g., docstring)
  2. Add units for physical quantities
  3. Document the purpose of full_graph=True and its implications for debugging
-    """ example output shape and dtype of forward
-    atom_energy: fetch_name_0 (1, 6, 1) float64
-    atom_virial: fetch_name_1 (1, 6, 1, 9) float64
-    energy: fetch_name_2 (1, 1) float64
-    force: fetch_name_3 (1, 6, 3) float64
-    mask: fetch_name_4 (1, 6) int32
-    virial: fetch_name_5 (1, 9) float64
-    """
+    """Output tensor specifications for forward method:
+    
+    Returns
+    -------
+    dict
+        atom_energy : paddle.Tensor
+            Per-atom energy contributions (shape: [batch_size, n_atoms, 1], dtype: float64, unit: eV)
+        atom_virial : paddle.Tensor
+            Per-atom virial contributions (shape: [batch_size, n_atoms, 1, 9], dtype: float64, unit: eV)
+        energy : paddle.Tensor
+            Total system energy (shape: [batch_size, 1], dtype: float64, unit: eV)
+        force : paddle.Tensor
+            Atomic forces (shape: [batch_size, n_atoms, 3], dtype: float64, unit: eV/Å)
+        mask : paddle.Tensor
+            Atom mask (shape: [batch_size, n_atoms], dtype: int32)
+        virial : paddle.Tensor
+            System virial (shape: [batch_size, 9], dtype: float64, unit: eV)
+            
+    Note
+    ----
+    Using full_graph=True optimizes the model but may impact debugging capabilities.
+    """

80-100: Apply similar documentation improvements to forward_lower method.

The documentation structure should be consistent with the forward method. Consider applying the same documentation improvements suggested above.

-    """ example output shape and dtype of forward_lower
-    fetch_name_0: atom_energy [1, 192, 1] paddle.float64
-    fetch_name_1: energy [1, 1] paddle.float64
-    fetch_name_2: extended_force [1, 5184, 3] paddle.float64
-    fetch_name_3: extended_virial [1, 5184, 1, 9] paddle.float64
-    fetch_name_4: virial [1, 9] paddle.float64
-    """
+    """Output tensor specifications for forward_lower method:
+    
+    Returns
+    -------
+    dict
+        atom_energy : paddle.Tensor
+            Per-atom energy contributions (shape: [batch_size, n_atoms, 1], dtype: float64, unit: eV)
+        energy : paddle.Tensor
+            Total system energy (shape: [batch_size, 1], dtype: float64, unit: eV)
+        extended_force : paddle.Tensor
+            Extended atomic forces (shape: [batch_size, n_extended, 3], dtype: float64, unit: eV/Å)
+        extended_virial : paddle.Tensor
+            Extended virial contributions (shape: [batch_size, n_extended, 1, 9], dtype: float64, unit: eV)
+        virial : paddle.Tensor
+            System virial (shape: [batch_size, 9], dtype: float64, unit: eV)
+            
+    Note
+    ----
+    The extended tensors include contributions from neighboring atoms in the neighbor list.
+    """

Additionally, consider documenting the purpose of the None input specifications in both methods.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76e9019 and 26f8010.

📒 Files selected for processing (2)
  • .github/workflows/test_cc.yml (2 hunks)
  • deepmd/pd/utils/serialization.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/test_cc.yml
⏰ Context from checks skipped due to timeout of 90000ms (19)
  • GitHub Check: Build C++ (cuda, cuda)
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Test Python (6, 3.12)
  • GitHub Check: Test Python (6, 3.9)
  • GitHub Check: Test Python (5, 3.12)
  • GitHub Check: Test Python (5, 3.9)
  • GitHub Check: Test Python (4, 3.12)
  • GitHub Check: Test C++ (false)
  • GitHub Check: Test Python (4, 3.9)
  • GitHub Check: Test C++ (true)
  • GitHub Check: Test Python (3, 3.12)
  • GitHub Check: Test Python (3, 3.9)
  • GitHub Check: Test Python (2, 3.12)
  • GitHub Check: Test Python (2, 3.9)
  • GitHub Check: Test Python (1, 3.12)
  • GitHub Check: Test Python (1, 3.9)
  • GitHub Check: Analyze (c-cpp)
🔇 Additional comments (2)
deepmd/pd/utils/serialization.py (2)

43-48: Well-implemented buffer registration!

Good use of register_buffer for storing the minimum neighbor distance as a tensor, which is the recommended approach for non-parameter tensors that should be saved and restored in the state dict.


36-40: Verify the impact of enabling primitive operations.

While enabling primitive operations can be necessary, it might affect performance. Consider documenting why this flag is required and its potential impact.

✅ Verification successful

Primitive operations flag is correctly used for JIT compilation

The _set_prim_all_enabled flag is a necessary technical requirement for Paddle's JIT compilation process, used in conjunction with other JIT-related flags (FLAGS_prim_enable_dynamic, FLAGS_enable_pir_api). This is not a performance concern but rather a requirement for proper model serialization.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other usages of primitive operations flag
rg -l "paddle.framework.core._set_prim_all_enabled" 

# Search for any performance-related comments or documentation
rg -i "primitive.*performance|performance.*primitive"

Length of output: 140


Script:

#!/bin/bash
# Check the file content for comments and context
cat deepmd/pd/utils/serialization.py

# Search for Paddle backend documentation
fd README -e md | xargs rg -i "paddle.*backend|primitive.*operation"

# Search for related configuration settings
rg -i "paddle.*config|paddle.*setting"

Length of output: 3682

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/ignorelist.txt (1)

1-1: Fix grammatical error in the comment.

The comment has a grammatical error.

-# Memory leak from functions call below can be ignored
+# Memory leaks from function calls below can be ignored
🧰 Tools
🪛 LanguageTool

[grammar] ~1-~1: The singular proper name ‘Memory’ must be used with a third-person or a past tense verb.
Context: # Memory leak from functions call below can be ignore...

(HE_VERB_AGR)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26f8010 and 60dabca.

📒 Files selected for processing (1)
  • .github/workflows/ignorelist.txt (1 hunks)
🧰 Additional context used
🪛 LanguageTool
.github/workflows/ignorelist.txt

[grammar] ~1-~1: The singular proper name ‘Memory’ must be used with a third-person or a past tense verb.
Context: # Memory leak from functions call below can be ignore...

(HE_VERB_AGR)

🔇 Additional comments (1)
.github/workflows/ignorelist.txt (1)

2-6: Consider a more targeted approach to memory leak detection.

While ignoring these functions addresses the immediate memory leak warnings, it might mask other potential issues. Consider:

  1. Adding destructors to the operator-registered classes where possible.
  2. Limiting the scope of ignored functions instead of excluding the entire module.

Let's verify if there are any other memory leaks that might be masked by these broad exclusions:

@HydrogenSulfate
Contributor Author

HydrogenSulfate commented Jan 9, 2025

@njzjz I found that using LSAN_OPTIONS might be better than an ignore list: some leak reports remained even after I had written a list of match patterns, while LSAN_OPTIONS seems to work well and the related tests passed.

https://clang.llvm.org/docs/AddressSanitizer.html#suppressing-memory-leaks and https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer#suppressions
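Following those docs, a minimal suppression setup might look like the sketch below (the `leak:libpaddle_inference` pattern matches the entry later confirmed in .github/workflows/suppr.txt; the file name and path are illustrative):

```
# suppr.txt — LeakSanitizer suppression patterns, one per line.
# Suppress leaks reported from Paddle's operator-registration statics:
leak:libpaddle_inference
```

The tests are then run with `LSAN_OPTIONS=suppressions=/path/to/suppr.txt`, so only unsuppressed leaks fail the job.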

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/test_cc.yml (1)

65-65: Refactor duplicated library paths.

The Paddle library paths are duplicated in two environment configurations. Consider extracting these paths into a reusable environment variable to improve maintainability.

Example refactor:

env:
  PADDLE_LIBS: ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir
  BASE_LD_LIBRARY_PATH: ${{ github.workspace }}/dp_test/lib:${{ github.workspace }}/libtorch/lib
  PADDLE_LD_LIBRARY_PATH: ${{ env.PADDLE_LIBS }}/paddle/lib:${{ env.PADDLE_LIBS }}/third_party/install/onednn/lib:${{ env.PADDLE_LIBS }}/third_party/install/mklml/lib

# Then use:
LD_LIBRARY_PATH: ${{ env.BASE_LD_LIBRARY_PATH }}:${{ env.PADDLE_LD_LIBRARY_PATH }}:${LD_LIBRARY_PATH}

Also applies to: 74-74

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e5a9f7 and 8940bd9.

📒 Files selected for processing (1)
  • .github/workflows/test_cc.yml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (20)
  • GitHub Check: Test Python (2, 3.12)
  • GitHub Check: Test Python (2, 3.9)
  • GitHub Check: Test Python (1, 3.12)
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Analyze (python)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Test C++ (false)
  • GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
  • GitHub Check: Build C++ (cuda, cuda)
  • GitHub Check: Test C++ (true)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Build C++ (cpu, cpu)
🔇 Additional comments (2)
.github/workflows/test_cc.yml (2)

54-57: LGTM: Switching to clang for better sanitizer support.

The change to use clang aligns with the previous discussion about GCC's limitations with sanitizer support. The leak sanitizer configuration looks good.


57-57: Verify the leak suppression configuration.

Please ensure that the suppressions file .github/workflows/suppr.txt exists and contains the necessary entries for the operator-registered classes mentioned in the PR discussion.

Run this script to verify the suppressions file:

✅ Verification successful

Leak suppression configuration is correctly set up

The suppressions file exists and contains the appropriate entry to suppress memory leaks from Paddle operator-registered classes through the libpaddle_inference pattern.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the leak suppression configuration

# Check if the suppressions file exists
if [ ! -f .github/workflows/suppr.txt ]; then
    echo "Error: Suppressions file not found"
    exit 1
fi

# Check the content of the suppressions file
echo "Content of suppressions file:"
cat .github/workflows/suppr.txt

# Verify if it contains entries for Paddle
echo "Checking for Paddle-related suppressions:"
grep -i "paddle" .github/workflows/suppr.txt

Length of output: 335
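For readers unfamiliar with the format: an LSan suppressions file is a list of `leak:<pattern>` lines matched against frames in each reported leak's stack, wired in at runtime via `LSAN_OPTIONS`. A minimal sketch (the file name and wiring here are illustrative; the `libpaddle_inference` pattern is the one verified above):

```shell
# Create a suppressions file that hides leaks whose stack trace
# contains libpaddle_inference (the operator-registration leaks
# discussed in this thread).
cat > suppr.txt <<'EOF'
leak:libpaddle_inference
EOF

# Point LeakSanitizer at it before running the instrumented binary.
export LSAN_OPTIONS="suppressions=$PWD/suppr.txt"
echo "suppressions file ready"
```

Patterns are substring/glob matches against function, module, and source-file names in the leak report, so a single `leak:libpaddle_inference` line covers every leak attributed to that shared library.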

@HydrogenSulfate
Contributor Author

HydrogenSulfate commented Jan 9, 2025

Some .so files cannot be found when I append the Paddle inference library directories to LD_LIBRARY_PATH in the env:
(screenshots of the missing-library errors attached)
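One way to pin down which dependencies the dynamic loader cannot resolve is `ldd`, which honors `LD_LIBRARY_PATH`, so running it with and without the Paddle paths exported shows exactly which entries fix the lookup. A sketch (the `TARGET` path is a stand-in — point it at the plugin or library that fails to load):

```shell
# Check a binary/library for unresolved shared-library dependencies.
# TARGET here is a placeholder; substitute the failing .so.
TARGET="$(command -v ls)"
if ldd "$TARGET" | grep -q "not found"; then
  echo "unresolved deps:"
  ldd "$TARGET" | grep "not found"
else
  echo "all dependencies resolved"
fi
```

Any line printed as `libfoo.so => not found` names a directory that still has to be added to `LD_LIBRARY_PATH` (or copied next to the consumer, as the workflow steps below do).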

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
.github/workflows/test_cc.yml (1)

54-57: Consider adding fsanitize-ignorelist for more granular leak detection.

The switch to clang is good, but the current configuration might miss the opportunity for more granular control over leak detection. While using a suppressions file is valid, fsanitize-ignorelist could provide more precise control over which functions to ignore.

Consider updating the CXXFLAGS to include fsanitize-ignorelist:

-        CXXFLAGS: ${{ matrix.check_memleak && '-fsanitize=leak' || '' }}
+        CXXFLAGS: ${{ matrix.check_memleak && format('-fsanitize=leak -fsanitize-ignorelist={0}/.github/workflows/ignorelist.txt', github.workspace) || '' }}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8940bd9 and 2b7481e.

📒 Files selected for processing (1)
  • .github/workflows/test_cc.yml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (21)
  • GitHub Check: Test Python (3, 3.12)
  • GitHub Check: Build wheels for cp310-manylinux_aarch64
  • GitHub Check: Test Python (2, 3.12)
  • GitHub Check: Build wheels for cp311-win_amd64
  • GitHub Check: Test Python (2, 3.9)
  • GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
  • GitHub Check: Build wheels for cp311-macosx_arm64
  • GitHub Check: Test Python (1, 3.12)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Build wheels for cp311-macosx_x86_64
  • GitHub Check: Test Python (1, 3.9)
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build wheels for cp311-manylinux_x86_64
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Build C++ (cuda, cuda)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Test C++ (false)
  • GitHub Check: Test C++ (true)
  • GitHub Check: Analyze (c-cpp)

Comment on lines 72 to 76
- run: |
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/paddle/lib/*.so /dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/onednn/lib/*.so /dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/mklml/lib/*.so /dp_test/lib/
pytest --cov=deepmd source/ipi/tests

🛠️ Refactor suggestion

Reduce code duplication in test setup.

The library copying steps are duplicated between LAMMPS and IPI tests. Consider extracting this into a reusable step.

Create a composite action or shell script for the shared setup:

# .github/actions/setup-paddle-libs/action.yml
name: 'Setup Paddle Libraries'
description: 'Copy Paddle Inference libraries to test directory'
runs:
  using: "composite"
  steps:
    - shell: bash
      run: |
        mkdir -p ${{ github.workspace }}/dp_test/lib/
        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/paddle/lib/*.so ${{ github.workspace }}/dp_test/lib/
        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/onednn/lib/*.so ${{ github.workspace }}/dp_test/lib/
        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/mklml/lib/*.so ${{ github.workspace }}/dp_test/lib/

Then use it in the workflow:

-    - run: |
-        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/paddle/lib/*.so /dp_test/lib/
-        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/onednn/lib/*.so /dp_test/lib/
-        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/mklml/lib/*.so /dp_test/lib/
+    - uses: ./.github/actions/setup-paddle-libs

Comment on lines 59 to 63
- run: |
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/paddle/lib/*.so /dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/onednn/lib/*.so /dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/mklml/lib/*.so /dp_test/lib/
pytest --cov=deepmd source/lmp/tests

⚠️ Potential issue

Fix potential permission issues with absolute path.

Using the absolute path /dp_test/lib/ might cause permission issues in different CI environments. Consider using a relative path within the workspace.

Update the copy commands to use a relative path:

-        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/paddle/lib/*.so /dp_test/lib/
-        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/onednn/lib/*.so /dp_test/lib/
-        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/mklml/lib/*.so /dp_test/lib/
+        mkdir -p ${{ github.workspace }}/dp_test/lib/
+        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/paddle/lib/*.so ${{ github.workspace }}/dp_test/lib/
+        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/onednn/lib/*.so ${{ github.workspace }}/dp_test/lib/
+        cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/mklml/lib/*.so ${{ github.workspace }}/dp_test/lib/
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- run: |
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/paddle/lib/*.so /dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/onednn/lib/*.so /dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/mklml/lib/*.so /dp_test/lib/
pytest --cov=deepmd source/lmp/tests
- run: |
mkdir -p ${{ github.workspace }}/dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/paddle/lib/*.so ${{ github.workspace }}/dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/onednn/lib/*.so ${{ github.workspace }}/dp_test/lib/
cp ${{ github.workspace }}/source/build_tests/paddle_inference_install_dir/third_party/install/mklml/lib/*.so ${{ github.workspace }}/dp_test/lib/
pytest --cov=deepmd source/lmp/tests

@HydrogenSulfate
Contributor Author

HydrogenSulfate commented Jan 9, 2025

@njzjz It seems that there is only one last issue remaining, but I'm not sure how to resolve it. I don't see any connection between this and the Paddle inference library: https://github.com/deepmodeling/deepmd-kit/actions/runs/12690039613/job/35370052640?pr=4467

---------------------------- Captured stdout setup -----------------------------
LAMMPS (29 Aug 2024 - Update 1)
using 1 OpenMP thread(s) per MPI task
Open of file /home/runner/work/deepmd-kit/deepmd-kit/dp_test/lib/deepmd_lmp/dpplugin.so failed: /home/runner/work/deepmd-kit/deepmd-kit/dp_test/lib/deepmd_lmp/dpplugin.so: undefined symbol: _Z9MPI_BcastPviiii
Loaded 0 plugins from /home/runner/work/deepmd-kit/deepmd-kit/dp_test/lib/deepmd_lmp
LAMMPS output is captured by PyLammps wrapper
WARNING: run thermo data not captured since PYTHON LAMMPS package is not enabled

similar issue here: #2739 , #2071 , #3060
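As a side note, the mangled name in that error can be decoded with binutils' `c++filt` to see what the plugin expected at load time:

```shell
# Demangle the unresolved symbol from the dpplugin.so load error above.
c++filt _Z9MPI_BcastPviiii
# -> MPI_Bcast(void*, int, int, int, int)
```

The fact that `MPI_Bcast` appears as a C++-mangled symbol (rather than a plain C symbol) suggests the plugin was compiled against an MPI_Bcast declaration without `extern "C"` linkage — e.g. a stub or serial-MPI header — which the MPI runtime loaded by LAMMPS does not export; this is consistent with the linked issues.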

.github/workflows/test_cc.yml — outdated review thread (resolved)
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
source/lmp/tests/test_lammps_pd.py (4)

239-243: Clean up the md_file after tests

The md_file is generated during several tests but is not removed in teardown_module, which could leave behind temporary files and consume unnecessary disk space.

Apply this diff to remove md_file in teardown_module:

 def teardown_module():
     os.remove(data_file)
     os.remove(data_type_map_file)
+    os.remove(md_file)

27-28: Remove unused variable system_file

The variable system_file is defined but not used anywhere in the code. Unused variables can lead to confusion and should be removed to improve code clarity.

Apply this diff to remove the unused variable:

-system_file = Path(__file__).parent.parent.parent / "tests"

250-272: Simplify unit handling in _lammps function

The unit handling in the _lammps function repeats similar if-elif-else blocks. This can be simplified by using a dictionary to map units to their corresponding values, improving code maintainability and readability.

Apply this diff to simplify unit handling:

 def _lammps(data_file, units="metal") -> PyLammps:
     lammps = PyLammps()
     lammps.units(units)
     lammps.boundary("p p p")
     lammps.atom_style("atomic")
-    if units == "metal" or units == "real":
-        lammps.neighbor("2.0 bin")
-    elif units == "si":
-        lammps.neighbor("2.0e-10 bin")
-    else:
-        raise ValueError("units should be metal, real, or si")
+    neighbor_settings = {
+        "metal": "2.0 bin",
+        "real": "2.0 bin",
+        "si": "2.0e-10 bin"
+    }
+    try:
+        lammps.neighbor(neighbor_settings[units])
+    except KeyError:
+        raise ValueError("units should be 'metal', 'real', or 'si'")
     lammps.neigh_modify("every 10 delay 0 check no")
     lammps.read_data(data_file.resolve())
-    if units == "metal" or units == "real":
-        lammps.mass("1 16")
-        lammps.mass("2 2")
-    elif units == "si":
-        lammps.mass("1 %.10e" % (16 * constants.mass_metal2si))
-        lammps.mass("2 %.10e" % (2 * constants.mass_metal2si))
-    else:
-        raise ValueError("units should be metal, real, or si")
+    mass_settings = {
+        "metal": [("1", "16"), ("2", "2")],
+        "real": [("1", "16"), ("2", "2")],
+        "si": [
+            ("1", f"{16 * constants.mass_metal2si:.10e}"),
+            ("2", f"{2 * constants.mass_metal2si:.10e}")
+        ]
+    }
+    try:
+        for atom_type, mass in mass_settings[units]:
+            lammps.mass(f"{atom_type} {mass}")
+    except KeyError:
+        raise ValueError("units should be 'metal', 'real', or 'si'")
     if units == "metal":
         lammps.timestep(0.0005)
     elif units == "real":
         lammps.timestep(0.5)
     elif units == "si":
         lammps.timestep(5e-16)
     else:
         raise ValueError("units should be metal, real, or si")
     lammps.fix("1 all nve")
     return lammps

703-708: Skip MPI tests if Paddle backend is not enabled

The MPI tests may fail if the Paddle backend is not enabled in the build configuration. To avoid false negatives in testing environments where Paddle is not available, consider adding a skip condition.

Apply this diff to skip tests conditionally:

 @pytest.mark.skipif(
     shutil.which("mpirun") is None, reason="MPI is not installed on this system"
 )
 @pytest.mark.skipif(
     importlib.util.find_spec("mpi4py") is None, reason="mpi4py is not installed"
 )
+@pytest.mark.skipif(
+    not constants.ENABLE_PADDLE, reason="Paddle backend is not enabled"
+)
 @pytest.mark.parametrize(
     ("balance_args",),
     [(["--balance"],), ([],)],
 )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b761e83 and 58711c7.

📒 Files selected for processing (1)
  • source/lmp/tests/test_lammps_pd.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Analyze (python)
  • GitHub Check: Build C++ (cuda, cuda)
  • GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Analyze (c-cpp)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
🔇 Additional comments (2)
source/lmp/tests/test_lammps_pd.py (2)

239-243: Ensure all test artifacts are properly cleaned up

data_file_si is created during setup_module but is not removed in teardown_module, which could leave behind temporary files.

Apply this diff to remove data_file_si in teardown_module:

 def teardown_module():
     os.remove(data_file)
     os.remove(data_type_map_file)
+    os.remove(data_file_si)

737-743: Add assertions for empty or malformed md data

In test_pair_deepmd_mpi, after loading md = np.loadtxt(md_file.resolve()), there are no checks to ensure that the data is valid. If the md file is empty or malformed, it could lead to misleading test results. Consider adding assertions to validate the md data before proceeding.

Apply this diff to add data validation:

     md = np.loadtxt(md_file.resolve())
+    assert md.size > 0, "Loaded md data is empty"
+    assert not np.any(np.isnan(md)), "md data contains NaN values"
✅ Verification successful

The suggested validation checks are appropriate and should be implemented

The suggested validation checks are well-placed and address real potential issues:

  • Empty data check prevents IndexError in subsequent array indexing
  • NaN check prevents invalid statistical calculations
  • The validation is placed at the correct location, immediately after loading the data

The review comment's suggestion should be applied as proposed:

     md = np.loadtxt(md_file.resolve())
+    assert md.size > 0, "Loaded md data is empty"
+    assert not np.any(np.isnan(md)), "md data contains NaN values"
🏁 Scripts executed

The following scripts were executed for the analysis:

Error: No shell scripts found


Script:

#!/bin/bash
# Find the complete test_pair_deepmd_mpi function implementation
ast-grep --pattern 'def test_pair_deepmd_mpi($$$):
    $$$
'  source/lmp/tests/test_lammps_pd.py

# Find similar md data loading patterns
rg "np\.loadtxt.*md.*" -A 5 source/lmp/tests/

# Look for existing validation patterns with md data
rg "assert.*md" source/lmp/tests/

Length of output: 107378

Comment on lines +223 to +225
sp.check_output(
f"{sys.executable} -m deepmd convert-from pbtxt -i {pbtxt_file2.resolve()} -o {pb_file2.resolve()}".split()
)

🛠️ Refactor suggestion

Handle exceptions when converting model files

The sp.check_output call that converts the model files does not handle exceptions. If the conversion fails, it could cause the tests to crash unexpectedly. Consider adding a try-except block to handle potential errors gracefully.

Apply this diff to handle exceptions:

+try:
     sp.check_output(
         f"{sys.executable} -m deepmd convert-from pbtxt -i {pbtxt_file2.resolve()} -o {pb_file2.resolve()}".split()
     )
+except sp.CalledProcessError as e:
+    pytest.fail(f"Model conversion failed: {e}")
📝 Committable suggestion


Suggested change
sp.check_output(
f"{sys.executable} -m deepmd convert-from pbtxt -i {pbtxt_file2.resolve()} -o {pb_file2.resolve()}".split()
)
try:
sp.check_output(
f"{sys.executable} -m deepmd convert-from pbtxt -i {pbtxt_file2.resolve()} -o {pb_file2.resolve()}".split()
)
except sp.CalledProcessError as e:
pytest.fail(f"Model conversion failed: {e}")

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
source/lmp/tests/test_lammps_pd.py (3)

247-278: Refactor duplicate error messages and document magic numbers.

Consider the following improvements:

  1. Extract the repeated error message into a constant to avoid duplication
  2. Document or extract the hardcoded neighbor cutoff value (2.0) into a named constant with explanation
+# At the top of the file with other constants
+ALLOWED_UNITS = ["metal", "real", "si"]
+NEIGHBOR_CUTOFF = 2.0  # Angstroms for metal/real, meters for si
+NEIGHBOR_CUTOFF_SI = NEIGHBOR_CUTOFF * 1e-10  # Convert to meters for si units
+
 def _lammps(data_file, units="metal") -> PyLammps:
     lammps = PyLammps()
     lammps.units(units)
     lammps.boundary("p p p")
     lammps.atom_style("atomic")
     if units == "metal" or units == "real":
-        lammps.neighbor("2.0 bin")
+        lammps.neighbor(f"{NEIGHBOR_CUTOFF} bin")
     elif units == "si":
-        lammps.neighbor("2.0e-10 bin")
+        lammps.neighbor(f"{NEIGHBOR_CUTOFF_SI} bin")
     else:
-        raise ValueError("units should be metal, real, or si")
+        raise ValueError(f"units should be one of {ALLOWED_UNITS}")

343-691: Extract common model deviation validation logic.

The test functions for model deviation calculations contain repeated patterns for validating forces and virial tensors. Consider extracting these common validation patterns into helper functions to improve maintainability and reduce duplication.

Example refactor:

def validate_model_deviation_forces(md, expected_f, expected_f2, unit_conversion=1.0):
    """Helper to validate force-related model deviations."""
    expected_md_f = np.linalg.norm(np.std([expected_f, expected_f2], axis=0), axis=1)
    assert md[7:] == pytest.approx(expected_md_f * unit_conversion)
    assert md[4] == pytest.approx(np.max(expected_md_f) * unit_conversion)
    assert md[5] == pytest.approx(np.min(expected_md_f) * unit_conversion)
    assert md[6] == pytest.approx(np.mean(expected_md_f) * unit_conversion)

def validate_model_deviation_virial(md, expected_v, expected_v2, unit_conversion=1.0):
    """Helper to validate virial-related model deviations."""
    expected_md_v = (
        np.std([np.sum(expected_v, axis=0), np.sum(expected_v2, axis=0)], axis=0) / 6
    )
    assert md[1] == pytest.approx(np.max(expected_md_v) * unit_conversion)
    assert md[2] == pytest.approx(np.min(expected_md_v) * unit_conversion)
    assert md[3] == pytest.approx(
        np.sqrt(np.mean(np.square(expected_md_v))) * unit_conversion
    )

706-752: Parameterize the number of MPI processes.

The test hardcodes the number of MPI processes to 2. Consider parameterizing this value to test with different numbers of processes and document why 2 was chosen as the default.

+# At the top of the file with other constants
+DEFAULT_MPI_PROCS = 2  # Minimum number of processes needed to test MPI functionality
+
 @pytest.mark.skipif(
     shutil.which("mpirun") is None, reason="MPI is not installed on this system"
 )
 @pytest.mark.skipif(
     importlib.util.find_spec("mpi4py") is None, reason="mpi4py is not installed"
 )
 @pytest.mark.parametrize(
-    ("balance_args",),
-    [(["--balance"],), ([],)],
+    ("balance_args", "n_procs"),
+    [
+        (["--balance"], DEFAULT_MPI_PROCS),
+        ([], DEFAULT_MPI_PROCS),
+        (["--balance"], DEFAULT_MPI_PROCS * 2),  # Test with more processes
+    ],
 )
-def test_pair_deepmd_mpi(balance_args: list):
+def test_pair_deepmd_mpi(balance_args: list, n_procs: int):
     with tempfile.NamedTemporaryFile() as f:
         sp.check_call(
             [
                 "mpirun",
                 "-n",
-                "2",
+                str(n_procs),
                 sys.executable,
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 58711c7 and d265094.

📒 Files selected for processing (1)
  • source/lmp/tests/test_lammps_pd.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Analyze (python)
  • GitHub Check: Build C++ (cuda, cuda)
  • GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
  • GitHub Check: Analyze (c-cpp)
🔇 Additional comments (4)
source/lmp/tests/test_lammps_pd.py (4)

1-21: LGTM! Well-organized imports.

The file has appropriate license header and well-organized imports that follow Python best practices.


226-228: Handle exceptions when converting model files.

The sp.check_output call that converts the model files does not handle exceptions. If the conversion fails, it could cause the tests to crash unexpectedly.


242-245: Ensure all test artifacts are properly cleaned up.

data_file_si is created during setup but is never removed in teardown_module, which could leave behind temporary files.


280-306: LGTM! Well-structured test fixtures.

The fixtures follow pytest best practices:

  • Proper resource management with yield and cleanup
  • Clear separation of concerns for different unit systems
  • Consistent pattern across all fixtures

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
source/lmp/tests/test_lammps_pd.py (1)

308-309: Refactor to reduce code duplication in test functions

Multiple test functions, such as test_pair_deepmd, test_pair_deepmd_virial, and others, contain repeated code for setting up the pair_style, pair_coeff, running the simulation, and performing assertions. To improve maintainability and reduce duplication, consider extracting common code into helper functions or parameterized tests.

Also applies to: 320-321, 343-344, 369-370, 408-409, 438-439, 471-472, 483-484, 497-498, 530-531, 560-561, 624-625, 657-658, 694-695

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d265094 and 05759cc.

📒 Files selected for processing (1)
  • source/lmp/tests/test_lammps_pd.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (14)
  • GitHub Check: Test Python (4, 3.12)
  • GitHub Check: Test Python (3, 3.12)
  • GitHub Check: Test Python (3, 3.9)
  • GitHub Check: Test Python (2, 3.12)
  • GitHub Check: Test Python (2, 3.9)
  • GitHub Check: Test Python (1, 3.12)
  • GitHub Check: Test Python (1, 3.9)
  • GitHub Check: Build C++ (clang, clang)
  • GitHub Check: Build C++ (rocm, rocm)
  • GitHub Check: Build C++ (cuda120, cuda)
  • GitHub Check: Build C++ (cuda, cuda)
  • GitHub Check: Analyze (python)
  • GitHub Check: Build C++ (cpu, cpu)
  • GitHub Check: Analyze (c-cpp)
🔇 Additional comments (2)
source/lmp/tests/test_lammps_pd.py (2)

226-228: Handle exceptions when converting model files

The sp.check_output call that converts the model files does not handle exceptions. If the conversion fails, it could cause the tests to crash unexpectedly. Consider adding a try-except block to handle potential errors gracefully.


242-244: Ensure all test artifacts are properly cleaned up

data_file_si is created during setup_module() but is not removed in teardown_module(). This could leave behind temporary files after tests are run. Please add the removal of data_file_si to prevent cluttering the filesystem.

Apply this diff to clean up the test artifact:

 def teardown_module():
     os.remove(data_file)
     os.remove(data_type_map_file)
+    os.remove(data_file_si)
