-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazily import torch/pydantic
in json
module, speedup from monty.json import
by 10x
#713
Lazily import torch/pydantic
in json
module, speedup from monty.json import
by 10x
#713
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #713 +/- ##
==========================================
- Coverage 81.02% 80.99% -0.03%
==========================================
Files 27 27
Lines 1586 1589 +3
Branches 361 285 -76
==========================================
+ Hits 1285 1287 +2
- Misses 236 239 +3
+ Partials 65 63 -2 ☔ View full report in Codecov by Sentry. |
MontyEncoder
and MontyDecoder
torch
for MontyEncoder
and MontyDecoder
952eb94
to
f389e24
Compare
torch
for MontyEncoder
and MontyDecoder
torch/numpy/pydantic/ruamel
for MontyEncoder
and MontyDecoder
torch/numpy/pydantic/ruamel
for MontyEncoder
and MontyDecoder
torch/numpy/pydantic
for MontyEncoder
and MontyDecoder
torch/numpy/pydantic
for MontyEncoder
and MontyDecoder
torch/numpy/pydantic
in json
module
if torch is not None and isinstance(o, torch.Tensor): | ||
# Support for Pytorch Tensors. | ||
# Support for Pytorch Tensors | ||
if _check_type(o, "torch.Tensor"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be the correct type string:
import torch
tensor = torch.randn(3, 3)
print(type(tensor).mro())
Gives: [<class 'torch.Tensor'>, <class 'torch._C.TensorBase'>, <class 'object'>]
@@ -660,7 +644,7 @@ def default(self, o) -> dict: | |||
raise AttributeError(e) | |||
|
|||
try: | |||
if pydantic is not None and isinstance(o, pydantic.BaseModel): | |||
if _check_type(o, "pydantic.main.BaseModel"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from pydantic import BaseModel
class MyModel(BaseModel):
name: str
model_instance = MyModel(name="monty")
print(type(model_instance).mro())
Gives: [<class '__main__.MyModel'>, <class 'pydantic.main.BaseModel'>, <class 'object'>]
torch/numpy/pydantic
in json
moduletorch/numpy/pydantic
in json
module, speedup from monty.json import
by 10x
I don't think we need to lazy import numpy. Numpy is used everywhere. We need to keep in mind that the import cost will be incurred regardless. I am fine with lazy importing time, pydantic and torch since those are used much less commonly. |
Cool thanks for the comment! Sure thing I would revert the part for Can you please review #714 which would unblock this PR (otherwise nesting those import blocks would make it really hard to see what optional dependency is required)? Really appreciate that |
torch/numpy/pydantic
in json
module, speedup from monty.json import
by 10xtorch/pydantic
in json
module, speedup from monty.json import
by 10x
I merged 714 |
Thank you again! Would clean up this PR now |
WalkthroughThe changes in the pull request focus on enhancing type hinting, error handling, and code structure in the Changes
Possibly related PRs
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (4)
src/monty/json.py (4)
Line range hint
574-580
: Replace deprecatedo.type()
witho.dtype
inMontyEncoder
The use of
o.type()
is deprecated in recent versions of PyTorch. It's recommended to useo.dtype
for compatibility with newer versions.Apply the following changes to update the code:
if _check_type(o, "torch.Tensor"): - d: dict[str, Any] = { + d = { "@module": "torch", "@class": "Tensor", - "dtype": o.type(), + "dtype": str(o.dtype), } - if "Complex" in o.type(): + if "Complex" in str(o.dtype): d["data"] = [o.real.tolist(), o.imag.tolist()] else: d["data"] = o.numpy().tolist() return dThis ensures future compatibility by using
o.dtype
and converting it to a string where necessary.
925-925
: Redundant line without codeLine 925 appears to be an empty or redundant line without any code changes.
Consider removing this line to maintain code cleanliness.
965-965
: Unnecessary line without code changeLine 965 does not contain any code changes and can be removed for clarity.
Removing empty lines helps keep the code concise.
Line range hint
1005-1009
: Improve handling of Pydantic BaseModel injsanitize
The check for
pydantic.main.BaseModel
should handle cases where Pydantic may not be installed, preventing potentialImportError
.Update the code to safely check for Pydantic BaseModel:
+try: if _check_type(obj, "pydantic.main.BaseModel"): return jsanitize( MontyEncoder().default(obj), strict=strict, allow_bson=allow_bson, enum_values=enum_values, recursive_msonable=recursive_msonable, ) +except ImportError: + passThis ensures that the code does not raise an exception if Pydantic is not available.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- src/monty/json.py (17 hunks)
🧰 Additional context used
🔇 Additional comments (6)
src/monty/json.py (6)
27-28
: Efficient use ofTYPE_CHECKING
to optimize importsThe use of
TYPE_CHECKING
to conditionally importAny
ensures that the import occurs only during type checking, reducing runtime overhead.
44-44
: Added return type annotation for_load_redirect
Including the return type
-> dict
enhances type checking and improves code readability.
55-55
: Type annotation forredirect_dict
enhances claritySpecifying the type
dict
forredirect_dict
aids in static analysis and clarifies the expected data structure.
71-71
: Updated_check_type
signature for flexibilityAllowing
type_str
to be either astr
ortuple[str, ...]
makes the function more versatile in handling multiple type strings.
105-105
: Optimized type checking in_check_type
The use of a generator expression with
any()
efficiently checks if the object's type matches any intype_str
.
322-326
: Conditional import forpydantic_core
with clear error handlingImporting
pydantic_core
within the method and providing a specific error message if it's not available ensures that the dependency is only required when needed, and users are informed appropriately.
@shyuep ready for review, thanks |
6495fa3
to
6f9580c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
.github/workflows/test.yml (1)
12-12
: Approved: Python version matrix update enhances testing precision.The change from
["3.9", "3.x"]
to["3.9", "3.12"]
ensures testing on both an older stable version (3.9) and the latest Python version (3.12). This aligns well with the benchmarking environment mentioned in the PR objectives and will help catch any Python 3.12-specific issues early.Consider adding an intermediate version (e.g., 3.11) to the matrix to catch potential issues with versions between 3.9 and 3.12:
python-version: ["3.9", "3.11", "3.12"]This would provide more comprehensive coverage across Python versions without significantly increasing the testing time.
Summary
torch/pydantic
formonty.json
, thanks for @janosh's profiling inpymatviz
import cost way too high janosh/pymatviz#209 (comment), would significantly speed upfrom monty.json import MSONable/MontyEncoder/MontyDecoder/...
Performance comparison
python -X importtime -c "from monty.json import MSONable" 2> monty.log && tuna monty.log
Benchmark Script (by GPT)
Import time from master branch
Import time traced for
from monty.json import MSONable
:Import time from tip of this PR
After lazily import
torch
After lazily import
numpy
as well (reverted, as use of np is very common)After lazily import
pydantic
too (limited improvement from here)After lazily import
ruamel.yaml
(reverted, no obvious improvement)Summary by CodeRabbit