-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Chore: Analyze dependency and imports using deptry
#534
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
📝 Walkthrough📝 WalkthroughWalkthroughThe pull request introduces significant updates to the Changes
Sequence Diagram(s)sequenceDiagram
participant Developer
participant CI/CD
participant DependencyAnalysis
Developer->>CI/CD: Push changes
CI/CD->>DependencyAnalysis: Trigger dependency analysis job
DependencyAnalysis->>CI/CD: Analyze dependencies
CI/CD-->>Developer: Analysis complete
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
pyproject.toml (3)
75-75
: LGTM! Quick thought about tool integration though...The addition of deptry looks good! Since we have quite a comprehensive set of dev tools already (ruff, mypy, etc.), would you consider adding a deptry check to the
[tool.poe.tasks]
section to make it part of our standard checks? Something like:[tool.poe.tasks] -check = { shell = "ruff check . && mypy . && pytest --collect-only -qq" } +check = { shell = "ruff check . && mypy . && deptry . && pytest --collect-only -qq" }wdyt? 🤔
Line range hint
182-183
: Quick thought about tool overlap! 🤔I notice we have ruff's
I
(isort) rules enabled and comprehensive import configurations. Since deptry also analyzes imports, should we document somewhere (maybe in README.md) how these tools complement each other? This could help clarify:
- ruff (I): import sorting and conventions
- deptry: dependency usage analysis
wdyt?Also applies to: 205-206
Hey there! I've got some interesting findings about those dependencies! 🔍
After checking the actual usage patterns, here's what I found:
yaml
is actively used throughout the codebase for config loading and dumping - this should definitely be added as a direct dependencypytz
is actively used inrecords.py
for timezone handling - this needs to be added tooduckdb
is used in several files including processors and caches - this is a needed dependencypsycopg
is used in test fixtures (conftest.py
) - this is required for testingnumpy
shows no direct usage or imports asnp
- maybe we could drop this one?Would you like to update the dependencies like this? 👇
- Add:
pyyaml
andpytz
as direct dependencies- Keep:
duckdb
andpsycopg
as they're actively used- Remove:
numpy
since it's not used directly (unless it's a transitive dependency?)wdyt? 🤔
🔗 Analysis chain
Line range hint
18-47
: Let's verify those dependency findings from deptry! 🕵️The deptry scan revealed some interesting insights:
- Missing dependencies ('yaml', 'pytz')
- Unused dependencies ('duckdb', 'numpy', 'psycopg')
Would you like me to help verify these findings? We can run a quick check to confirm the usage patterns.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify dependency usage patterns # Check for yaml imports echo "Checking for yaml imports..." rg -l "import yaml|from yaml" || echo "No yaml imports found" # Check for pytz imports echo "Checking for pytz imports..." rg -l "import pytz|from pytz" || echo "No pytz imports found" # Check for duckdb usage echo "Checking for duckdb usage..." rg -l "import duckdb|from duckdb" || echo "No duckdb imports found" # Check for numpy usage echo "Checking for numpy usage..." rg -l "import numpy|from numpy" || echo "No numpy imports found" # Check for psycopg usage echo "Checking for psycopg usage..." rg -l "import psycopg|from psycopg" || echo "No psycopg imports found"Length of output: 1089
Script:
#!/bin/bash # Let's check the actual usage patterns more deeply to understand if these are just imports or actual usage # For yaml - check actual usage beyond imports echo "Checking yaml usage patterns..." rg "yaml\." -A 2 # For numpy - check if it might be used as 'np' echo "Checking for numpy as np..." rg -l "import numpy as np|from numpy as np" # For psycopg - check actual usage in conftest echo "Checking psycopg usage in conftest..." rg "psycopg" tests/conftest.py -A 2 # For duckdb - check if it's used in actual data processing echo "Checking duckdb actual usage..." rg "duckdb\." -A 2 # Check if pytz is used beyond imports echo "Checking pytz usage patterns..." rg "pytz\." -A 2Length of output: 4507
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
📒 Files selected for processing (1)
pyproject.toml
(1 hunks)
/poetry-lock
|
Resolving dependencies... changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
pyproject.toml (2)
Line range hint
14-14
: Python version constraint looks good, but shall we document the reason for <3.13? 🤔The constraint aligns well with sqlalchemy-bigquery's requirement. Would it be helpful to add a comment explaining this relationship, similar to how we documented other version constraints? wdyt?
75-75
: Consider integrating deptry into the CI pipeline? 🛠️Since we're adding deptry for dependency analysis, would it make sense to add it to the
[tool.poe.tasks]
section as a check task? This way, we could run it alongside other checks like ruff and mypy. wdyt?Example addition:
[tool.poe.tasks] -check = { shell = "ruff check . && mypy . && pytest --collect-only -qq" } +check = { shell = "ruff check . && mypy . && deptry . && pytest --collect-only -qq" }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
📒 Files selected for processing (1)
pyproject.toml
(1 hunks)
🔇 Additional comments (1)
pyproject.toml (1)
Line range hint 17-43
: Shall we address the dependency issues flagged by deptry? 📦
The scan revealed several potential issues:
-
Unused dependencies:
- duckdb
- numpy
- psycopg
-
Missing dependencies:
- yaml
- pytz
Would you like me to help verify these findings? We could:
- Check for indirect usage of the "unused" dependencies
- Verify if the missing ones are actually transitive dependencies
Let's check for actual usage:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
.github/workflows/python_pytest.yml (1)
191-211
: Consider uploading deptry results as an artifact.The job looks good! For better visibility and debugging, would you consider uploading the deptry results as an artifact? This could help track dependency issues over time, wdyt?
- name: Run Deptry run: | poetry run deptry . > deptry-results.txt + +- name: Upload Deptry Results + if: always() + uses: actions/upload-artifact@v4 + with: + name: deptry-analysis + path: deptry-results.txt
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
poetry.lock
is excluded by!**/*.lock
📒 Files selected for processing (2)
.github/workflows/python_pytest.yml
(1 hunks)pyproject.toml
(3 hunks)
🔇 Additional comments (1)
pyproject.toml (1)
347-361
: LGTM! Well-structured deptry configuration.
The deptry configuration is clear and well-documented. The ignore rules are thoughtfully organized with helpful comments explaining each exclusion.
deptry
I wanted to know if we could auto-detect dependencies which are added to the project but not used today. The
deptry
tool has some fale positivies (mostly because SQLAlchemy modules aren't imported directly) but otherwise seems to add value.This PR resolves (or ignores) all dependency issues, resulting in removal of a couple unused packages, and declaring transitive dependencies which we referenced directly but not directly declared as needed.
When wanting to not add version constraints for transitive dependencies also used by PyAirbyte, in some cases I applied "*" version restriction - this makes the dependency explicit while not adding any new restrictions beyond what the dependencies already restrict to.
Summary by CodeRabbit
New Features
Chores
>=3.10,<3.13
.