Skip to content

Commit

Permalink
Improve docs for flake8-use-pathlib rules (#14741)
Browse files Browse the repository at this point in the history
Flag the perf impact more clearly, add more links, clarify the rule
about the glob module
  • Loading branch information
AlexWaygood authored Dec 3, 2024
1 parent bf0fd04 commit 70bd106
Show file tree
Hide file tree
Showing 7 changed files with 214 additions and 121 deletions.
28 changes: 18 additions & 10 deletions crates/ruff_linter/src/rules/flake8_use_pathlib/rules/glob_rule.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ use ruff_diagnostics::Violation;
use ruff_macros::{derive_message_formats, ViolationMetadata};

/// ## What it does
/// Checks for the use of `glob` and `iglob`.
/// Checks for the use of `glob.glob()` and `glob.iglob()`.
///
/// ## Why is this bad?
/// `pathlib` offers a high-level API for path manipulation, as compared to
Expand All @@ -12,35 +12,43 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
/// improve readability over their low-level counterparts (e.g.,
/// `glob.glob()`).
///
/// Note that `glob.glob` and `Path.glob` are not exact equivalents:
/// Note that `glob.glob()` and `Path.glob()` are not exact equivalents:
///
/// | | `glob` | `Path.glob` |
/// |-------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|
/// | Hidden files | Excludes hidden files by default. From Python 3.11 onwards, the `include_hidden` keyword can be used to include hidden directories. | Includes hidden files by default. |
/// | Iterator | `iglob` returns an iterator. Under the hood, `glob` simply converts the iterator to a list. | `Path.glob` returns an iterator. |
/// | Working directory | `glob` takes a `root_dir` keyword to set the current working directory. | `Path.rglob` can be used to return the relative path. |
/// | Globstar (`**`) | `glob` requires the `recursive` flag to be set to `True` for the `**` pattern to match any files and zero or more directories, subdirectories, and symbolic links. | The `**` pattern in `Path.glob` means "this directory and all subdirectories, recursively". In other words, it enables recursive globbing. |
/// | | `glob`-module functions | `Path.glob()` |
/// |-------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|
/// | Hidden files | Hidden files are excluded by default. On Python 3.11+, the `include_hidden` keyword can be used to include hidden directories. | Includes hidden files by default. |
/// | Eagerness | `glob.iglob()` returns a lazy iterator. Under the hood, `glob.glob()` simply converts the iterator to a list. | `Path.glob()` returns a lazy iterator. |
/// | Working directory | `glob.glob()` and `glob.iglob()` take a `root_dir` keyword to set the current working directory. | `Path.rglob()` can be used to return the relative path. |
/// | Globstar (`**`) | The `recursive` flag must be set to `True` for the `**` pattern to match any files and zero or more directories, subdirectories, and symbolic links. | The `**` pattern in `Path.glob()` means "this directory and all subdirectories, recursively". In other words, it enables recursive globbing. |
///
/// ## Example
/// ```python
/// import glob
/// import os
///
/// glob.glob(os.path.join(path, "requirements*.txt"))
/// glob.glob(os.path.join("my_path", "requirements*.txt"))
/// ```
///
/// Use instead:
/// ```python
/// from pathlib import Path
///
/// Path(path).glob("requirements*.txt")
/// Path("my_path").glob("requirements*.txt")
/// ```
///
/// ## Known issues
/// While using `pathlib` can improve the readability and type safety of your code,
/// it can be less performant than the lower-level alternatives that work directly with strings,
/// especially on older versions of Python.
///
/// ## References
/// - [Python documentation: `Path.glob`](https://docs.python.org/3/library/pathlib.html#pathlib.Path.glob)
/// - [Python documentation: `Path.rglob`](https://docs.python.org/3/library/pathlib.html#pathlib.Path.rglob)
/// - [Python documentation: `glob.glob`](https://docs.python.org/3/library/glob.html#glob.glob)
/// - [Python documentation: `glob.iglob`](https://docs.python.org/3/library/glob.html#glob.iglob)
/// - [PEP 428 – The pathlib module – object-oriented filesystem paths](https://peps.python.org/pep-0428/)
/// - [Why you should be using pathlib](https://treyhunner.com/2018/12/why-you-should-be-using-pathlib/)
/// - [No really, pathlib is great](https://treyhunner.com/2019/01/no-really-pathlib-is-great/)
#[derive(ViolationMetadata)]
pub(crate) struct Glob {
pub function: String,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,12 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
///
/// ## Why is this bad?
/// `pathlib` offers a high-level API for path manipulation, as compared to
/// the lower-level API offered by `os`.
/// the lower-level API offered by `os.path`.
///
/// When possible, using `Path` object methods such as `Path.stat()` can
/// improve readability over the `os` module's counterparts (e.g.,
/// improve readability over the `os.path` module's counterparts (e.g.,
/// `os.path.getatime()`).
///
/// Note that `os` functions may be preferable if performance is a concern,
/// e.g., in hot loops.
///
/// ## Examples
/// ```python
/// import os
Expand All @@ -29,6 +26,11 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
/// Path(__file__).stat().st_atime
/// ```
///
/// ## Known issues
/// While using `pathlib` can improve the readability and type safety of your code,
/// it can be less performant than the lower-level alternatives that work directly with strings,
/// especially on older versions of Python.
///
/// ## References
/// - [Python documentation: `Path.stat`](https://docs.python.org/3/library/pathlib.html#pathlib.Path.stat)
/// - [Python documentation: `os.path.getatime`](https://docs.python.org/3/library/os.path.html#os.path.getatime)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,12 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
///
/// ## Why is this bad?
/// `pathlib` offers a high-level API for path manipulation, as compared to
/// the lower-level API offered by `os`.
/// the lower-level API offered by `os.path`.
///
/// When possible, using `Path` object methods such as `Path.stat()` can
/// improve readability over the `os` module's counterparts (e.g.,
/// improve readability over the `os.path` module's counterparts (e.g.,
/// `os.path.getctime()`).
///
/// Note that `os` functions may be preferable if performance is a concern,
/// e.g., in hot loops.
///
/// ## Examples
/// ```python
/// import os
Expand All @@ -29,6 +26,11 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
/// Path(__file__).stat().st_ctime
/// ```
///
/// ## Known issues
/// While using `pathlib` can improve the readability and type safety of your code,
/// it can be less performant than the lower-level alternatives that work directly with strings,
/// especially on older versions of Python.
///
/// ## References
/// - [Python documentation: `Path.stat`](https://docs.python.org/3/library/pathlib.html#pathlib.Path.stat)
/// - [Python documentation: `os.path.getctime`](https://docs.python.org/3/library/os.path.html#os.path.getctime)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,12 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
///
/// ## Why is this bad?
/// `pathlib` offers a high-level API for path manipulation, as compared to
/// the lower-level API offered by `os`.
/// the lower-level API offered by `os.path`.
///
/// When possible, using `Path` object methods such as `Path.stat()` can
/// improve readability over the `os` module's counterparts (e.g.,
/// improve readability over the `os.path` module's counterparts (e.g.,
/// `os.path.getmtime()`).
///
/// Note that `os` functions may be preferable if performance is a concern,
/// e.g., in hot loops.
///
/// ## Examples
/// ```python
/// import os
Expand All @@ -29,6 +26,11 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
/// Path(__file__).stat().st_mtime
/// ```
///
/// ## Known issues
/// While using `pathlib` can improve the readability and type safety of your code,
/// it can be less performant than the lower-level alternatives that work directly with strings,
/// especially on older versions of Python.
///
/// ## References
/// - [Python documentation: `Path.stat`](https://docs.python.org/3/library/pathlib.html#pathlib.Path.stat)
/// - [Python documentation: `os.path.getmtime`](https://docs.python.org/3/library/os.path.html#os.path.getmtime)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,12 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
///
/// ## Why is this bad?
/// `pathlib` offers a high-level API for path manipulation, as compared to
/// the lower-level API offered by `os`.
/// the lower-level API offered by `os.path`.
///
/// When possible, using `Path` object methods such as `Path.stat()` can
/// improve readability over the `os` module's counterparts (e.g.,
/// improve readability over the `os.path` module's counterparts (e.g.,
/// `os.path.getsize()`).
///
/// Note that `os` functions may be preferable if performance is a concern,
/// e.g., in hot loops.
///
/// ## Examples
/// ```python
/// import os
Expand All @@ -29,6 +26,11 @@ use ruff_macros::{derive_message_formats, ViolationMetadata};
/// Path(__file__).stat().st_size
/// ```
///
/// ## Known issues
/// While using `pathlib` can improve the readability and type safety of your code,
/// it can be less performant than the lower-level alternatives that work directly with strings,
/// especially on older versions of Python.
///
/// ## References
/// - [Python documentation: `Path.stat`](https://docs.python.org/3/library/pathlib.html#pathlib.Path.stat)
/// - [Python documentation: `os.path.getsize`](https://docs.python.org/3/library/os.path.html#os.path.getsize)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,16 @@ use crate::checkers::ast::Checker;
/// if any(part in blocklist for part in Path("my/file/path").parts):
/// ...
/// ```
///
/// ## Known issues
/// While using `pathlib` can improve the readability and type safety of your code,
/// it can be less performant than working directly with strings,
/// especially on older versions of Python.
///
/// ## References
/// - [PEP 428 – The pathlib module – object-oriented filesystem paths](https://peps.python.org/pep-0428/)
/// - [Why you should be using pathlib](https://treyhunner.com/2018/12/why-you-should-be-using-pathlib/)
/// - [No really, pathlib is great](https://treyhunner.com/2019/01/no-really-pathlib-is-great/)
#[derive(ViolationMetadata)]
pub(crate) struct OsSepSplit;

Expand Down
Loading

0 comments on commit 70bd106

Please sign in to comment.