Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8 #1059

Closed
dtolnay opened this issue Aug 2, 2023 · 2 comments · Fixed by #1062
Labels

Comments

@dtolnay
Copy link
Member

dtolnay commented Aug 2, 2023

What version of regex are you using?

1.9.1

Describe the bug at a high level.

I am investigating a recent large memory usage regression in Buck2 that bisected to a regex update from 1.8.4 to 1.9.0, and persists in 1.9.1.

There are some long-lived GlobSet objects in Buck2 corresponding to filepath patterns that buck has been configured to ignore. The real-world code is here: https://github.com/facebook/buck2/blob/4d300b08ec0c742952f4cdd9abf1c1055c7a75ab/app/buck2_common/src/ignores/ignore_set.rs.

I instrumented the RegexSet created by the globset crate, and was able to create the following repro that shows +600MB being allocated and retained by the RegexSet implementation for a seemingly pretty reasonably sized globset.

What are the steps to reproduce the behavior?

cargo run --release:

// [dependencies]
// regex = "=1.9.1"  # or "=1.8.4"

use regex::bytes::RegexSet;
use std::thread;

const THREADS: usize = 80;

const REGEXES: &[&str] = &[
    "(?-u)^(buck\\-out/.*|buck\\-out)$",
    "(?-u)^(\\.buckd/.*|\\.buckd)$",
    "(?-u)^(\\.git/.*|\\.git)$",
    "(?-u)^(\\.hg/.*|\\.hg)$",
    "(?-u)^(\\.hgmonitor\\.status/.*|\\.hgmonitor\\.status)$",
    "(?-u)^(\\.idea/.*|\\.idea)$",
    "(?-u)^(\\.focus\\-android\\.buckd/.*|\\.focus\\-android\\.buckd)$",
    "(?-u)^(\\.focus\\-android\\-buck\\-out/.*|\\.focus\\-android\\-buck\\-out)$",
    "(?-u)^(\\.lsp\\.buckd/.*|\\.lsp\\.buckd)$",
    "(?-u)^\\.cpplsp[^/]*\\-buck\\-out$",
    "(?-u)^(\\.lsp\\-buck\\-out/.*|\\.lsp\\-buck\\-out)$",
    "(?-u)^(\\.ovrsource\\-rest/.*|\\.ovrsource\\-rest)$",
    "(?-u)^cpplsp[^/]*\\-buck\\-out$",
    "(?-u)^(lsp\\-buck\\-out/.*|lsp\\-buck\\-out)$",
    "(?-u)^(lsp\\.buckd/.*|lsp\\.buckd)$",
    "(?-u)^(buck\\-out/.*|buck\\-out)$",
    "(?-u)^(arvr/tools/build_defs/config/.*|arvr/tools/build_defs/config)$",
    "(?-u)^(arvr\\-legacy/.*|arvr\\-legacy)$",
    "(?-u)^(fbandroid/buck\\-out/.*|fbandroid/buck\\-out)$",
    "(?-u)^(fbandroid/\\.lsp\\.buckd/.*|fbandroid/\\.lsp\\.buckd)$",
    "(?-u)^fbandroid/\\.cpplsp[^/]*\\-buck\\-out$",
    "(?-u)^(fbandroid/\\.lsp\\-buck\\-out/.*|fbandroid/\\.lsp\\-buck\\-out)$",
    "(?-u)^(fbandroid/java/com/facebook/catalyst/js/node_modules/.*|fbandroid/java/com/facebook/catalyst/js/node_modules)$",
    "(?-u)^(fbandroid/libraries/fresco/fbcore/.*|fbandroid/libraries/fresco/fbcore)$",
    "(?-u)^(fbandroid/out/lint/.*|fbandroid/out/lint)$",
    "(?-u)^(fbcode/buck\\-out/.*|fbcode/buck\\-out)$",
    "(?-u)^(fbcode/\\.lsp\\.buckd/.*|fbcode/\\.lsp\\.buckd)$",
    "(?-u)^fbcode/\\.cpplsp[^/]*\\-buck\\-out$",
    "(?-u)^(fbcode/\\.lsp\\-buck\\-out/.*|fbcode/\\.lsp\\-buck\\-out)$",
    "(?-u)^(fbobjc/buck\\-out/.*|fbobjc/buck\\-out)$",
    "(?-u)^(fbobjc/\\.lsp\\.buckd/.*|fbobjc/\\.lsp\\.buckd)$",
    "(?-u)^fbobjc/\\.cpplsp[^/]*\\-buck\\-out$",
    "(?-u)^(fbobjc/\\.lsp\\-buck\\-out/.*|fbobjc/\\.lsp\\-buck\\-out)$",
    "(?-u)^(node_modules/.*|node_modules)$",
    "(?-u)^(ovrsource\\-legacy/.*|ovrsource\\-legacy)$",
    "(?-u)^(arvr/projects/rocktenn/libraries/ARCHIVE/.*|arvr/projects/rocktenn/libraries/ARCHIVE)$",
    "(?-u)^(arvr/projects/rocktenn/projects/ARCHIVE/.*|arvr/projects/rocktenn/projects/ARCHIVE)$",
    "(?-u)^(Research/archive/.*|Research/archive)$",
    "(?-u)^(rocket/.*|rocket)$",
    "(?-u)^third\\-party/rpms/systemd/[^/]*/test$",
    "(?-u)^third\\-party/rpms/systemd/[^/]*/units$",
    "(?-u)^(third\\-party/rust/\\.cargo/.*|third\\-party/rust/\\.cargo)$",
    "(?-u)^(xplat/buck\\-out/.*|xplat/buck\\-out)$",
    "(?-u)^(xplat/infinity/app/applications/facebook\\-intern/sonar/templates/.*|xplat/infinity/app/applications/facebook\\-intern/sonar/templates)$",
    "(?-u)^(xplat/js/node_modules/.*|xplat/js/node_modules)$",
    "(?-u)^xplat/js/react\\-native\\-github/Examples(?:/|/.*/)android/.*$",
    "(?-u)^(xplat/sonar/templates/.*|xplat/sonar/templates)$",
    "(?-u)^(xplat/unity\\-sdk/.*|xplat/unity\\-sdk)$",
    "(?-u)^(?:/?|.*/)[^/]*\\.xcodeproj/.*$",
    "(?-u)^(?:/?|.*/)\\(A Document Being Saved By Xcode\\)/.*$",
    "(?-u)^(?:/?|.*/)[^/]*___jb_bak___$",
    "(?-u)^(?:/?|.*/)[^/]*___jb_old___$",
    "(?-u)^(?:/?|.*/)[^/]*___jb_tmp___$",
    "(?-u)^(?:/?|.*/)__pycache__/.*$",
    "(?-u)^(?:/?|.*/)\\.\\#[^/]*$",
    "(?-u)^(?:/?|.*/)[^/]*\\~$",
];

mod allocator {
    use std::alloc::{GlobalAlloc, Layout, System};
    use std::sync::atomic::{AtomicUsize, Ordering};

    static TOTAL_ALLOC: AtomicUsize = AtomicUsize::new(0);
    static TOTAL_DEALLOC: AtomicUsize = AtomicUsize::new(0);

    #[global_allocator]
    static ALLOCATOR: TotalAllocator = TotalAllocator;

    struct TotalAllocator;

    unsafe impl GlobalAlloc for TotalAllocator {
        unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
            TOTAL_ALLOC.fetch_add(layout.size(), Ordering::Relaxed);
            System.alloc(layout)
        }

        unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
            TOTAL_DEALLOC.fetch_add(layout.size(), Ordering::Relaxed);
            System.dealloc(ptr, layout)
        }
    }

    pub(crate) fn reset() {
        TOTAL_ALLOC.store(0, Ordering::Relaxed);
        TOTAL_DEALLOC.store(0, Ordering::Relaxed);
    }

    pub(crate) fn tell() {
        let total_alloc = TOTAL_ALLOC.load(Ordering::Relaxed) as isize;
        let total_dealloc = TOTAL_DEALLOC.load(Ordering::Relaxed) as isize;
        eprintln!(
            "allocated: {}, retained: {}",
            total_alloc,
            total_alloc - total_dealloc,
        );
    }
}

fn main() {
    allocator::reset();

    let regex_set = RegexSet::new(REGEXES).unwrap();

    eprintln!("eager allocations:");
    allocator::tell();
    allocator::reset();

    thread::scope(|scope| {
        for _ in 0..THREADS {
            scope.spawn(|| {
                for _ in 0..10000 {
                    let path = "fbandroid/apps/instagram/fastparse";
                    let _ = regex_set.is_match(path.as_bytes());
                }
            });
        }
    });

    eprintln!("lazy allocations:");
    allocator::tell();
}

What is the actual behavior?

Regex 1.8.4:

eager allocations:
allocated: 3933520, retained: 397004
lazy allocations:
allocated: 64123704, retained: 12515536

Regex 1.9.1:

eager allocations:
allocated: 3493206, retained: 220812
lazy allocations:
allocated: 602136016, retained: 601652424

What is the expected behavior?

I am interested in knowing whether this amount of allocation is intended for the above regex, or if this might be fixable.

Buck2 is not particularly sensitive to the total amount of allocation done, but very sensitive to steady-state retained memory (with widely used RegexSet objects sitting around) and high-water mark.

@BurntSushi BurntSushi added the bug label Aug 3, 2023
BurntSushi added a commit that referenced this issue Aug 4, 2023
And this finally resolves the memory usage problem, as the PikeVM cache
used by the RegexSet in #1059 no longer allocates MBs of memory because
of the existence of impossible-to-use capturing groups.

Fixes #1059
BurntSushi added a commit to BurntSushi/ripgrep that referenced this issue Aug 4, 2023
We currently implement globs by converting them to regexes, and in doing
so, sometimes use grouping. In all but one case, we used non-capturing
groups. But for alternations, we used capturing groups, which was likely
just an oversight. We don't make use of capture groups at all, and while
they usually don't have any overhead, they lead to weird cases like this
one: rust-lang/regex#1059

That particular issue is also a bug in the regex crate itself, which is
fixed in rust-lang/regex#1062. Note though that
the bug fix in the regex crate is required. Even with this patch to
globset, memory usage is reduced (by about half in rust-lang/regex#1059)
but is not returned to where it was prior to the regex 1.9 release.
BurntSushi added a commit that referenced this issue Aug 4, 2023
And this finally resolves the memory usage problem, as the PikeVM cache
used by the RegexSet in #1059 no longer allocates MBs of memory because
of the existence of impossible-to-use capturing groups.

Fixes #1059
BurntSushi added a commit to BurntSushi/ripgrep that referenced this issue Aug 5, 2023
We currently implement globs by converting them to regexes, and in doing
so, sometimes use grouping. In all but one case, we used non-capturing
groups. But for alternations, we used capturing groups, which was likely
just an oversight. We don't make use of capture groups at all, and while
they usually don't have any overhead, they lead to weird cases like this
one: rust-lang/regex#1059

That particular issue is also a bug in the regex crate itself, which is
fixed in rust-lang/regex#1062. Note though that
the bug fix in the regex crate is required. Even with this patch to
globset, memory usage is reduced (by about half in rust-lang/regex#1059)
but is not returned to where it was prior to the regex 1.9 release.
BurntSushi added a commit that referenced this issue Aug 5, 2023
I originally prided myself on not having a dedicated `is_match` routine
on the meta regex engine's internal `Strategy` trait, and actually spent
a fair amount of attention ensuring that `is_match` and `find` always
returned the same results. That is, `is_match` returns true if and only
if `find` returns a match.

But the fix in the previous commits for #1059 means that a `PikeVM` and
a `BoundedBacktracker` can be used to run a search with an NFA that has
no capture states. Since both engines are implemented to only track
offsets via those capture states, it follows that the only thing that
can be returned in such cases is whether a match occurs (and if so,
which pattern matched). That in turn means that `is_match` can return
`true` while `find` can return `None` for the same search. This is
because the latter returns `None` even when a match is found but there
are no capture states to record the offsets of the match.

This in theory could be resolved by adding APIs to the `PikeVM` and the
`BoundedBacktracker` that return a `HalfMatch` without depending on any
capture states at all. Then `is_match` could be implemented in terms of
those APIs. That is probably the right path, but it's pretty gnarly to
do without breaking changes and I don't want to do any breaking changes
right now.

So instead, we just add a special path to the meta regex engine for
`is_match` and permit some cases to have different results between
`is_match` and `find`. Sigh.
BurntSushi added a commit that referenced this issue Aug 5, 2023
And this finally resolves the memory usage problem, as the PikeVM cache
used by the RegexSet in #1059 no longer allocates MBs of memory because
of the existence of impossible-to-use capturing groups.

Fixes #1059
BurntSushi added a commit that referenced this issue Aug 5, 2023
I originally prided myself on not having a dedicated `is_match` routine
on the meta regex engine's internal `Strategy` trait, and actually spent
a fair amount of attention ensuring that `is_match` and `find` always
returned the same results. That is, `is_match` returns true if and only
if `find` returns a match.

But the fix in the previous commits for #1059 means that a `PikeVM` and
a `BoundedBacktracker` can be used to run a search with an NFA that has
no capture states. Since both engines are implemented to only track
offsets via those capture states, it follows that the only thing that
can be returned in such cases is whether a match occurs (and if so,
which pattern matched). That in turn means that `is_match` can return
`true` while `find` can return `None` for the same search. This is
because the latter returns `None` even when a match is found but there
are no capture states to record the offsets of the match.

This in theory could be resolved by adding APIs to the `PikeVM` and the
`BoundedBacktracker` that return a `HalfMatch` without depending on any
capture states at all. Then `is_match` could be implemented in terms of
those APIs. That is probably the right path, but it's pretty gnarly to
do without breaking changes and I don't want to do any breaking changes
right now.

So instead, we just add a special path to the meta regex engine for
`is_match` and permit some cases to have different results between
`is_match` and `find`. Sigh.
@BurntSushi
Copy link
Member

This is fixed on crates.io in regex 1.9.2 and regex-automata 0.3.5.

@dtolnay
Copy link
Member Author

dtolnay commented Aug 5, 2023

Thank you greatly. I confirmed that Buck2 memory usage with regex 1.9.2 and globset 0.4.13 is at or slightly below what we had prior to the regex 1.9 update.

facebook-github-bot pushed a commit to facebook/errpy that referenced this issue Aug 5, 2023
Summary:
BurntSushi has fixed a memory usage regression introduced by regex 1.9 which caused Buck to allocate and retain significantly more memory when using moderately sized `buck2_common::ignores::ignore_set::IgnoreSet` objects concurrently from many threads.

- Bug report: **[rust-lang/regex#1059](rust-lang/regex#1059) *"1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8"***

- Globset fix: **[BurntSushi/ripgrep#25770](https://github.com/BurntSushi/ripgrep/pull/25770) *"globset: use non-capture groups in regex transform"***

- Regex fix: **[rust-lang/regex#1062](rust-lang/regex#1062) *"fix memory usage regression for RegexSet with capture groups"***

Reviewed By: zertosh

Differential Revision: D48095372

fbshipit-source-id: ec11c2bcaccbd26354d6d0a0398000134eaf3681
facebook-github-bot pushed a commit to facebook/sapling that referenced this issue Aug 5, 2023
Summary:
BurntSushi has fixed a memory usage regression introduced by regex 1.9 which caused Buck to allocate and retain significantly more memory when using moderately sized `buck2_common::ignores::ignore_set::IgnoreSet` objects concurrently from many threads.

- Bug report: **[rust-lang/regex#1059](rust-lang/regex#1059) *"1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8"***

- Globset fix: **[BurntSushi/ripgrep#25770](https://github.com/BurntSushi/ripgrep/pull/25770) *"globset: use non-capture groups in regex transform"***

- Regex fix: **[rust-lang/regex#1062](rust-lang/regex#1062) *"fix memory usage regression for RegexSet with capture groups"***

Reviewed By: zertosh

Differential Revision: D48095372

fbshipit-source-id: ec11c2bcaccbd26354d6d0a0398000134eaf3681
facebook-github-bot pushed a commit to facebook/hhvm that referenced this issue Aug 5, 2023
Summary:
BurntSushi has fixed a memory usage regression introduced by regex 1.9 which caused Buck to allocate and retain significantly more memory when using moderately sized `buck2_common::ignores::ignore_set::IgnoreSet` objects concurrently from many threads.

- Bug report: **[rust-lang/regex#1059](rust-lang/regex#1059) *"1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8"***

- Globset fix: **[BurntSushi/ripgrep#25770](https://github.com/BurntSushi/ripgrep/pull/25770) *"globset: use non-capture groups in regex transform"***

- Regex fix: **[rust-lang/regex#1062](rust-lang/regex#1062) *"fix memory usage regression for RegexSet with capture groups"***

Reviewed By: zertosh

Differential Revision: D48095372

fbshipit-source-id: ec11c2bcaccbd26354d6d0a0398000134eaf3681
facebook-github-bot pushed a commit to facebookincubator/below that referenced this issue Aug 5, 2023
Summary:
BurntSushi has fixed a memory usage regression introduced by regex 1.9 which caused Buck to allocate and retain significantly more memory when using moderately sized `buck2_common::ignores::ignore_set::IgnoreSet` objects concurrently from many threads.

- Bug report: **[rust-lang/regex#1059](rust-lang/regex#1059) *"1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8"***

- Globset fix: **[BurntSushi/ripgrep#25770](https://github.com/BurntSushi/ripgrep/pull/25770) *"globset: use non-capture groups in regex transform"***

- Regex fix: **[rust-lang/regex#1062](rust-lang/regex#1062) *"fix memory usage regression for RegexSet with capture groups"***

Reviewed By: zertosh

Differential Revision: D48095372

fbshipit-source-id: ec11c2bcaccbd26354d6d0a0398000134eaf3681
facebook-github-bot pushed a commit to facebookexperimental/rust-shed that referenced this issue Aug 5, 2023
Summary:
BurntSushi has fixed a memory usage regression introduced by regex 1.9 which caused Buck to allocate and retain significantly more memory when using moderately sized `buck2_common::ignores::ignore_set::IgnoreSet` objects concurrently from many threads.

- Bug report: **[rust-lang/regex#1059](rust-lang/regex#1059) *"1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8"***

- Globset fix: **[BurntSushi/ripgrep#25770](https://github.com/BurntSushi/ripgrep/pull/25770) *"globset: use non-capture groups in regex transform"***

- Regex fix: **[rust-lang/regex#1062](rust-lang/regex#1062) *"fix memory usage regression for RegexSet with capture groups"***

Reviewed By: zertosh

Differential Revision: D48095372

fbshipit-source-id: ec11c2bcaccbd26354d6d0a0398000134eaf3681
facebook-github-bot pushed a commit to facebook/ocamlrep that referenced this issue Aug 5, 2023
Summary:
BurntSushi has fixed a memory usage regression introduced by regex 1.9 which caused Buck to allocate and retain significantly more memory when using moderately sized `buck2_common::ignores::ignore_set::IgnoreSet` objects concurrently from many threads.

- Bug report: **[rust-lang/regex#1059](rust-lang/regex#1059) *"1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8"***

- Globset fix: **[BurntSushi/ripgrep#25770](https://github.com/BurntSushi/ripgrep/pull/25770) *"globset: use non-capture groups in regex transform"***

- Regex fix: **[rust-lang/regex#1062](rust-lang/regex#1062) *"fix memory usage regression for RegexSet with capture groups"***

Reviewed By: zertosh

Differential Revision: D48095372

fbshipit-source-id: ec11c2bcaccbd26354d6d0a0398000134eaf3681
facebook-github-bot pushed a commit to facebookincubator/reindeer that referenced this issue Aug 5, 2023
Summary:
BurntSushi has fixed a memory usage regression introduced by regex 1.9 which caused Buck to allocate and retain significantly more memory when using moderately sized `buck2_common::ignores::ignore_set::IgnoreSet` objects concurrently from many threads.

- Bug report: **[rust-lang/regex#1059](rust-lang/regex#1059) *"1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8"***

- Globset fix: **[BurntSushi/ripgrep#25770](https://github.com/BurntSushi/ripgrep/pull/25770) *"globset: use non-capture groups in regex transform"***

- Regex fix: **[rust-lang/regex#1062](rust-lang/regex#1062) *"fix memory usage regression for RegexSet with capture groups"***

Reviewed By: zertosh

Differential Revision: D48095372

fbshipit-source-id: ec11c2bcaccbd26354d6d0a0398000134eaf3681
facebook-github-bot pushed a commit to facebookexperimental/hermit that referenced this issue Aug 5, 2023
Summary:
BurntSushi has fixed a memory usage regression introduced by regex 1.9 which caused Buck to allocate and retain significantly more memory when using moderately sized `buck2_common::ignores::ignore_set::IgnoreSet` objects concurrently from many threads.

- Bug report: **[rust-lang/regex#1059](rust-lang/regex#1059) *"1.9 memory usage: globset-generated RegexSet allocates and retains 48× more memory (600MB) vs regex 1.8"***

- Globset fix: **[BurntSushi/ripgrep#25770](https://github.com/BurntSushi/ripgrep/pull/25770) *"globset: use non-capture groups in regex transform"***

- Regex fix: **[rust-lang/regex#1062](rust-lang/regex#1062) *"fix memory usage regression for RegexSet with capture groups"***

Reviewed By: zertosh

Differential Revision: D48095372

fbshipit-source-id: ec11c2bcaccbd26354d6d0a0398000134eaf3681
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants