Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OOM handling changes #50880

Merged
merged 2 commits into from
May 30, 2018
Merged

OOM handling changes #50880

merged 2 commits into from
May 30, 2018

Conversation

glandium
Copy link
Contributor

As discussed in #49668 (comment) and subsequent.

This does have codegen implications. Even without the hooks, and with a handler that ignores the arguments, the compiler doesn't eliminate calling rust_oom with the Layout. Even if it managed to eliminate that, with the hooks, I don't know if the compiler would be able to figure out it can skip it if the hook is never set.

A couple implementation notes:

  • I went with explicit enums rather than bools because it makes it clearer in callers what is being requested.
  • I didn't know what feature to put the hook setting functions behind. (and surprisingly, the compile went through without any annotation on the functions)
  • There's probably some bikeshedding to do on the naming.

Cc: @SimonSapin, @sfackler

@rust-highfive
Copy link
Collaborator

r? @bluss

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 18, 2018
@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading https://files.pythonhosted.org/packages/c2/1e/f70d1125f5bf6383d2ee7a76aea72bed6ba103c1bb9dd4ca051787552d2b/awscli-1.15.24-py2.py3-none-any.whl (1.3MB)
    0% |▎                               | 10kB 9.9MB/s eta 0:00:01
    1% |▌                               | 20kB 1.9MB/s eta 0:00:01
    2% |▉                               | 30kB 2.2MB/s eta 0:00:01
    3% |█                               | 40kB 2.0MB/s eta 0:00:01
---

[00:05:43] travis_fold:start:tidy
travis_time:start:tidy
tidy check
[00:05:44] tidy error: /checkout/src/libstd/collections/hash/map.rs:26: line longer than 100 chars
[00:05:45] some tidy checks failed
[00:05:45] 
[00:05:45] 
[00:05:45] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor" "--quiet"
[00:05:45] 
[00:05:45] 
[00:05:45] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:05:45] Build completed unsuccessfully in 0:02:30
[00:05:45] Build completed unsuccessfully in 0:02:30
[00:05:45] Makefile:79: recipe for target 'tidy' failed
[00:05:45] make: *** [tidy] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:0390c96b
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

#[doc(inline)] pub use alloc_system::System;
#[doc(inline)] pub use core::alloc::*;

use panicking::HOOK_LOCK;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason we're reusing this lock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw no particular reason not to reuse it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOM hooks and panic hooks are totally unrelated concepts :P Let's avoid the overlap.

/// The OOM hook is invoked when an infallible memory allocation fails.
/// The default hook prints a message to standard error and aborts the
/// execution, but this behavior can be customized with the [`set_hook`]
/// and [`take_hook`] functions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hook currently isn't allowed to unwind IIRC. Probably worth documenting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn’t it really? Why?

We have accepted an RFC for oom=panic for the use case of handling OOM with catch_unwind: #48043, https://github.com/rust-lang/rfcs/blob/master/text/2116-alloc-me-maybe.md#oompanic

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's definitely intended to work, but there's a comment over here claiming we die in MIR transforms if you try to actually do it: https://github.com/rust-lang/rust/blob/master/src/liballoc/alloc.rs#L104-L121

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually know why Box still needs a special lang item to allocate though - maybe we can kill off exchange_alloc entirely at this point?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe box foo syntax needs a lang item? It’s how Box::new is defined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, that'd be it. Placement new's getting removed though so we should be able to switch Box::new over to just use the global allocator like other collections do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hook currently isn't allowed to unwind IIRC. Probably worth documenting.

If the hook unwinding crashes the compiler, but the intent is for it to be allowed per the oom=panic RFC, it probably makes sense to leave any comment about unwinding out of the documentation.

}

fn default_hook(_: Layout) -> ! {
rtabort!("memory allocation failed")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well update this to something like memory allocation of {} bytes failed.

@sfackler
Copy link
Member

cc @gankro for the RawVec changes

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:start:test_run-pass
Check compiletest suite=run-pass mode=run-pass (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:49:00] 
[00:49:00] running 3007 tests
[00:49:15] ......F.............................................................................................
[00:49:45] ....................................................................................................
[00:50:00] ....................................................................................................
[00:50:13] ....................................................................................................
[00:50:33] ....................................................................................................
---
[00:53:35] ....................................................................................................
[00:53:48] ....................................................................................................
[00:54:05] ...........................................i........................................................
[00:54:25] ..........................................i.........................................................
[00:54:52] ........................................................................................test [run-pass] run-pass/mir_heavy_promoted.rs has been running for over 60 seconds
[00:54:55] .......F....
[00:55:08] .....................................................F..............................................
[00:56:11] ...........................i.ii.....................................................................
[00:56:39] ...................................iiiiiii..........................................................
[00:57:00] ....................................................................................................
[00:57:15] ....................................................................................................
---
[00:57:33] ---- [run-pass] run-pass/allocator-alloc-one.rs stdout ----
[00:57:33] 
[00:57:33] error: compilation failed!
[00:57:33] status: exit code: 101
[00:57:33] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/run-pass/allocator-alloc-one.rs" "--target=x86_64-unknown-linux-gnu" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/allocator-alloc-one/a" "-Crpath" "-O" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/allocator-alloc-one/auxiliary"
[00:57:33] ------------------------------------------
[00:57:33] 
[00:57:33] ------------------------------------------
[00:57:33] stderr:
[00:57:33] stderr:
[00:57:33] ------------------------------------------
[00:57:33] error[E0061]: this function takes 1 parameter but 0 parameters were supplied
[00:57:33]   --> /checkout/src/test/run-pass/allocator-alloc-one.rs:17:64
[00:57:33]    |
[00:57:33] 17 |         let ptr = Global.alloc_one::<i32>().unwrap_or_else(|_| oom());
[00:57:33] 
[00:57:33] error: aborting due to previous error
[00:57:33] 
[00:57:33] For more information about this error, try `rustc --explain E0061`.
[00:57:33] For more information about this error, try `rustc --explain E0061`.
[00:57:33] 
[00:5716687.rs:76:33
[00:57:33]    |
[00:57:33] 76 |             .unwrap_or_else(|_| oom());
[00:57:33] 
[00:57:33] error: aborting due to 2 previous errors
[00:57:33] 
[00:57:33] For more information about this error, try `rustc --explain E0061`.
---
[00:57:33] ---- [run-pass] run-pass/regions-mock-codegen.rs stdout ----
[00:57:33] 
[00:57:33] error: compilation failed!
[00:57:33] status: exit code: 101
[00:57:33] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/run-pass/regions-mock-codegen.rs" "--target=x86_64-unknown-linux-gnu" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/regions-mock-codegen/a" "-Crpath" "-O" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/regions-mock-codegen/auxiliary"
[00:57:33] ------------------------------------------
[00:57:33] 
[00:57:33] ------------------------------------------
[00:57:33] stderr:
[00:57:33] stderr:
[00:57:33] ------------------------------------------
[00:57:33] error[E0061]: this function takes 1 parameter but 0 parameters were supplied
[00:57:33]   --> /checkout/src/test/run-pass/regions-mock-codegen.rs:36:33
[00:57:33]    |
[00:57:33] 36 |             .unwrap_or_else(|_| oom());
[00:57:33] 
[00:57:33] error: aborting due to previous error
[00:57:33] 
[00:57:33] For more information about this error, try `rustc --explain E0061`.
---
[00:57:33] test result: FAILED. 2985 passed; 3 failed; 19 ignored; 0 measured; 0 filtered out
[00:57:33] 
[00:57:33] 
[00:57:33] 
[00:57:33] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/run-pass" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "run-pass" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-3.9/bin/FileCheck" "--host-rustcflags" "-Crpath -O -Zunstable-options " "--target-rustcflags" "-Crpath -O -Zunstable-options  -

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:start:test_run-pass
Check compiletest suite=run-pass mode=run-pass (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[00:44:19] 
[00:44:19] running 3007 tests
[00:44:32] .......F............................................................................................
[00:45:00] ....................................................................................................
[00:45:13] ....................................................................................................
[00:45:26] ....................................................................................................
[00:45:44] ....................................................................................................
---
[00:52:14] ---- [run-pass] run-pass/allocator-alloc-one.rs stdout ----
[00:52:14] 
[00:52:14] error: compilation failed!
[00:52:14] status: exit code: 101
[00:52:14] command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/run-pass/allocator-alloc-one.rs" "--target=x86_64-unknown-linux-gnu" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/allocator-alloc-one/a" "-Crpath" "-O" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-pass/allocator-alloc-one/auxiliary"
[00:52:14] ------------------------------------------
[00:52:14] 
[00:52:14] ------------------------------------------
[00:52:14] stderr:
[00:52:14] stderr:
[00:52:14] ------------------------------------------
[00:52:14] error[E0433]: failed to resolve. Use of undeclared type or module `Layout`
[00:52:14]    |
[00:52:14]    |
[00:52:14] 17 |         let ptr = Global.alloc_one::<i32>().unwrap_or_else(|_| oom(Layout::new::<i32>()));
[00:52:14]    |                                                                    ^^^^^^ Use of undeclared type or module `Layout`
[00:52:14] error: aborting due to previous error
[00:52:14] 
[00:52:14] For more information about this error, try `rustc --explain E0433`.
[00:52:14] 

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@glandium
Copy link
Contributor Author

So interestingly, unwinding does work:

#![feature(allocator_api)]

fn main() {
    std::alloc::set_hook(Box::new(|layout| { panic!("oom {}", layout.size()) }));
    let result = std::panic::catch_unwind(|| {
        let v = Vec::<u8>::with_capacity(100000000000000);
        println!("{:p}", &v[..]);
    });
    println!("{:?}", result);
}

On my machine, with rustc built with the patches here, this prints the following:

thread 'main' panicked at 'oom 100000000000000', src/main.rs:4:46
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Err(Any)

@sfackler
Copy link
Member

Ah sweet, seems like it may just be an out of date comment then!

@bors
Copy link
Contributor

bors commented May 26, 2018

☔ The latest upstream changes (presumably #51041) made this pull request unmergeable. Please resolve the merge conflicts.

@SimonSapin
Copy link
Contributor

Did you mean to change the book and llvm submodules? r=me as of ccf1c1b87b65342322e99c83e0169d5726c9d41f except for this.

@glandium
Copy link
Contributor Author

Dammit, I keep doing this on rebase.

As discussed in
rust-lang#49668 (comment)
and subsequent, there are use-cases where the OOM handler needs to know
the size of the allocation that failed. The alignment might also be a
cause for allocation failure, so providing it as well can be useful.
@Amanieu
Copy link
Member

Amanieu commented May 29, 2018

Have a look at #30801 which previously implemented OOM handler hooks (which then got removed later, somehow).

Two things stand out in particular:

  • There were concerns that rtabort! might itself allocate memory.
  • You should use an atomic instead of a RwLock, the latter might be lazily initialized and allocate memory on some platforms.

@glandium
Copy link
Contributor Author

The hooks were removed in #42727 without an explanation...

About rtabort, well, whether it allocates or not is not something that needs to be solved here, since this PR doesn't change anything there.

About the RWLock, fair enough.

@glandium
Copy link
Contributor Author

Should I restore the function names that were used before? alloc::set_hook is probably too generic of a name.

@Amanieu
Copy link
Member

Amanieu commented May 29, 2018

About rtabort, well, whether it allocates or not is not something that needs to be solved here, since this PR doesn't change anything there.

I specifically used custom implementations (unix windows) instead of rtabort to ensure that they did not allocate memory.

@glandium
Copy link
Contributor Author

I specifically used custom implementations (unix windows) instead of rtabort to ensure that they did not allocate memory.

As I said, I'm not adding the call to rtabort. It's already there. I'm merely moving it. If you feel strongly about this, please file a separate issue.

@SimonSapin
Copy link
Contributor

@bors r+

Looks good!

@bors
Copy link
Contributor

bors commented May 30, 2018

📌 Commit a4d899b has been approved by SimonSapin

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 30, 2018
@bors
Copy link
Contributor

bors commented May 30, 2018

⌛ Testing commit a4d899b with merge 4f99f37...

bors added a commit that referenced this pull request May 30, 2018
OOM handling changes

As discussed in #49668 (comment) and subsequent.

This does have codegen implications. Even without the hooks, and with a handler that ignores the arguments, the compiler doesn't eliminate calling `rust_oom` with the `Layout`. Even if it managed to eliminate that, with the hooks, I don't know if the compiler would be able to figure out it can skip it if the hook is never set.

A couple implementation notes:
- I went with explicit enums rather than bools because it makes it clearer in callers what is being requested.
- I didn't know what `feature` to put the hook setting functions behind. (and surprisingly, the compile went through without any annotation on the functions)
- There's probably some bikeshedding to do on the naming.

Cc: @SimonSapin, @sfackler
@bors
Copy link
Contributor

bors commented May 30, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: SimonSapin
Pushing 4f99f37 to master...

@bors bors merged commit a4d899b into rust-lang:master May 30, 2018
@glandium glandium deleted the oom branch May 30, 2018 23:21
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Jun 13, 2018
Bug 1458161 added a rust OOM handler based on an unstable API that was
removed in 1.27, replaced with something that didn't allow to get the
failed allocation size.

Latest 1.28 nightly (2018-06-13) has
rust-lang/rust#50880,
rust-lang/rust#51264 and
rust-lang/rust#51241 merged, which allow to
hook the OOM handler and get the failed allocation size again.

Because this is still an unstable API, we explicitly depend on strict
versions of rustc. We also explicitly error out if automation builds
end up using a rustc version that doesn't allow us to get the allocation
size for rust OOM, because we don't want that to happen without knowing.

--HG--
extra : rebase_source : 6c097151046d088cf51f4755dd69bde97bb8bd8b
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Oct 3, 2019
Bug 1458161 added a rust OOM handler based on an unstable API that was
removed in 1.27, replaced with something that didn't allow to get the
failed allocation size.

Latest 1.28 nightly (2018-06-13) has
rust-lang/rust#50880,
rust-lang/rust#51264 and
rust-lang/rust#51241 merged, which allow to
hook the OOM handler and get the failed allocation size again.

Because this is still an unstable API, we explicitly depend on strict
versions of rustc. We also explicitly error out if automation builds
end up using a rustc version that doesn't allow us to get the allocation
size for rust OOM, because we don't want that to happen without knowing.

UltraBlame original commit: 5182bca90d0609f182d5a7b6b48ed2ffbbce32c2
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Oct 3, 2019
Bug 1458161 added a rust OOM handler based on an unstable API that was
removed in 1.27, replaced with something that didn't allow to get the
failed allocation size.

Latest 1.28 nightly (2018-06-13) has
rust-lang/rust#50880,
rust-lang/rust#51264 and
rust-lang/rust#51241 merged, which allow to
hook the OOM handler and get the failed allocation size again.

Because this is still an unstable API, we explicitly depend on strict
versions of rustc. We also explicitly error out if automation builds
end up using a rustc version that doesn't allow us to get the allocation
size for rust OOM, because we don't want that to happen without knowing.

UltraBlame original commit: 5182bca90d0609f182d5a7b6b48ed2ffbbce32c2
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Oct 3, 2019
Bug 1458161 added a rust OOM handler based on an unstable API that was
removed in 1.27, replaced with something that didn't allow to get the
failed allocation size.

Latest 1.28 nightly (2018-06-13) has
rust-lang/rust#50880,
rust-lang/rust#51264 and
rust-lang/rust#51241 merged, which allow to
hook the OOM handler and get the failed allocation size again.

Because this is still an unstable API, we explicitly depend on strict
versions of rustc. We also explicitly error out if automation builds
end up using a rustc version that doesn't allow us to get the allocation
size for rust OOM, because we don't want that to happen without knowing.

UltraBlame original commit: 5182bca90d0609f182d5a7b6b48ed2ffbbce32c2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants