Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for MIR simplification opportunities #111442

Open
1 of 14 tasks
cjgillot opened this issue May 10, 2023 · 9 comments
Open
1 of 14 tasks

Tracking issue for MIR simplification opportunities #111442

cjgillot opened this issue May 10, 2023 · 9 comments
Labels
A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such

Comments

@cjgillot
Copy link
Contributor

cjgillot commented May 10, 2023

This issue is designed to be a list of ideas to simplify MIR during optimization passes. Contributors are welcome to enhance this list.

Note: not all optimizations on this list are desirable. Some are put there to track why they were not implemented.


Instruction simplification

Avoid calling Index::index for slices of arrays and subslices, and replace if with a direct projection:

  • slice[constant1..constant2] to use Subslice { from: constant1, to: constant2, from_end: false};
  • slice[..constant2] to use Subslice { from: 0, to: constant2, from_end: false};
  • slice[constant1..] to use Subslice { from: constant1, to: 0, from_end: true};
  • slice[..] to directly reuse slice;
  • slice[constant1..slice.len() - constant2] to use Subslice { from: constant1, to: constant2, from_end: true};
  • slice[..slice.len() - constant2] to use Subslice { from: 0, to: constant2, from_end: true}.
  • replace manual pointer offset computations with Index projection Tracking issue for MIR simplification opportunities #111442 (comment)

Simple operations:

Aggregates:

Perform simplifications listed in clippy's perf lints #111442 (comment).

Compile-time evaluation

Control-flow graph

@cjgillot cjgillot added C-enhancement Category: An issue proposing an enhancement or a PR with one. A-mir-opt Area: MIR optimizations labels May 10, 2023
@saethlin
Copy link
Member

saethlin commented May 11, 2023

Is this the issue where I put all the examples of goofy MIR I see?

Just in case, I'll start with this: https://godbolt.org/z/bPPnKrnPe

pub fn ouch() {
    panic!("ow")
}
fn ouch() -> () {
    let mut _0: ();
    let mut _1: !;
    let mut _2: std::fmt::Arguments<'_>;
    let mut _3: &[&str];
    let mut _4: &[&str; 1];
    scope 1 (inlined Arguments::<'_>::new_const) {
        debug pieces => _3;
        let mut _5: std::option::Option<&[core::fmt::rt::Placeholder]>;
        let mut _6: &[core::fmt::rt::Argument<'_>];
        let mut _7: &[core::fmt::rt::Argument<'_>; 0];
    }

    bb0: {
        _4 = const _;
        _3 = _4 as &[&str] (Pointer(Unsize));
        _5 = Option::<&[core::fmt::rt::Placeholder]>::None;
        _7 = const _;
        _6 = _7 as &[core::fmt::rt::Argument<'_>] (Pointer(Unsize));
        _2 = Arguments::<'_> { pieces: _3, fmt: move _5, args: move _6 };
        _1 = panic_fmt(move _2);
    }
}

If I remove the Ty::is_scalar check from DataflowConstProp this example gets 2 statements removed, but it could theoretically be reduced to just a call to panic_fmt being passed an Operand::Const. I think the blocker is related to doing const propagation from Aggregates to Aggregates.

@saethlin
Copy link
Member

Comparing integers should be easy, but it isn't:

pub fn cmp(x: &u8, y: &u8) -> bool {
    x < y
}
fn cmp(_1: &u8, _2: &u8) -> bool {
    debug x => _1;
    debug y => _2;
    let mut _0: bool;
    let mut _3: &&u8;
    let mut _4: &&u8;
    let _5: &u8;
    scope 1 (inlined cmp::impls::<impl PartialOrd for &u8>::lt) {
        debug self => _3;
        debug other => _4;
        let mut _6: &u8;
        let mut _7: &u8;
        scope 2 (inlined cmp::impls::<impl PartialOrd for u8>::lt) {
            debug self => _6;
            debug other => _7;
            let mut _8: u8;
            let mut _9: u8;
        }
    }

    bb0: {
        _3 = &_1;
        _5 = _2;
        _4 = &_5;
        _6 = deref_copy (*_3);
        _7 = deref_copy (*_4);
        _8 = (*_6);
        _9 = (*_7);
        _0 = Lt(move _8, move _9);
        return;
    }
}

We should be able to emit just

bb0: {
    _0 = Lt(*_1, *_2);
    return;
}

@matthiaskrgr
Copy link
Member

Clippy also has a couple of peformance lints, might be worth looking into if we can turn some of them into mir mutations and downgrade the respective lints to style or something.

https://rust-lang.github.io/rust-clippy/master/index.html (Lint groups: filter for "perf")

for example:
converting

pub fn split(s: String) {
   let x = s.split("a");
}

to split by a char.

pub fn split(s: String) {
   let x = s.split('a');
}
--- a_string.mir	2023-05-14 11:38:15.942618912 +0200
+++ a.mir	2023-05-14 11:38:23.439260694 +0200
@@ -3,7 +3,7 @@
 fn split(_1: String) -> () {
     debug s => _1;                       // in scope 0 at a.rs:3:14: 3:15
     let mut _0: ();                      // return place in scope 0 at a.rs:3:25: 3:25
-    let _2: std::str::Split<'_, &str>;   // in scope 0 at a.rs:4:9: 4:10
+    let _2: std::str::Split<'_, char>;   // in scope 0 at a.rs:4:9: 4:10
     let mut _3: &str;                    // in scope 0 at a.rs:4:13: 4:25
     let _4: &str;                        // in scope 0 at a.rs:4:13: 4:25
     let mut _5: &std::string::String;    // in scope 0 at a.rs:4:13: 4:25
@@ -21,13 +21,10 @@

     bb1: {
         _3 = _4;                         // scope 0 at a.rs:4:13: 4:25
-        _2 = core::str::<impl str>::split::<'_, &str>(move _3, const "a") -> [return: bb2, unwind: bb4]; // scope 0 at a.rs:4:13: 4:25
+        _2 = core::str::<impl str>::split::<'_, char>(move _3, const 'a') -> [return: bb2, unwind: bb4]; // scope 0 at a.rs:4:13: 4:25
                                          // mir::Constant
                                          // + span: a.rs:4:15: 4:20
-                                         // + literal: Const { ty: fn(&str, &str) -> std::str::Split<'_, &str> {core::str::<impl str>::split::<'_, &str>}, val: Value(<ZST>) }
-                                         // mir::Constant
-                                         // + span: a.rs:4:21: 4:24
-                                         // + literal: Const { ty: &str, val: Value(Slice(..)) }
+                                         // + literal: Const { ty: fn(&str, char) -> std::str::Split<'_, char> {core::str::<impl str>::split::<'_, char>}, val: Value(<ZST>) }
     }

     bb2: {

@scottmcm
Copy link
Member

scottmcm commented May 14, 2023

Someone might have already started on a pass for this one, but I hit a perfect exemplar of it. (EDIT: #107009 maybe?)

After MIR inlining, it's very common for there to be two basic blocks that set a discriminant which both goto one that immediately switches on that same discriminant.

For example, checked_ilog2 is

        pub const fn checked_ilog2(self) -> Option<u32> {
            if let Some(x) = <$NonZeroT>::new(self) {
                Some(x.ilog2())
            } else {
                None
            }
        }

So in MIR that ends up https://rust.godbolt.org/z/z8Eo1jvjc

    bb5: {
        StorageLive(_6);                 // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:82:30: 82:48
        _6 = NonZeroU32(_1);             // scope 4 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:82:39: 82:46
        _2 = Option::<NonZeroU32>::Some(move _6); // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:82:25: 82:49
        StorageDead(_6);                 // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:82:48: 82:49
        goto -> bb1;                     // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:80:21: 85:22
    }

    bb6: {
        _2 = const Option::<NonZeroU32>::None; // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:84:25: 84:29
        goto -> bb1;                     // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:80:21: 85:22
    }

    bb1: {
        _3 = discriminant(_2);           // scope 2 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/uint_macros.rs:845:20: 845:27
        switchInt(move _3) -> [1: bb2, otherwise: bb3]; // scope 2 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/uint_macros.rs:845:20: 845:27
    }

So it would be great to skip that step, getting just

    bb5: {
        StorageLive(_6);                 // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:82:30: 82:48
        _6 = NonZeroU32(_1);             // scope 4 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:82:39: 82:46
        _2 = Option::<NonZeroU32>::Some(move _6); // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:82:25: 82:49
        StorageDead(_6);                 // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:82:48: 82:49
        goto -> bb2;                     // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:80:21: 85:22
    }

    bb6: {
        _2 = const Option::<NonZeroU32>::None; // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:84:25: 84:29
        goto -> bb3;                     // scope 3 at /rustc/2c41369acc445d04129db40ba998dd7a89fb0d2e/library/core/src/num/nonzero.rs:80:21: 85:22
    }

(And thus probably letting SimplifyCfg then collapse more blocks because of the gotos and probably remove that bb1 entirely.)

@scottmcm
Copy link
Member

As a follow-up to the previous one, then it would be good to remove the make-variant-then-extract-value pattern from it.

For example, https://github.com/rust-lang/rust/pull/111553/files#diff-17f803bde55e41d6de373d5b0333848821d7bd01bccada3de9babd3082f783a5R104-R106 has the following:

        _14 = Result::<u32, Infallible>::Ok(_12); // scope 11 at $SRC_DIR/core/src/convert/mod.rs:LL:COL
        _15 = move ((_14 as Ok).0: u32); // scope 12 at $SRC_DIR/core/src/result.rs:LL:COL

Which doesn't need the temporary at all and can just be

        _15 = _12;

@saethlin
Copy link
Member

I think that case could be handled by InstCombine similar to combine_place_projection.

@scottmcm
Copy link
Member

This comment is roughly the same as #111442 (comment) + #111442 (comment), but reading PR #95198 made me think of it because some of the methods implementations could be simplified to wrappers of other ones to avoid some unsafe, but the MIR doesn't optimize down, so without that it's hard to say whether the safe version is better.

For example, https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=d16f00639736c5494b373afafebab3f6

pub fn demo<T>(x: &[T]) -> Option<&T> {
    Some(x.split_first()?.0)
}

currently has three switchInts, and it looks feasible to be able to simplify it down to the one necessary one.

@scottmcm
Copy link
Member

Inspired by the note in the OP about slices,

pub fn demo<T>(x: &[T], i: usize) -> Option<&T> {
    x.get(i)
}

currently involves https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=827f469245f2580d549fd198bfaf66ad

        _7 = &raw const (*_1);
        _8 = _7 as *const T (PtrToPtr);
        _6 = Offset(_8, _2);
        _5 = &(*_6);

which I think could be collapsed to

        _5 = &(*_1)[_2]

@scottmcm
Copy link
Member

scottmcm commented Jun 1, 2023

From https://github.com/rust-lang/rust/pull/111553/files#diff-befce8f95e24ad68a8d50bb771d171fb7784b05cb3ed60ea9d2c9b3ce61a86f6R49-R53,

        switchInt(move _7) -> [0: bb3, otherwise: bb2];
    }

    bb2: {
        assert(!const true, "attempt to compute `{} + {}`, which would overflow", const _, const 1_u32) -> bb3;

could, since _7 is a bool, be simplified to

        assert(!_7, "attempt to compute `{} + {}`, which would overflow", const _, const 1_u32) -> bb3;
    }

(And in general could be simplified by introducing a local and comparison, instead of needing the extra BB.)

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such
Projects
None yet
Development

No branches or pull requests

5 participants