Skip to content

Commit

Permalink
Auto merge of #45075 - alexcrichton:inline-less, r=michaelwoerister
Browse files Browse the repository at this point in the history
rustc: Don't inline in CGUs at -O0

This commit tweaks the behavior of inlining functions into multiple codegen
units when rustc is compiling in debug mode. Today rustc will unconditionally
treat `#[inline]` functions by translating them into all codegen units that
they're needed within, marking the linkage as `internal`. This commit changes
the behavior so that in debug mode (compiling at `-O0`) rustc will instead only
translate `#[inline]` functions into *one* codegen unit, forcing all other
codegen units to reference this one copy.

The goal here is to improve debug compile times by reducing the amount of
translation that happens on behalf of multiple codegen units. It was discovered
in #44941 that increasing the number of codegen units had the adverse side
effect of increasing the overal work done by the compiler, and the suspicion
here was that the compiler was inlining, translating, and codegen'ing more
functions with more codegen units (for example `String` would be basically
inlined into all codegen units if used). The strategy in this commit should
reduce the cost of `#[inline]` functions to being equivalent to one codegen
unit, which is only translating and codegen'ing inline functions once.

Collected [data] shows that this does indeed improve the situation from [before]
as the overall cpu-clock time increases at a much slower rate and when pinned to
one core rustc does not consume significantly more wall clock time than with one
codegen unit.

One caveat of this commit is that the symbol names for inlined functions that
are only translated once needed some slight tweaking. These inline functions
could be translated into multiple crates and we need to make sure the symbols
don't collideA so the crate name/disambiguator is mixed in to the symbol name
hash in these situations.

[data]: #44941 (comment)
[before]: #44941 (comment)
  • Loading branch information
bors committed Oct 9, 2017
2 parents d7acd29 + 4b2bdf7 commit 72d6501
Show file tree
Hide file tree
Showing 21 changed files with 202 additions and 83 deletions.
2 changes: 2 additions & 0 deletions src/librustc/session/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1100,6 +1100,8 @@ options! {DebuggingOptions, DebuggingSetter, basic_debugging_options,
"generate a graphical HTML report of time spent in trans and LLVM"),
thinlto: bool = (false, parse_bool, [TRACKED],
"enable ThinLTO when possible"),
inline_in_all_cgus: Option<bool> = (None, parse_opt_bool, [TRACKED],
"control whether #[inline] functions are in all cgus"),
}

pub fn default_lib_output() -> CrateType {
Expand Down
54 changes: 38 additions & 16 deletions src/librustc_trans/back/symbol_names.rs
Original file line number Diff line number Diff line change
Expand Up @@ -98,8 +98,10 @@
//! DefPaths which are much more robust in the face of changes to the code base.
use monomorphize::Instance;
use trans_item::{TransItemExt, InstantiationMode};

use rustc::middle::weak_lang_items;
use rustc::middle::trans::TransItem;
use rustc::hir::def_id::DefId;
use rustc::hir::map as hir_map;
use rustc::ty::{self, Ty, TyCtxt, TypeFoldable};
Expand Down Expand Up @@ -150,7 +152,10 @@ pub fn provide(providers: &mut Providers) {
fn get_symbol_hash<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,

// the DefId of the item this name is for
def_id: Option<DefId>,
def_id: DefId,

// instance this name will be for
instance: Instance<'tcx>,

// type of the item, without any generic
// parameters substituted; this is
Expand All @@ -160,7 +165,7 @@ fn get_symbol_hash<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,

// values for generic type parameters,
// if any.
substs: Option<&'tcx Substs<'tcx>>)
substs: &'tcx Substs<'tcx>)
-> u64 {
debug!("get_symbol_hash(def_id={:?}, parameters={:?})", def_id, substs);

Expand All @@ -170,7 +175,7 @@ fn get_symbol_hash<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
// the main symbol name is not necessarily unique; hash in the
// compiler's internal def-path, guaranteeing each symbol has a
// truly unique path
hasher.hash(def_id.map(|def_id| tcx.def_path_hash(def_id)));
hasher.hash(tcx.def_path_hash(def_id));

// Include the main item-type. Note that, in this case, the
// assertions about `needs_subst` may not hold, but this item-type
Expand All @@ -186,19 +191,36 @@ fn get_symbol_hash<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
}

// also include any type parameters (for generic items)
if let Some(substs) = substs {
assert!(!substs.has_erasable_regions());
assert!(!substs.needs_subst());
substs.visit_with(&mut hasher);

// If this is an instance of a generic function, we also hash in
// the ID of the instantiating crate. This avoids symbol conflicts
// in case the same instances is emitted in two crates of the same
// project.
if substs.types().next().is_some() {
hasher.hash(tcx.crate_name.as_str());
hasher.hash(tcx.sess.local_crate_disambiguator().as_str());
assert!(!substs.has_erasable_regions());
assert!(!substs.needs_subst());
substs.visit_with(&mut hasher);

let mut avoid_cross_crate_conflicts = false;

// If this is an instance of a generic function, we also hash in
// the ID of the instantiating crate. This avoids symbol conflicts
// in case the same instances is emitted in two crates of the same
// project.
if substs.types().next().is_some() {
avoid_cross_crate_conflicts = true;
}

// If we're dealing with an instance of a function that's inlined from
// another crate but we're marking it as globally shared to our
// compliation (aka we're not making an internal copy in each of our
// codegen units) then this symbol may become an exported (but hidden
// visibility) symbol. This means that multiple crates may do the same
// and we want to be sure to avoid any symbol conflicts here.
match TransItem::Fn(instance).instantiation_mode(tcx) {
InstantiationMode::GloballyShared { may_conflict: true } => {
avoid_cross_crate_conflicts = true;
}
_ => {}
}

if avoid_cross_crate_conflicts {
hasher.hash(tcx.crate_name.as_str());
hasher.hash(tcx.sess.local_crate_disambiguator().as_str());
}
});

Expand Down Expand Up @@ -309,7 +331,7 @@ fn compute_symbol_name<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, instance: Instance
// and should not matter anyhow.
let instance_ty = tcx.erase_regions(&instance_ty);

let hash = get_symbol_hash(tcx, Some(def_id), instance_ty, Some(substs));
let hash = get_symbol_hash(tcx, def_id, instance, instance_ty, substs);

SymbolPathBuffer::from_interned(tcx.def_symbol_name(def_id)).finish(hash)
}
Expand Down
6 changes: 3 additions & 3 deletions src/librustc_trans/collector.rs
Original file line number Diff line number Diff line change
Expand Up @@ -401,9 +401,9 @@ fn collect_items_rec<'a, 'tcx: 'a>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
}

fn record_accesses<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
caller: TransItem<'tcx>,
callees: &[TransItem<'tcx>],
inlining_map: &mut InliningMap<'tcx>) {
caller: TransItem<'tcx>,
callees: &[TransItem<'tcx>],
inlining_map: &mut InliningMap<'tcx>) {
let is_inlining_candidate = |trans_item: &TransItem<'tcx>| {
trans_item.instantiation_mode(tcx) == InstantiationMode::LocalCopy
};
Expand Down
113 changes: 56 additions & 57 deletions src/librustc_trans/partitioning.rs
Original file line number Diff line number Diff line change
Expand Up @@ -279,75 +279,74 @@ fn place_root_translation_items<'a, 'tcx, I>(tcx: TyCtxt<'a, 'tcx, 'tcx>,
let mut internalization_candidates = FxHashSet();

for trans_item in trans_items {
let is_root = trans_item.instantiation_mode(tcx) == InstantiationMode::GloballyShared;
match trans_item.instantiation_mode(tcx) {
InstantiationMode::GloballyShared { .. } => {}
InstantiationMode::LocalCopy => continue,
}

if is_root {
let characteristic_def_id = characteristic_def_id_of_trans_item(tcx, trans_item);
let is_volatile = is_incremental_build &&
trans_item.is_generic_fn();
let characteristic_def_id = characteristic_def_id_of_trans_item(tcx, trans_item);
let is_volatile = is_incremental_build &&
trans_item.is_generic_fn();

let codegen_unit_name = match characteristic_def_id {
Some(def_id) => compute_codegen_unit_name(tcx, def_id, is_volatile),
None => Symbol::intern(FALLBACK_CODEGEN_UNIT).as_str(),
};
let codegen_unit_name = match characteristic_def_id {
Some(def_id) => compute_codegen_unit_name(tcx, def_id, is_volatile),
None => Symbol::intern(FALLBACK_CODEGEN_UNIT).as_str(),
};

let make_codegen_unit = || {
CodegenUnit::new(codegen_unit_name.clone())
};
let make_codegen_unit = || {
CodegenUnit::new(codegen_unit_name.clone())
};

let codegen_unit = codegen_units.entry(codegen_unit_name.clone())
.or_insert_with(make_codegen_unit);

let (linkage, visibility) = match trans_item.explicit_linkage(tcx) {
Some(explicit_linkage) => (explicit_linkage, Visibility::Default),
None => {
match trans_item {
TransItem::Fn(ref instance) => {
let visibility = match instance.def {
InstanceDef::Item(def_id) => {
if def_id.is_local() {
if tcx.is_exported_symbol(def_id) {
Visibility::Default
} else {
internalization_candidates.insert(trans_item);
Visibility::Hidden
}
let codegen_unit = codegen_units.entry(codegen_unit_name.clone())
.or_insert_with(make_codegen_unit);

let (linkage, visibility) = match trans_item.explicit_linkage(tcx) {
Some(explicit_linkage) => (explicit_linkage, Visibility::Default),
None => {
match trans_item {
TransItem::Fn(ref instance) => {
let visibility = match instance.def {
InstanceDef::Item(def_id) => {
if def_id.is_local() {
if tcx.is_exported_symbol(def_id) {
Visibility::Default
} else {
internalization_candidates.insert(trans_item);
Visibility::Hidden
}
} else {
Visibility::Hidden
}
InstanceDef::FnPtrShim(..) |
InstanceDef::Virtual(..) |
InstanceDef::Intrinsic(..) |
InstanceDef::ClosureOnceShim { .. } |
InstanceDef::DropGlue(..) |
InstanceDef::CloneShim(..) => {
bug!("partitioning: Encountered unexpected
root translation item: {:?}",
trans_item)
}
};
(Linkage::External, visibility)
}
TransItem::Static(node_id) |
TransItem::GlobalAsm(node_id) => {
let def_id = tcx.hir.local_def_id(node_id);
let visibility = if tcx.is_exported_symbol(def_id) {
Visibility::Default
} else {
internalization_candidates.insert(trans_item);
}
InstanceDef::FnPtrShim(..) |
InstanceDef::Virtual(..) |
InstanceDef::Intrinsic(..) |
InstanceDef::ClosureOnceShim { .. } |
InstanceDef::DropGlue(..) |
InstanceDef::CloneShim(..) => {
Visibility::Hidden
};
(Linkage::External, visibility)
}
}
};
(Linkage::External, visibility)
}
TransItem::Static(node_id) |
TransItem::GlobalAsm(node_id) => {
let def_id = tcx.hir.local_def_id(node_id);
let visibility = if tcx.is_exported_symbol(def_id) {
Visibility::Default
} else {
Visibility::Hidden
};
(Linkage::External, visibility)
}
}
};

codegen_unit.items_mut().insert(trans_item, (linkage, visibility));
roots.insert(trans_item);
}
};
if visibility == Visibility::Hidden {
internalization_candidates.insert(trans_item);
}

codegen_unit.items_mut().insert(trans_item, (linkage, visibility));
roots.insert(trans_item);
}

// always ensure we have at least one CGU; otherwise, if we have a
Expand Down
37 changes: 32 additions & 5 deletions src/librustc_trans/trans_item.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ use monomorphize::Instance;
use rustc::hir;
use rustc::hir::def_id::DefId;
use rustc::middle::trans::{Linkage, Visibility};
use rustc::session::config::OptLevel;
use rustc::traits;
use rustc::ty::{self, Ty, TyCtxt, TypeFoldable};
use rustc::ty::subst::{Subst, Substs};
Expand All @@ -44,7 +45,20 @@ pub use rustc::middle::trans::TransItem;
pub enum InstantiationMode {
/// There will be exactly one instance of the given TransItem. It will have
/// external linkage so that it can be linked to from other codegen units.
GloballyShared,
GloballyShared {
/// In some compilation scenarios we may decide to take functions that
/// are typically `LocalCopy` and instead move them to `GloballyShared`
/// to avoid translating them a bunch of times. In this situation,
/// however, our local copy may conflict with other crates also
/// inlining the same function.
///
/// This flag indicates that this situation is occuring, and informs
/// symbol name calculation that some extra mangling is needed to
/// avoid conflicts. Note that this may eventually go away entirely if
/// ThinLTO enables us to *always* have a globally shared instance of a
/// function within one crate's compilation.
may_conflict: bool,
},

/// Each codegen unit containing a reference to the given TransItem will
/// have its own private copy of the function (with internal linkage).
Expand Down Expand Up @@ -154,18 +168,31 @@ pub trait TransItemExt<'a, 'tcx>: fmt::Debug {
fn instantiation_mode(&self,
tcx: TyCtxt<'a, 'tcx, 'tcx>)
-> InstantiationMode {
let inline_in_all_cgus =
tcx.sess.opts.debugging_opts.inline_in_all_cgus.unwrap_or_else(|| {
tcx.sess.opts.optimize != OptLevel::No
});

match *self.as_trans_item() {
TransItem::Fn(ref instance) => {
if self.explicit_linkage(tcx).is_none() &&
common::requests_inline(tcx, instance)
{
InstantiationMode::LocalCopy
if inline_in_all_cgus {
InstantiationMode::LocalCopy
} else {
InstantiationMode::GloballyShared { may_conflict: true }
}
} else {
InstantiationMode::GloballyShared
InstantiationMode::GloballyShared { may_conflict: false }
}
}
TransItem::Static(..) => InstantiationMode::GloballyShared,
TransItem::GlobalAsm(..) => InstantiationMode::GloballyShared,
TransItem::Static(..) => {
InstantiationMode::GloballyShared { may_conflict: false }
}
TransItem::GlobalAsm(..) => {
InstantiationMode::GloballyShared { may_conflict: false }
}
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@

// ignore-tidy-linelength
// compile-flags:-Zprint-trans-items=eager
// compile-flags:-Zinline-in-all-cgus

//~ TRANS_ITEM fn core::ptr[0]::drop_in_place[0]<drop_in_place_intrinsic::StructWithDtor[0]> @@ drop_in_place_intrinsic0[Internal]
struct StructWithDtor(u32);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@

// ignore-tidy-linelength
// compile-flags:-Zprint-trans-items=eager
// compile-flags:-Zinline-in-all-cgus

#![deny(dead_code)]

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@

// ignore-tidy-linelength
// compile-flags:-Zprint-trans-items=eager
// compile-flags:-Zinline-in-all-cgus

#![deny(dead_code)]

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@

// ignore-tidy-linelength
// compile-flags:-Zprint-trans-items=eager
// compile-flags:-Zinline-in-all-cgus

#![deny(dead_code)]

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@

// ignore-tidy-linelength
// compile-flags:-Zprint-trans-items=eager
// compile-flags:-Zinline-in-all-cgus

#![deny(dead_code)]

Expand Down
1 change: 1 addition & 0 deletions src/test/codegen-units/item-collection/tuple-drop-glue.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@

// ignore-tidy-linelength
// compile-flags:-Zprint-trans-items=eager
// compile-flags:-Zinline-in-all-cgus

#![deny(dead_code)]

Expand Down
1 change: 1 addition & 0 deletions src/test/codegen-units/item-collection/unsizing.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@

// ignore-tidy-linelength
// compile-flags:-Zprint-trans-items=eager
// compile-flags:-Zinline-in-all-cgus

#![deny(dead_code)]
#![feature(coerce_unsized)]
Expand Down
1 change: 1 addition & 0 deletions src/test/codegen-units/partitioning/extern-drop-glue.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
// We specify -Z incremental here because we want to test the partitioning for
// incremental compilation
// compile-flags:-Zprint-trans-items=lazy -Zincremental=tmp/partitioning-tests/extern-drop-glue
// compile-flags:-Zinline-in-all-cgus

#![allow(dead_code)]
#![crate_type="lib"]
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
// We specify -Z incremental here because we want to test the partitioning for
// incremental compilation
// compile-flags:-Zprint-trans-items=lazy -Zincremental=tmp/partitioning-tests/inlining-from-extern-crate
// compile-flags:-Zinline-in-all-cgus

#![crate_type="lib"]

Expand Down
1 change: 1 addition & 0 deletions src/test/codegen-units/partitioning/local-drop-glue.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
// We specify -Z incremental here because we want to test the partitioning for
// incremental compilation
// compile-flags:-Zprint-trans-items=lazy -Zincremental=tmp/partitioning-tests/local-drop-glue
// compile-flags:-Zinline-in-all-cgus

#![allow(dead_code)]
#![crate_type="lib"]
Expand Down
Loading

0 comments on commit 72d6501

Please sign in to comment.