Skip to content

Commit

Permalink
C++ no longer leak arrow headers into our own headers, include minima…
Browse files Browse the repository at this point in the history
…l set (#3041)

* Part of #2919
* Fixes #2873

### What

We now use our own status type everywhere, allowing us to predeclare
everything arrow.
On top of that I made sure that we include a much more minimal set of
arrow headers for the serialization code itself.

There's a static assertion in the unit test that ensures that we don't
start leaking arrow headers into the future

Haven't measured, but it feels like this also improved compile times of
the sdk which were already pretty good.

-----------

The one thing that's noted in #2873 but is not happening here yet (?) is
splitting out the utility methods. The idea would be this (actual
generated code I already have on a test branch!) :
```cpp
        struct LineStrip2D {
// .........
            /// Creates a Rerun DataCell from an array of LineStrip2D components.
            static Result<rerun::DataCell> to_data_cell(
                const LineStrip2D* instances, size_t num_instances
            );
        };

        /// Serialization utilities for arrays of LineStrip2D
        namespace LineStrip2DSerializationUtils {
            /// Returns the arrow data type this type corresponds to.
            static const std::shared_ptr<arrow::DataType>& to_arrow_datatype();

            /// Creates a new array builder with an array of this type.
            static Result<std::shared_ptr<arrow::ListBuilder>> new_arrow_array_builder(
                arrow::MemoryPool* memory_pool
            );

            /// Fills an arrow array builder with an array of this type.
            static Error fill_arrow_array_builder(
                arrow::ListBuilder* builder, const LineStrip2D* elements, size_t num_elements
            );
        } // namespace LineStrip2DSerializationUtils
```

The problem is that this makes calling these helpers unnecessary
complicated sicne I need to append `SerializationUtils` to the typename
of some previously static method calls which is suprisingly cumbersome 🤔
EDIT: Briefly chatted with Emil about it. Let's do it, it's actually not
that hard

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3041) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3041)
- [Docs
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fdont-leak-arrow-headers/docs)
- [Examples
preview](https://rerun.io/preview/pr%3Aandreas%2Fcpp%2Fdont-leak-arrow-headers/examples)
- [Recent benchmark results](https://ref.rerun.io/dev/bench/)
- [Wasm size tracking](https://ref.rerun.io/dev/sizes/)
  • Loading branch information
Wumpf authored Aug 19, 2023
1 parent de75d3a commit a55e884
Show file tree
Hide file tree
Showing 157 changed files with 2,336 additions and 1,424 deletions.
2 changes: 1 addition & 1 deletion crates/re_types/source_hash.txt

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

206 changes: 206 additions & 0 deletions crates/re_types_builder/src/codegen/cpp/array_builder.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,206 @@
use proc_macro2::{Ident, TokenStream};
use quote::{format_ident, quote};

use crate::{Object, ObjectSpecifics, Objects, Type};

use super::{
forward_decl::{ForwardDecl, ForwardDecls},
includes::Includes,
quote_fqname_as_type_path, quote_integer,
};

pub fn arrow_array_builder_type(typ: &Type, objects: &Objects) -> Ident {
arrow_array_builder_type_and_declaration(typ, objects, &mut ForwardDecls::default())
}

fn arrow_array_builder_type_and_declaration(
typ: &Type,
objects: &Objects,
declarations: &mut ForwardDecls,
) -> Ident {
match typ {
Type::Int8
| Type::Int16
| Type::Int32
| Type::Int64
| Type::UInt8
| Type::UInt16
| Type::UInt32
| Type::UInt64
| Type::Float16
| Type::Float32
| Type::Float64 => {
let klass = match typ {
Type::Int8 => "Int8",
Type::Int16 => "Int16",
Type::Int32 => "Int32",
Type::Int64 => "Int64",
Type::UInt8 => "UInt8",
Type::UInt16 => "UInt16",
Type::UInt32 => "UInt32",
Type::UInt64 => "UInt64",
Type::Float16 => "Float16",
Type::Float32 => "Float",
Type::Float64 => "Float64",
_ => {
unreachable!();
}
};
let klass_type = format_ident!("{klass}Type");

declarations.insert(
"arrow",
ForwardDecl::TemplateClass(format_ident!("NumericBuilder")),
);
declarations.insert("arrow", ForwardDecl::Class(klass_type.clone()));

let ident = format_ident!("{klass}Builder");
declarations.insert(
"arrow",
ForwardDecl::Alias {
from: ident.clone(),
to: quote!(NumericBuilder<#klass_type>),
},
);
ident
}
Type::String => {
let ident = format_ident!("StringBuilder");
declarations.insert("arrow", ForwardDecl::Class(ident.clone()));
ident
}
Type::Bool => {
let ident = format_ident!("BooleanBuilder");
declarations.insert("arrow", ForwardDecl::Class(ident.clone()));
ident
}
Type::Array { .. } => {
let ident = format_ident!("FixedSizeListBuilder");
declarations.insert("arrow", ForwardDecl::Class(ident.clone()));
ident
}
Type::Vector { .. } => {
let ident = format_ident!("ListBuilder");
declarations.insert("arrow", ForwardDecl::Class(ident.clone()));
ident
}
Type::Object(fqname) => {
arrow_array_builder_type_object(&objects[fqname], objects, declarations)
}
}
}

pub fn arrow_array_builder_type_object(
obj: &Object,
objects: &Objects,
declarations: &mut ForwardDecls,
) -> Ident {
if obj.is_arrow_transparent() {
arrow_array_builder_type_and_declaration(&obj.fields[0].typ, objects, declarations)
} else {
let class_ident = match obj.specifics {
ObjectSpecifics::Struct => format_ident!("StructBuilder"),
ObjectSpecifics::Union { .. } => format_ident!("DenseUnionBuilder"),
};

declarations.insert("arrow", ForwardDecl::Class(class_ident.clone()));
class_ident
}
}

pub fn quote_arrow_array_builder_type_instantiation(
typ: &Type,
objects: &Objects,
cpp_includes: &mut Includes,
is_top_level_type: bool,
) -> TokenStream {
let builder_type = arrow_array_builder_type(typ, objects);

match typ {
Type::UInt8
| Type::UInt16
| Type::UInt32
| Type::UInt64
| Type::Int8
| Type::Int16
| Type::Int32
| Type::Int64
| Type::Bool
| Type::Float16
| Type::Float32
| Type::Float64
| Type::String => {
quote!(std::make_shared<arrow::#builder_type>(memory_pool))
}
Type::Vector { elem_type } => {
let element_builder = quote_arrow_array_builder_type_instantiation(
&elem_type.clone().into(),
objects,
cpp_includes,
false,
);
quote!(std::make_shared<arrow::#builder_type>(memory_pool, #element_builder))
}
Type::Array { elem_type, length } => {
let quoted_length = quote_integer(length);
let element_builder = quote_arrow_array_builder_type_instantiation(
&elem_type.clone().into(),
objects,
cpp_includes,
false,
);
quote!(std::make_shared<arrow::#builder_type>(memory_pool, #element_builder, #quoted_length))
}
Type::Object(fqname) => {
let object = &objects[fqname];

if !is_top_level_type {
// Propagating error here is hard since we're in a nested context.
// But also not that important since we *know* that this only fails for null pools and we already checked that now.
// For the unlikely broken case, Rerun result will give us a nullptr which will then
// fail the subsequent actions inside arrow, so the error will still propagate.
let quoted_fqname = quote_fqname_as_type_path(cpp_includes, fqname);
quote!(#quoted_fqname::new_arrow_array_builder(memory_pool).value)
} else if object.is_arrow_transparent() {
quote_arrow_array_builder_type_instantiation(
&object.fields[0].typ,
objects,
cpp_includes,
false,
)
} else {
let field_builders = object.fields.iter().map(|field| {
quote_arrow_array_builder_type_instantiation(
&field.typ,
objects,
cpp_includes,
false,
)
});

match object.specifics {
ObjectSpecifics::Struct => {
quote! {
std::make_shared<arrow::#builder_type>(
to_arrow_datatype(),
memory_pool,
std::vector<std::shared_ptr<arrow::ArrayBuilder>>({ #(#field_builders,)* })
)
}
}
ObjectSpecifics::Union { .. } => {
quote! {
std::make_shared<arrow::#builder_type>(
memory_pool,
std::vector<std::shared_ptr<arrow::ArrayBuilder>>({
std::make_shared<arrow::NullBuilder>(memory_pool), #(#field_builders,)*
}),
to_arrow_datatype()
)
}
}
}
}
}
}
}
71 changes: 60 additions & 11 deletions crates/re_types_builder/src/codegen/cpp/forward_decl.rs
Original file line number Diff line number Diff line change
@@ -1,28 +1,77 @@
use std::collections::{BTreeMap, BTreeSet};

use proc_macro2::TokenStream;
use proc_macro2::{Ident, TokenStream};
use quote::{format_ident, quote};

use super::NEWLINE_TOKEN;

/// A C++ forward declaration.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
#[allow(dead_code)]
#[derive(Debug, Clone)]
pub enum ForwardDecl {
Struct(String),
Class(String),
Class(Ident),
TemplateClass(Ident),

/// Aliases are only identified by their `from` name!
Alias {
from: Ident,
to: TokenStream,
},
}

impl PartialEq for ForwardDecl {
fn eq(&self, other: &Self) -> bool {
match (self, other) {
(Self::Class(l0), Self::Class(r0))
| (Self::TemplateClass(l0), Self::TemplateClass(r0)) => l0 == r0,
(Self::Alias { from: l_from, .. }, Self::Alias { from: r_from, .. }) => {
// Ignore `to` for equality
l_from == r_from
}
_ => false,
}
}
}

impl Eq for ForwardDecl {}

impl PartialOrd for ForwardDecl {
fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
Some(self.cmp(other))
}
}

impl Ord for ForwardDecl {
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
match (self, other) {
(ForwardDecl::TemplateClass(a), ForwardDecl::TemplateClass(b))
| (ForwardDecl::Class(a), ForwardDecl::Class(b))
| (ForwardDecl::Alias { from: a, .. }, ForwardDecl::Alias { from: b, .. }) => {
a.to_string().cmp(&b.to_string())
}
(ForwardDecl::TemplateClass(_), _) => std::cmp::Ordering::Less,
(_, ForwardDecl::TemplateClass(_)) => std::cmp::Ordering::Greater,

(ForwardDecl::Class(_), _) => std::cmp::Ordering::Less,
(_, ForwardDecl::Class(_)) => std::cmp::Ordering::Greater,
}
}
}

impl quote::ToTokens for ForwardDecl {
fn to_tokens(&self, tokens: &mut TokenStream) {
match self {
ForwardDecl::Struct(name) => {
let name_ident = format_ident!("{name}");
quote! { struct #name_ident; }
}
ForwardDecl::Class(name) => {
let name_ident = format_ident!("{name}");
quote! { class #name_ident; }
quote! { class #name; }
}
ForwardDecl::TemplateClass(name) => {
quote! {
template<typename T> class #name;
#NEWLINE_TOKEN
#NEWLINE_TOKEN
}
}
ForwardDecl::Alias { from, to } => {
quote! { using #from = #to; }
}
}
.to_tokens(tokens);
Expand Down
4 changes: 2 additions & 2 deletions crates/re_types_builder/src/codegen/cpp/includes.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ use quote::quote;

use crate::objects::is_testing_fqname;

use super::{NEWLINE_TOKEN, SYS_INCLUDE_PATH_PREFIX_TOKEN, SYS_INCLUDE_PATH_SUFFIX_TOKEN};
use super::{ANGLE_BRACKET_LEFT_TOKEN, ANGLE_BRACKET_RIGHT_TOKEN, NEWLINE_TOKEN};

/// Keeps track of necessary includes for a file.
pub struct Includes {
Expand Down Expand Up @@ -92,7 +92,7 @@ impl quote::ToTokens for Includes {
let hash = quote! { # };
let system = system.iter().map(|name| {
// Need to mark system includes with tokens since they are usually not idents (can contain slashes and dots)
quote! { #hash include #SYS_INCLUDE_PATH_PREFIX_TOKEN #name #SYS_INCLUDE_PATH_SUFFIX_TOKEN #NEWLINE_TOKEN }
quote! { #hash include #ANGLE_BRACKET_LEFT_TOKEN #name #ANGLE_BRACKET_RIGHT_TOKEN #NEWLINE_TOKEN }
});
let local = local.iter().map(|name| {
quote! { #hash include #name #NEWLINE_TOKEN }
Expand Down
Loading

1 comment on commit a55e884

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Rust Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.25.

Benchmark suite Current: a55e884 Previous: de75d3a Ratio
mono_points_arrow/generate_message_bundles 26668090 ns/iter (± 2299022) 20586373 ns/iter (± 877019) 1.30

This comment was automatically generated by workflow using github-action-benchmark.

Please sign in to comment.