rust: define shared `Commit` type (#4351)

Conversation
Summary: This module performs conversions from legacy on-disk data formats. It will correspond to both the `data_compat` and `dataclass_compat` modules from Python TensorBoard. For now, we have just one function, to determine the initial summary metadata of a time series. And, for now, we only implement this for scalars, to keep things simple.
Test Plan: Unit tests included.
wchargin-branch: rust-data-compat
wchargin-source: ef510b422b00a66a23efc1406947844fa8d1a798

Summary: Our errors now have nice string formatting, easier `From` conversions, and `std::error::Error` implementations.
Test Plan: Included some tests for the `Display` implementations. They’re often not necessary—one benefit of deriving traits is that you can be confident in the implementation without manually testing it. But sometimes, if the format string is non-trivial, it can be nice to actually see the full text written out.
wchargin-branch: rust-use-thiserror
wchargin-source: 0f8009c3b96619f2eb4649353487dc996b45a712

Summary: We’d defined a `Step` [newtype] in the `reservoir` module, but that type will be useful more broadly. We thus move it into a new `types` module, and define `WallTime` and `Tag` siblings, which we’ll need shortly.
[newtype]: https://doc.rust-lang.org/rust-by-example/generics/new_types.html
Test Plan: Most behavior is just `#[derive]`d; unit tests included for the rest.
wchargin-branch: rust-types-module
wchargin-source: b0f5475b247ef8f9cb8bf2e0bf94d2b648b8009e

…rigin/wchargin-rust-use-thiserror' and 'origin/wchargin-rust-types-module' into HEAD

wchargin-branch: rust-run-loader
wchargin-source: 35359d6e233139973edac8b2c1440cc48ed4e710

wchargin-branch: rust-data-compat
wchargin-source: 467974fa4e0e55dea0872bb1394a768a38811ac5

wchargin-branch: rust-run-loader
wchargin-source: d190e3d45c304d976a3fc14a73d99572a6ad377f

Summary: This patch refactors `run`’s `StagePayload` type into an `EventValue` type in `data_compat`. This way, we can define enriching conversions from `EventValue`s to committed values in a layer that’s not specific to the run loader. We also implement `initial_metadata` for the `GraphDef` variant, for symmetry and to make the abstraction clearer. The run loader still doesn’t handle graphs.
Test Plan: Unit tests expanded for the new code, and continue to cover the old.
wchargin-branch: rust-promote-eventvalue
wchargin-source: 96af44556af064cf9ef933154ca3a522a8829798

wchargin-branch: rust-run-loader
wchargin-source: 170c77aae53df6716520df779e4838d3fc3b93b9
# Conflicts:
#	tensorboard/data/server/data_compat.rs

wchargin-branch: rust-run-loader
wchargin-source: 170c77aae53df6716520df779e4838d3fc3b93b9

wchargin-branch: rust-promote-eventvalue
wchargin-source: 106b4519aba1a966d8273b6b2509f1798131dc5a

Summary: This patch introduces the *commit*, which is the link between loading threads and serving threads. The commit structure is designed for concurrency: many threads can read from it at once, and when the commit is updated, a loader acquires exclusive access only to its own run. This improves latency for both readers and writers. The commit itself doesn’t have much logic; it’s mostly just the correct composition of substructures, most notably the reservoir basins. See also the RustBoard technical design doc (#4350) for discussion.
Test Plan: There’s one method on `commit::TimeSeries`; it has tests.
wchargin-branch: rust-commit-type
wchargin-source: d69b4e0034b68d1d8dadc6aa0f6dc3cc48efccde

wchargin-branch: rust-run-loader
wchargin-source: 793ff3e287c2571fd1bc06d5c1940ed3b3bdc796

wchargin-branch: rust-promote-eventvalue
wchargin-source: 432923653efb6fe768bdd507acac833672ec3e06

wchargin-branch: rust-commit-type
wchargin-source: 1850f510bac1f515fa4b7f73c1ec9a7a47fea283

wchargin-branch: rust-promote-eventvalue
wchargin-source: dd7917e2bd9084f8ff048657d94572168a221a7e
# Conflicts:
#	tensorboard/data/server/data_compat.rs
#	tensorboard/data/server/run.rs

wchargin-branch: rust-promote-eventvalue
wchargin-source: dd7917e2bd9084f8ff048657d94572168a221a7e

wchargin-branch: rust-commit-type
wchargin-source: 38179a4b9af7538f133ee04f771ac895a8db21a9

wchargin-branch: rust-commit-type
wchargin-source: b62b0669a50b5086a1f6af18effe62fbfd1e5431
let mut m: HashMap<Run, i32> = HashMap::new();
m.insert(Run("train".to_string()), 1);
m.insert(Run("test".to_string()), 2);
// We can call `get` given only a `&str`, not an owned `Run`.
I think you said on the other PR that `Borrow` is what enables this. Do you mind explaining the intuition I should use for why that's the case?
Correct. To start, the docs for `Borrow` have this as an example case, which I found helpful.

Probably the most foolproof way to see what’s happening is to look at the type signature for `HashMap<K, V>::get`:

pub fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V>
where
    K: Borrow<Q>,
    Q: Hash + Eq,

So here we’re calling `get` with an `&str`, so `Q = str`, so it needs to be the case that `impl Borrow<str> for Run` to satisfy the `where` constraint. And internally, this calls `k.eq(x.0.borrow())` where (presumably) `x.0` is the map-owned key, so it boils down to checking whether `"train" == Run("train").borrow()`.

Does this help?
Ah, thanks! I didn't read far enough down the `Borrow` page to see the HashMap example, but now it's quite clear, my bad.
No problem! Pointing you to the right docs is a good use of my time. :-)
/// tensors and blob sequences will come soon.
#[derive(Debug, Default)]
pub struct RunData {
    /// The time of the first event recorded for this run.
FWIW, I've wanted for a while to opportunistically parse this from the filename so we don't have to open a file for each run in order to serve the initial list of runs, which is especially bad for remote filesystems. WDYT? Wouldn't be implemented here of course, but might affect the semantics we describe for this.
That’d be fine with me, yep. I think that fits naturally into the logdir loader structure.
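As a hedged sketch of that idea: event file names conventionally embed a unix timestamp (`events.out.tfevents.<unix_time>.<hostname>…`), so a loader could guess a run's start time without opening the file. The helper name and the bare `f64` wall-time representation below are illustrative assumptions, not the project's API:

```rust
// Hypothetical helper: extract the creation timestamp embedded in an
// event-file name, returning `None` if the name doesn't match the
// `…tfevents.<unix_time>.…` convention.
fn start_time_from_filename(name: &str) -> Option<f64> {
    let rest = name.split("tfevents.").nth(1)?; // "<unix_time>.<hostname>…"
    let stamp = rest.split('.').next()?;
    stamp.parse::<u64>().ok().map(|t| t as f64)
}

fn main() {
    assert_eq!(
        start_time_from_filename("events.out.tfevents.1607000000.myhost"),
        Some(1607000000.0)
    );
    assert_eq!(start_time_from_filename("not-an-event-file"), None);
}
```

The `Option` return leaves room to fall back to reading the first event when the filename gives no hint, which matters for files written by tools that don't follow the naming convention.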
#[derive(Debug)]
pub struct TimeSeries<V> {
    /// Summary metadata for this time series.
    pub metadata: Box<pb::SummaryMetadata>,
Just wondering, what's the rationale for this being `Box`ed? The Book suggests 3 main reasons one might use `Box`, and I'm guessing the reason here is "When you have a large amount of data and you want to transfer ownership but ensure the data won’t be copied when you do so", but then looking at `SummaryMetadata` I just see data that's already on the heap (except one i32): https://cs.opensource.google/tensorflow/tensorboard/+/master:tensorboard/data/server/tensorboard.pb.rs;l=294;drc=34fe0bfd9d41c4e0da175df16cf716436a4ae263
You’re right that it’s not that bad. Here are the ones that I’m concerned about, with output from `rustc -Zprint-type-sizes`:
Type size information
print-type-size type: `proto::tensorboard::summary::Value`: 536 bytes, alignment: 8 bytes
print-type-size field `.node_name`: 24 bytes
print-type-size field `.tag`: 24 bytes
print-type-size field `.metadata`: 104 bytes
print-type-size field `.value`: 384 bytes
print-type-size type: `proto::tensorboard::TensorProto`: 376 bytes, alignment: 8 bytes
print-type-size field `.tensor_shape`: 32 bytes
print-type-size field `.tensor_content`: 24 bytes
print-type-size field `.half_val`: 24 bytes
print-type-size field `.float_val`: 24 bytes
print-type-size field `.double_val`: 24 bytes
print-type-size field `.int_val`: 24 bytes
print-type-size field `.string_val`: 24 bytes
print-type-size field `.scomplex_val`: 24 bytes
print-type-size field `.int64_val`: 24 bytes
print-type-size field `.bool_val`: 24 bytes
print-type-size field `.dcomplex_val`: 24 bytes
print-type-size field `.resource_handle_val`: 24 bytes
print-type-size field `.variant_val`: 24 bytes
print-type-size field `.uint32_val`: 24 bytes
print-type-size field `.uint64_val`: 24 bytes
print-type-size field `.dtype`: 4 bytes
print-type-size field `.version_number`: 4 bytes
print-type-size type: `proto::tensorboard::SummaryMetadata`: 104 bytes, alignment: 8 bytes
print-type-size field `.plugin_data`: 48 bytes
print-type-size field `.display_name`: 24 bytes
print-type-size field `.summary_description`: 24 bytes
print-type-size field `.data_class`: 4 bytes
print-type-size end padding: 4 bytes
So yes, “large amount of data” can definitely go higher than 104 bytes, but that’s the only reason here. It should also make hash map resizes of the `TagStore` a bit faster, though that shouldn’t matter too much.

Boxing all three of these together improved end-to-end performance by ~25%, and I would believe that most of that is due to `TensorProto` and `Value` since those directly affect the reservoir size, but I also don’t mind doing this. If we want to unbox this, that’s okay with me.
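To make the trade-off concrete, here is a small, hypothetical demonstration (with a hand-written stand-in struct, not the real prost-generated `SummaryMetadata`) of how boxing shrinks the field to pointer size:

```rust
use std::mem::size_of;

// Stand-ins whose field layout mirrors the `-Zprint-type-sizes` output
// above: a 48-byte plugin_data, two 24-byte strings, and an i32.
struct PluginData {
    _plugin_name: String,
    _content: Vec<u8>,
}
struct SummaryMetadata {
    _plugin_data: Option<PluginData>,
    _display_name: String,
    _summary_description: String,
    _data_class: i32,
}

fn main() {
    // On a typical 64-bit target the unboxed stand-in is ~104 bytes,
    // while the box is a single pointer (8 bytes). Moving the struct
    // (e.g. during a hash-map resize) then copies 8 bytes, not ~104.
    println!("unboxed: {}", size_of::<SummaryMetadata>());
    println!("boxed:   {}", size_of::<Box<SummaryMetadata>>());
}
```

The indirection costs one extra pointer chase on reads, which is why it pays off mainly for values that are moved or stored in bulk, as in the reservoirs here.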
Okay, it was primarily a curiosity question - fine by me to leave it boxed. I didn't realize it'd be 24 bytes per `Vec` (aka per `String`), so it does add up to more than I'd have thought; at least one reason is that in my head I'm still thinking about immutable string types, where there isn't an extra 8 bytes for capacity.
I doubt that it makes a big difference in our case, but in general it would seem like some applications with a lot of immutable strings would save space overhead by having an immutable owned string representation with pointer + length only? I guess that must just not come up very often since I don't see much evidence that this exists.
Right: you basically have:

- growable vector: `Vec<u8>`, `String`; 3 words (ptr, len, cap)
- slice reference: `&[u8]`, `&str`; 2 words (ptr, len)
- owned slice: `Box<[u8]>`, `Box<str>`; 2 words (ptr, len)

where in all cases the string layout is the same as the vector layout, just with a “valid UTF-8” invariant.
So if you want “immutable owned string with no capacity penalty”, you can take a `Box<str>`. And you can actually mutate it, too; you just can’t resize it without reallocating.

Some protobuf-like APIs use this form: iirc, capnproto arrays aren’t resizable after you construct them. But that’s a lot less ergonomic when it comes to code that constructs the things, so I’m okay with prost choosing to use `Vec`s and `String`s. The `TensorProto` case is definitely unfortunate, though. If protobuf had repeated oneofs, we wouldn’t have this problem, since an `enum` of a bunch of `Vec<_>`s is still just 32 bytes (24 + 8 for discriminant).

Rust also does have fixed-size arrays, so a `Box<[u8; 12]>` is even smaller since the size is known at compile time. I was considering using that for the TFRecord reader header rather than a `Vec<u8>`, but it’d be a bit annoying, so would definitely need to be benchmarked.
And, of course, not all languages with immutable strings fit them into 8 or 16 bytes:
>>> import sys
>>> sys.getsizeof("")
49
Interesting, I think `Box<str>` is what I was looking for, and now that you mention it I see there's even https://doc.rust-lang.org/std/string/struct.String.html#method.into_boxed_str.

It hadn't occurred to me since `str` on its own seems a little exotic, but this explanation helps. What didn't help was the fact that https://doc.rust-lang.org/book/ch04-03-slices.html starts by saying "Another data type that does not have ownership is the slice", when I think what they're really doing is referring to `&[_]` as a "slice" informally and eliding the distinction between that and `[_]`. So whether or not a "slice" in the strict meaning of `[_]` has ownership depends on the type of pointer being used to hold it (namely if it's `&`, `&mut`, or `Box`).
Thanks! I think these are all non-blocking, but happy to change the `Box<T>` in a follow-up if you’d like.