Fast paths for DataTimeline modifications #56

elidupree · 2019-03-11T01:28:23Z

Fast paths for DataTimeline modifications

Currently, the main DataTimeline type looks something like this:

struct SimpleTimeline<T> {
  changes: VecDeque<(EventHandle, T)>,
  other_dependent_events: BTreeSet<EventHandle>,
}

This has quite a bit of overhead for something that has an instance for every object in the whole simulation. In particular, it will do a lot of allocations when all we're doing is ordinary accesses and modifications to the T. (There's also memory usage overhead, but it's harder for me to be certain exactly how much memory usage is a problem.)

Naturally, we can't get quite as good as the non-TimeSteward version:

struct SimpleObjectThatsNotATimeline<T> {
  current_value: T, // current and ONLY value
}

Because we're TimeSteward, it needs to be more like this:

struct SimpleTimeline<T> {
  current_value: T,
  some_data_about_the_history: ?????,
}

But there are a few assumptions we can make.

First, the core TimeSteward assumption: the number of changes that will ever be reversed is significantly less than O(n). And similarly, the number of non-present-time updates is significantly less than O(n).

So we want a fast path for accesses and modifies at the present time. They should act upon current_value, and also do the absolute minimum work necessary to make it possible to later go back and recover the history.

Side note: checking whether we are in the present

Might it be possible to know whether we are in the present BEFORE even accessing a DataTimeline? The current prototype could certainly know this. If you're in a single-threaded TimeSteward, it's easy to have the Accessor have a boolean flag saying that we are in the present. In a multithreaded TimeSteward… Well, from the DataTimeline perspective, all that matters is whether we locally in the present. And for the "locked regions" model of multithreading, it's possible that almost everything can be happening in the local present (running events that are certain they will never access/modify a timeline that was already modified at a later historical time).

If we can't, we would still want as much of a fast-path as we can get:

struct SimpleTimeline<T> {
  current_value: T,
  last_change: EventHandle,
  more_data_about_the_history: ?????,
}

fn do_something<T>(when: EventHandle, timeline: &mut SimpleTimeline<T>, something: Something) {
  if when.extended_time() > timeline.last_change.extended_time() {
    // fast path
    do_something_in_the_present(when, timeline, something);
  }
  else {
    do_something_in_the_past(when, timeline, something);
  }
}

We would like to say that this is a single comparison. Unfortunately, ExtendedTime comparisons are actually up to 4 numerical comparisons. If the highest term (the base time) is different, it's only one comparison, but it isn't uncommon for a timeline to get multiple changes in the same event, which is technically at the same time as itself, meaning that you would have to do a comparison on all four numbers.

However, it's likely that we can simply pass a bool from the accessor (so only one conditional) – or, perhaps better yet, have the accessor type know at compile time that we are in the present (zero conditionals!).

Storing the history

The basic requirements are that storing n items of history should be O(n), and recovering them should be, let's say, O(n) if it's convenient, but at least not TOO much higher than that. Within those conditions, the highest priority is to make storing the history extremely fast (including the cost of discarding the history later). Other priorities include minimizing memory usage of the history, making the recovery reasonably fast, and minimizing use of unsafe code.

Something like this:

fn do_something_in_the_present <T>(when: EventHandle, timeline: &mut SimpleTimeline<T>, something: Something) {
  let history_that_needs_to_be_stored = something.apply (when, &mut timeline.current_value);
  write history_that_needs_to_be_stored somewhere in memory, without doing anything expensive like allocations, hashing, etc.
}

Here's my idea: we write the history into a global page of history. Perhaps each history type H has pages something like a ArrayVec<[H; 1024]>, and in particular, there's a current global page that can be written into (global to the TimeSteward object, that is, not the whole program). When you need to write some history, you write the history into the current global page, along with a pointer to your previous piece of history. So this behaves like a singly linked list of immutable history items. If you later need to recover the history, you follow the pointers in sequence. SimpleTimeline itself only needs to maintain a pointer to the head.

When a page fills up, we place it into a global list of ALL pages, in order by the maximum historical time of any data within it. Eventually the client code permits us to forget all information prior to a particular historical time, so we are able to discard any pages older than that, without having to visit each individual DataTimeline. For history types that implement Drop, we might even be able to pass the pages to a different thread before discarding them.

The basic concept is something like this:

struct SimpleTimeline<T> {
  current_value: T,
  history_head: Option<*const HistoryNode>,
}

struct HistoryNode {
  data: H,
  previous: Option<*const HistoryNode>,
}

type Page = ArrayVec<HistoryNode>;

struct TimeSteward {
  ...
  current_page: Box<Page>,
  old_pages: BTreeMap<Time, Box<Page>>,
}

fn do_something_in_the_present <T>(time_steward: & TimeSteward, when: EventHandle, timeline: &mut SimpleTimeline<T>, something: Something) {
  let history_that_needs_to_be_stored = something.apply (when, &mut timeline.current_value);
  
  if time_steward.current_page.is_full() {
    let max = time_steward.current_page.iter().map (|node| node.time()).max();
    let retiring_page = mem::replace (&mut time_steward.current_page, Box::new (ArrayVec::new()));
    time_steward.old_pages.insert (max, retiring_page);
  }
  time_steward.current_page.push(HistoryNode {
    data: history_that_needs_to_be_stored,
    previous: timeline.history_head,
  })
  timeline.history_head = Some (time_steward.current_page.last().unwrap() as *const HistoryNode);
}


fn recover_history <T>(earliest_time_to_recover: ExtendedTime, timeline: &mut SimpleTimeline<T>) -> Vec<H> {
  let mut result = Vec::new();
  let previous = timeline.history_head;
  while let Some(node) = previous {
    unsafe {
      // very bad – see discussion below
      if node.extended_time() < earliest_time_to_recover { break; }
      result.push (node.data);
      previous = node.previous;
    }
  }
}

Naturally, the above code is incomplete and has some problems.

Gotcha 1: Accessing freed memory

We eventually discard the pages whose historical times are before a certain time. Thus, when we recover the history, we need to avoid looking further back than earliest_time_to_recover, because those pages may have been freed. The above code looks one node further back than that.

In particular, we are not permitted to access a node to check what time it's at, because we have to check the time before we are permitted to access it at all.

One idea is that, when we say Option<*const HistoryNode>, we could actually use something more like Option<(ExtendedTime, *const HistoryNode)>. That way, each pointer is annotated with the information needed to tell whether it's safe to access it. Each history known effectively has its time stored alongside the SUBSEQUENT node, instead of itself. Fortunately, this works smoothly with the rest of the algorithms. Unfortunately, it still uses unsafe code.

However, we can also get rid of the unsafe code: TimeSteward can have a HashMap of old pages, indexed by their address in memory, rounded off to about their size (if they're not already aligned to their size). That way, you can use a *const HistoryNode to look up the page it's in and infer the index within that page, rather than dereferencing it directly.

This approach is acceptable performance-wise because the HashMap only needs to be updated O(1) times per entire page, and the lookups only happen during history recovery, which is allowed to be less efficient.

Important note: this doesn't single-handedly take care of the problem, because a page could be deallocated and then another one reallocated in its place, so you could be looking at the WRONG valid data unless there's an additional check against that. That case can occur if history node A looks back to history node B such that B.time is before the forget-before time, B's page was deallocated after A was created, and another page was allocated after that deallocation. We would like to have a way to make the new page be recognizable as "not a possible predecessor of A". So, we probably want to do the HashMap thing AND the "store times next to the subsequent node" thing, so if your pointer is annotated with a time before the forget-before time, you just don't try to look it up at all.

If we could assume that pages are created in a total order, we could also annotate each entire page with a sequential index, and know that entries on one page can only validly point to entries on the same page or a page that was created before it. However, we might not want to guarantee this, either because of multithreading or something else.

Gotcha 2: Multiple history types

The above algorithm only accounts for one type of history data. Current SimpleTimeline has two types of history – "data was changed by this event" and "data was accessed by this event". In the future, there might be even more – perhaps different entries for for replacing different fields, to minimize the amount of memory stored for each change.

Simplest plan: Make H an enum. This has memory overhead, but otherwise is fine.

More efficient(?) plan: Have separate pages for each type of history, allowing them to be packed densely, and have one linked list for each type (meaning each SimpleTimeline struct would contain multiple head pointers). If we assume that most objects will witness at least one of every type of history, this isn't much overhead. But if there are many transient objects and many rare history types (such as changing individual fields of a large struct? hmm), this could be a waste of memory.

Spookiest plan: Have separate pages for each type of history, allowing them to be packed densely, and only have a single linked list that straddles the various pages. This means that the previous-pointers might be pointing to different types! "Conveniently", the type can be looked up in the global HashMap of pages! (This dynamic typing has a bit of overhead, but the lookups only happen during history recovery, which is allowed to be less efficient. When you're STORING the history, you know all the necessary types at compile time.)

All of these plans could also be tricky to do fully generically within the rust type system, but I'm not going to worry about that for now (the scope of this post is mostly about what would be logically possible, not how to force the details past the compiler).

Gotcha 3: How to store times

An ExtendedTime may be around 32 bytes. It would be nice not to store that much in every history node.

Fortunately, there's a straightforward approach to compressing them: Instead of using the time directly, use an EventHandle, and look up the time in the event! That way, the ExtendedTime is only stored once per event. (Each event probably messes with multiple timelines, so there are many times fewer events than history nodes.)

Unfortunately, EventHandles might be Arc, and therefore have a runtime cost to reference count. It would be cool if dropping pages of Plain Old Data types was just deallocation.

Fortunately, we could put the events in pages, just like the history nodes!

Unfortunately, that re-raises the problem of accessing deallocated-rellocated event pages, since our previous plan was to guard against that by checking the times, but now, the times are hidden behind the very pointer we want to check for validity.

Fortunately, there are a couple of ways to annotate the pointers to protect against this, using only 4-8 additional bytes instead of a full ExtendedTime. It's hard for me to weigh this memory cost against the overhead of Arc.

Recovering the history

If a DataTimeline's history gets recovered at all, we may need to do a lot of queries into that history as the simulation catches up back to the present. So we don't want every query to be O(history length).

Luckily, this is the rare case that is permitted to be moderately inefficient! TimeSteward could have something like this:

struct TimeSteward {
  ...
  recovered_histories: HashMap<*const DataTimeline, BTreeMap<ExtendedTime, DetailedHistoryInfo>>
}

When you are executing events in the past, every time you look up a DataTimeline, you first check whether it exists within recovered_histories. If the recovered_histories entry goes back as far as you are looking, use that data instead of the present-time data. If it doesn't, pop from the DataTimeline's singly linked list, inserting the entries into recovered_histories, until you can do the above.

We would also need to remember when to discard this data – I suppose we could have another BTreeMap of recovered histories sorted by the most recent data in them, so that we can discard them when forget_before() is called. Again, it is okay for this to be inefficient because it will have few entries and forget_before() is called infrequently.

Note that using *const DataTimeline here assumes that DataTimelines can't be moved, which is currently de-facto true but isn't currently guaranteed to remain true in the future. Although it might end up guaranteed, since there are a lot of reasons it might be desirable for TimeSteward to be able to store references to individual DataTimelines. Another alternative is for each DataTimeline to have a unique identifier, which is fine except for the memory cost of storing the identifiers with every object.

The text was updated successfully, but these errors were encountered:

elidupree · 2019-03-13T17:51:52Z

Refinement: we don't need to use raw pointers or extra data for history links. If each page gets a serial number and is (for instance) 1024 history items long, then we can link to it with a usize, using the bottom 10 bits for the position within the page, and the rest of the bits for the serial number. First look up the page number in the TimeSteward, then look up the individual item in the page.

elidupree · 2019-03-13T18:24:36Z

We can't really do the same thing for events. Discarding history worked because history only goes up to the present time. But we create events in the future. So pages might have far-future events on them, preventing them from getting deleted.

If that can be worked around somehow, we can do the same thing for events, making EventHandle a glorified usize.

Pros:

no need for reference counting of events (reduced time cost and memory cost)
obsolete events can be freed while there are still links to them (reduced memory cost)
much simpler allocation of events (reduced time cost, and probably an improvement in memory usage as well because of fragmentation)

Cons:

dereferencing an event handle is slightly more expensive in time
dereferencing an event handle can fail inconsistently (this would be a pro if it was guaranteed to fail for events before the forget-before time, but that won't be guaranteed, at least not without a time cost)

Seems clearly advantageous if dereferencing events isn't very common. When do we dereference events?

In order to sort future events and know what's coming next (log n per event, but we might use something like a radix trie for that, which would remove this case)
When an event actually comes up to be executed (assumption: no more than approximately once per event)
When going back in history (assumption: much less than once per event)
When client could does it explicitly (none of my example simulations have done this yet)

elidupree added Optimization API-affecting labels Mar 11, 2019

elidupree changed the title ~~Fast paths for DataTimeline notifications~~ Fast paths for DataTimeline modifications Mar 11, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fast paths for DataTimeline modifications #56

Fast paths for DataTimeline modifications #56

elidupree commented Mar 11, 2019 •

edited

Loading

elidupree commented Mar 13, 2019

elidupree commented Mar 13, 2019

Fast paths for DataTimeline modifications #56

Fast paths for DataTimeline modifications #56

Comments

elidupree commented Mar 11, 2019 • edited Loading

Fast paths for DataTimeline modifications

Side note: checking whether we are in the present

Storing the history

Gotcha 1: Accessing freed memory

Gotcha 2: Multiple history types

Gotcha 3: How to store times

Recovering the history

elidupree commented Mar 13, 2019

elidupree commented Mar 13, 2019

elidupree commented Mar 11, 2019 •

edited

Loading