Track evaluation dependencies and cache results #1517

jonatanklosko · 2022-11-08T23:46:49Z

This adds a mechanism for tracking how cells depend on each other in terms of variables, imports, aliases, modules, process dictionary, etc. Based on this information cells are marked as "stale" and reevaluated only if necessary.

Note: this requires Elixir v1.14.2 to work as expected for variables.

Motivation

Currently, whenever a cell is evaluated, all subsequent cells are marked as stale and require reevaluation. This happens regardless of whether those cells depend on the evaluated cell. This simple approach ensures reproducibility by always evaluating cells sequentially.

The main issue with this "greedy" approach is that a cell may do a long computation, and changing anything above it requires running the long computation again.

Idea

We now track which identifiers each cell references and defines (or redefines). When a cell is reevaluated, we know which cells it affects and mark only those as "stale".

At the evaluation level, instead of storing the full evaluation context (all variables/aliases after an evaluation), we store diffs (new variables/aliases defined during an evaluation). Then, when evaluating a cell, we combine all the diffs from previous cells into the full evaluation context. For example:

# Cell 1
x = 1

# Cell 2
y = 1

# Cell 3
x + y

The diffs for cells 1 and 2 are [x: 1] and [y: 1] respectively. Now, when we change the first cell to x = 2, the diff becomes [x: 2]. Then to evaluate cell 3 we merge [x: 2] with [y: 1] and have [x: 2, y: 1] as the context, without reevaluating cell 2.
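The merge step described above can be sketched roughly as follows. This is only an illustration: `DiffSketch` and `merge_context/1` are hypothetical names, and representing diffs as plain keyword lists of bindings is an assumption, since the actual evaluator also tracks aliases, imports and so on.

```elixir
defmodule DiffSketch do
  # Hypothetical helper: folds per-cell binding diffs, in cell order,
  # into one evaluation context. Later diffs override earlier ones.
  def merge_context(diffs) do
    Enum.reduce(diffs, [], fn diff, context -> Keyword.merge(context, diff) end)
  end
end

cell1_diff = [x: 1]
cell2_diff = [y: 1]

# Evaluate cell 3 against the merged diffs of cells 1 and 2.
context = DiffSketch.merge_context([cell1_diff, cell2_diff])
{result, _bindings} = Code.eval_string("x + y", context)

# After cell 1 changes to `x = 2`, only its diff is replaced.
# The diff of cell 2 is reused as is, without reevaluating cell 2.
new_context = DiffSketch.merge_context([[x: 2], cell2_diff])
{new_result, _bindings} = Code.eval_string("x + y", new_context)
```

Here `result` is 2 and `new_result` is 3, matching the walkthrough above.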

Implementation details

Session data

On the Livebook side, each cell has additional information:

%{
  ...,
  identifiers_used: list(identifier :: term()) | :unknown,
  identifiers_defined: %{(identifier :: term()) => version :: term()},
}

An identifier can be anything: a variable name, a module name, or a fixed term such as :pdict. Each defined identifier has a version, which again can be anything: a hash digest, a random id, or a fixed value.

This information is used when computing which cells are stale. To determine cell validity we already compute snapshots, but now a cell snapshot looks only at the parent cells that define identifiers used by that cell, and the identifier versions.
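A minimal sketch of how such a snapshot could be derived from these two fields. `SnapshotSketch` and `snapshot/2` are hypothetical names, the cell maps contain only the fields shown above, and returning the relevant identifier versions directly (rather than hashing them) is an assumption made to keep the example easy to follow.

```elixir
defmodule SnapshotSketch do
  # `parents` are the cells above `cell`, in notebook order.
  def snapshot(cell, parents) do
    # When several parents define the same identifier, the latest one wins.
    defined =
      Enum.reduce(parents, %{}, fn parent, acc ->
        Map.merge(acc, parent.identifiers_defined)
      end)

    case cell.identifiers_used do
      # Unknown usage: conservatively depend on everything defined above.
      :unknown -> defined
      used -> Map.take(defined, used)
    end
  end
end

cell1 = %{identifiers_used: [], identifiers_defined: %{{:variable, :x} => 1}}
cell2 = %{identifiers_used: [], identifiers_defined: %{{:variable, :y} => 1}}
uses_y = %{identifiers_used: [{:variable, :y}], identifiers_defined: %{}}
uses_x = %{identifiers_used: [{:variable, :x}], identifiers_defined: %{}}

y_before = SnapshotSketch.snapshot(uses_y, [cell1, cell2])
x_before = SnapshotSketch.snapshot(uses_x, [cell1, cell2])

# Reevaluating cell 1 gives x a new version.
cell1 = %{cell1 | identifiers_defined: %{{:variable, :x} => 2}}

y_stale? = SnapshotSketch.snapshot(uses_y, [cell1, cell2]) != y_before
x_stale? = SnapshotSketch.snapshot(uses_x, [cell1, cell2]) != x_before
```

In this sketch only the cell that uses x sees a different snapshot, so only that cell would be marked as stale.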

Evaluator

On the Runtime side (specifically in the evaluator), after an evaluation we determine the identifiers it depends on, mostly by using a compilation tracer. The identifiers are reported and tracked with varying granularity. For example, we have {:variable, name} and {:module, name} to track individual variables and modules, but we also have a single identifier :pdict to track the process dictionary as a whole.
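To illustrate the tracer idea, here is a rough sketch of a module following the compilation tracer convention (a `trace/2` function that receives an event and the environment and returns `:ok`). It is not the tracer from this PR: `TracerSketch`, the `:used_identifiers_sketch` key, and the choice to record into the process dictionary are all assumptions, and only two event types are handled.

```elixir
defmodule TracerSketch do
  @key :used_identifiers_sketch

  # A module referenced by name, for example `Enum` in `Enum.map(...)`.
  def trace({:alias_reference, _meta, module}, _env), do: record({:module, module})

  # A remote call such as `String.upcase("a")`.
  def trace({:remote_function, _meta, module, _name, _arity}, _env),
    do: record({:module, module})

  # All other events are ignored in this sketch.
  def trace(_event, _env), do: :ok

  # Identifiers collected so far in the current process.
  def used, do: Process.get(@key, MapSet.new())

  defp record(identifier) do
    Process.put(@key, MapSet.put(used(), identifier))
    :ok
  end
end

# Events are built by hand here to keep the example self-contained.
# During real compilation they would be emitted by the compiler.
:ok = TracerSketch.trace({:alias_reference, [], Enum}, __ENV__)
:ok = TracerSketch.trace({:remote_function, [], String, :upcase, 1}, __ENV__)
:ok = TracerSketch.trace({:some_other_event, []}, __ENV__)

used = TracerSketch.used()
```

After these calls `used` contains `{:module, Enum}` and `{:module, String}`, which is the kind of information that could end up in `identifiers_used`.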

Depending on the identifier type, we approach the "version" differently:

for variables it's a random id (reevaluating a cell like x = 1 changes the snapshots anyway)
for modules we compute MD5
for pdict and imports we compute phash2
for aliases we use the alias expanded value
for requires we use a fixed :ok version
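The list above could be sketched as follows. The module and function names are hypothetical, and what exactly gets hashed for modules, the process dictionary and imports is an assumption. Only the choice of version per identifier type is taken from the description.

```elixir
defmodule VersionSketch do
  # Variables: a fresh id on every evaluation.
  def variable_version, do: :erlang.unique_integer()

  # Modules: MD5 digest. Hashing a binary passed by the caller is an
  # assumption; the description does not say what input is hashed.
  def module_version(binary) when is_binary(binary), do: :erlang.md5(binary)

  # Process dictionary and imports: phash2 of the term.
  def pdict_version(pdict), do: :erlang.phash2(pdict)
  def imports_version(imports), do: :erlang.phash2(imports)

  # Aliases: the expanded module itself.
  def alias_version(expanded) when is_atom(expanded), do: expanded

  # Requires: a fixed version.
  def require_version, do: :ok
end

# Equal process dictionary contents produce the same version.
same_pdict? =
  VersionSketch.pdict_version(answer: 42) == VersionSketch.pdict_version(answer: 42)

# Two evaluations defining a variable always get different versions.
variable_changed? = VersionSketch.variable_version() != VersionSketch.variable_version()

digest = VersionSketch.module_version("example bytecode")
```

The random variable version means that reevaluating a cell always invalidates cells using its variables, while content-based versions (MD5, phash2) stay equal when the underlying value did not change.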

lib/livebook/runtime/evaluator.ex

lib/livebook/runtime/evaluator/tracer.ex

Track evaluation dependencies and cache results

6f522a2

jonatanklosko force-pushed the jk-evaluation-caching branch from c33d246 to 97540fb on November 8, 2022 23:53

Fix compilation on Elixir < v1.14.2

dd6cc71

jonatanklosko force-pushed the jk-evaluation-caching branch from 97540fb to dd6cc71 on November 9, 2022 00:00

josevalim reviewed Nov 9, 2022

lib/livebook/runtime/evaluator.ex (outdated, resolved)

josevalim reviewed Nov 9, 2022

lib/livebook/runtime/evaluator/tracer.ex (outdated, resolved)

Updates

2ec7805

josevalim approved these changes Nov 9, 2022

jonatanklosko added 2 commits November 9, 2022 14:32

Bump CI

e344e2d

Fix deprecation

a42612b

jonatanklosko merged commit 484e471 into main Nov 9, 2022

jonatanklosko deleted the jk-evaluation-caching branch November 9, 2022 13:42

jonatanklosko mentioned this pull request Nov 15, 2022

Fix reading inputs rendered in the same evaluation #1531

Merged
