This program executes and monitors another program, recording its inputs and outputs using $LD_PRELOAD.
These inputs and outputs can be joined in a provenance graph.
The provenance graph tells us where a particular file came from.
The provenance graph can help us re-execute the program, containerize the program, turn it into a workflow, or tell us which version of the data this program used.
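As a rough illustration of how recorded inputs and outputs join into a provenance graph, the toy sketch below works from a hand-written list of records. The record layout, the file names, and the helpers build_graph and ancestors are assumptions made for this example; they are not PROBE's actual data model or API.

```python
from collections import defaultdict

# Hypothetical records: (process, files it read, files it wrote).
RECORDS = [
    ("clean.py", ["raw.csv"], ["clean.csv"]),
    ("plot.py", ["clean.csv", "style.json"], ["figure.png"]),
]


def build_graph(records):
    """Map each node to the set of nodes it was directly derived from."""
    parents = defaultdict(set)
    for process, inputs, outputs in records:
        for path in inputs:
            parents[process].add(path)  # the process read this file
        for path in outputs:
            parents[path].add(process)  # the file was written by the process
    return parents


def ancestors(parents, node):
    """Return every file and process that `node` transitively depends on."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    graph = build_graph(RECORDS)
    # Answers "where did figure.png come from?" for the toy records above.
    print(sorted(ancestors(graph, "figure.png")))
```

Walking the graph backwards from a file answers "where did this come from?"; walking it forwards from an input would answer "what needs to be re-run if this changes?".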
-
Install Nix with flakes. This can be done on any Linux distribution (including Ubuntu, Red Hat, and Arch Linux, not just NixOS), macOS, or even Windows Subsystem for Linux.
-
If you don't already have Nix on your system, use the Determinate Systems installer.
-
If you already have Nix (but not NixOS), enable flakes by adding the following line to ~/.config/nix/nix.conf or /etc/nix/nix.conf:
experimental-features = nix-command flakes
-
If you already have Nix and are running NixOS, enable flakes by adding
nix.settings.experimental-features = [ "nix-command" "flakes" ];
to your configuration.
-
-
If you want to avoid a time-consuming build, add our public cache.
nix profile install --accept-flake-config nixpkgs#cachix
cachix use charmonium
If you want to build from source (e.g., for security reasons), skip this step.
-
Run nix profile install github:charmoniumQ/PROBE#probe-bundled.
-
Now you should be able to run probe record [-f] [-o probe_log] <cmd...>, e.g., probe record ./script.py --foo bar.txt. See below for more details.
-
To view the provenance, run probe dump [-i probe_log]. See below for more details.
-
Run probe --help for more details.
The simplest invocation of the probe CLI is:
probe record <CMD...>
This will run <CMD...> under the benevolent supervision of libprobe, outputting the probe record to a temporary directory. When the process exits, probe will transcribe the record directory and write a probe log file named probe_log in the current directory.
If you run this again, you'll notice it fails with an error that the output file already exists. Solve this by passing -o <PATH> to write the log to a new file, or by passing -f to overwrite the previous log.
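If you would rather keep earlier logs than overwrite them, one option is to pick an unused path and pass it with -o. The sketch below is only an illustration: the helper name next_free_log and the probe_log.1, probe_log.2, ... naming scheme are assumptions, not something probe does itself.

```python
import pathlib


def next_free_log(directory, stem="probe_log"):
    """Return the first of stem, stem.1, stem.2, ... that does not exist yet."""
    directory = pathlib.Path(directory)
    candidate = directory / stem
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}.{counter}"
        counter += 1
    return candidate


if __name__ == "__main__":
    # The printed path can then be given to `probe record -o <PATH> <CMD...>`.
    print(next_free_log("."))
```

The check and the later write are not atomic, so this is only suitable for interactive use, where nothing else is creating logs in the same directory at the same time.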
probe record does not pass your command through a shell. Any subshell or environment substitutions will still be performed by your own shell before the arguments are passed to probe, but probe won't understand flow-control statements like if and for, shell builtins like cd, or shell aliases/functions.
If you need these, you can either write a shell script and invoke probe record on that, or else run:
probe record bash -c '<SHELL_CODE>'
Any flag after the first positional argument is treated as an argument to the command, not to probe.
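Because everything after the first positional argument belongs to the recorded command, any probe flags have to come before it. The sketch below builds such an argument list in Python, for example to hand to subprocess.run. The helper names probe_record_argv and probe_record_shell_argv are hypothetical, and only the -f and -o flags described in this README are assumed.

```python
import shlex


def probe_record_argv(cmd, output=None, overwrite=False):
    """Build the argv for `probe record`, placing probe's flags before cmd."""
    argv = ["probe", "record"]
    if overwrite:
        argv.append("-f")
    if output is not None:
        argv += ["-o", output]
    # Everything from here on is passed to the recorded command untouched.
    return argv + list(cmd)


def probe_record_shell_argv(shell_code, **kwargs):
    """Wrap shell code in `bash -c` so that if, for, cd, and aliases are understood."""
    return probe_record_argv(["bash", "-c", shell_code], **kwargs)


if __name__ == "__main__":
    argv = probe_record_shell_argv("cd /tmp && ls", overwrite=True)
    # shlex.join shows the equivalent command line with correct quoting.
    print(shlex.join(argv))
```

Passing the shell code as a single list element avoids quoting mistakes: bash receives it as one argument regardless of the spaces or operators it contains.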
This creates a file called probe_log. If you already have that file from a previous recording, pass -f to probe record to overwrite it.
If you get tired of typing probe record ... in front of every command you wish to record, consider recording your entire shell session:
$ probe record bash
bash$ ls -l
bash$ # do other commands
bash$ exit
$ probe dump
<dumps history for entire bash session>
That's a huge work in progress.
Try exporting to different formats.
probe export --help
-
Follow the previous step to install Nix.
-
Acquire the source code:
git clone https://github.com/charmoniumQ/PROBE && cd PROBE
-
Run nix develop. This will leave you in a Nix development shell, with all the tools you need to develop and build PROBE. It is like a virtualenv, in that it is isolated from your system's pre-existing tools. In the development shell, we all have the same version of Python with all the same packages. You can exit it by typing exit.
-
From within the development shell, type just compile. This compiles the Rust, C, and generated-Python components. If you hack on any of them, run just compile again before continuing.
-
The manually-written Python scripts should already be added to $PYTHONPATH. You should be able to edit them in place.
-
Run probe <args...> or python -m probe_py.manual.cli <args...> to invoke the Rust or Python code respectively.
-
Before submitting a PR, run just pre-commit, which will run the pre-commit checks.
- libprobe: Library that implements interposition (C, Make, Python; happens to be manual and code-gen).
  - libprobe/include: Headers that will be used by the Rust wrapper to read PROBE data.
  - libprobe/src: Main C sources of libprobe.
  - libprobe/generator: Python and C-template code-generator.
  - libprobe/generated: (Generated, not committed to Git) output of code-generation.
  - libprobe/Makefile: Makefile that runs all of libprobe; run just compile-cli to invoke.
- cli-wrapper: (Cargo workspace) code that wraps libprobe.
  - cli-wrapper/cli: (Cargo crate) main CLI.
  - cli-wrapper/lib: (Cargo crate) supporting library functions.
  - cli-wrapper/macros: (Cargo crate) supporting macros; they use structs from libprobe/include to create Rust structs and Python dataclasses.
  - cli-wrapper/frontend.nix: Nix code that builds the Cargo workspace; gets included in flake.nix.
- probe_py: Python code that implements analysis of PROBE data (happens to be manual and code-gen); should be added to $PYTHONPATH by nix develop.
  - probe_py/probe_py: Main package to be imported or run.
  - probe_py/pyproject.toml: Definition of main package and dependencies.
  - probe_py/tests: Python unit tests, i.e., from probe_py import foobar; test_foobar(); run just test-py.
  - probe_py/mypy_stubs: "Stub" files that tell Mypy how to check untyped library code. Should be added to $MYPYPATH by nix develop.
- tests: End-to-end opaque-box tests. They will be run with Pytest, but they will not test Python directly; they should always subprocess.run(["probe", ...]). Additionally, some tests have to be manually invoked.
- docs: Documentation and papers.
- benchmark: Programs and infrastructure for benchmarking.
  - benchmark/REPRODUCING.md: Read this first!
- flake.nix: Nix code that defines packages and the devshell.
- setup_devshell.sh: Helps instantiate Nix devshell.
- Justfile: "Shortcuts" for defining and running common commands (e.g., just --list).
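The end-to-end tests described above drive the probe binary through subprocess instead of importing Python code. A minimal sketch of what such a test might look like follows; the helper and test names, the choice of echo as the recorded command, and the temporary-directory layout are assumptions made for illustration, and only the record subcommand and -o flag from this README are used.

```python
import pathlib
import shutil
import subprocess
import tempfile


def record_command(cmd, log_path):
    """Run `probe record -o <log_path> <cmd...>`; return None if probe is absent."""
    if shutil.which("probe") is None:
        return None
    return subprocess.run(
        ["probe", "record", "-o", str(log_path), *cmd],
        capture_output=True,
        text=True,
        check=False,
    )


def test_record_creates_log():
    """Opaque-box check: recording a trivial command should leave a log file."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = pathlib.Path(tmp) / "probe_log"
        result = record_command(["echo", "hello"], log_path)
        if result is None:
            return  # probe is not on PATH; nothing to check in this sketch
        assert result.returncode == 0, result.stderr
        assert log_path.exists()
```

Writing the log into a fresh temporary directory keeps the test from colliding with a probe_log left over from an earlier run.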