cd docker_build
docker build -t aat .
# clone this repo
git clone --recursive https://github.com/Toizi/attacking_anti_tamper.git
# to build applications with self-checking,
# clone the self-checking repo and check out the right branch
cd attacking_anti_tamper
git clone https://github.com/Toizi/self-checksumming.git
cd self-checksumming && git checkout guard_obfuscation
Prerequisites:
- cmake
- ninja
- gcc/clang
- g++/clang++
On debian: sudo apt update -y && sudo apt install -y clang clang++ ninja-build cmake
- Download current DynamoRIO release from their release page, link to tested version
- Extract to parent directory
- Adjust
DynamoRIO_DIR
path intracer/build.sh
to fit downloaded package version here when using a different version than the one tested - Build desired configuration with
tracer/build.sh {Release|Debug|RelWithDebInfo}
- Resulting files will be put into
build_tracer_{CONFIG}/linux
- Install Triton's dependencies (required version might be lower)
- Build desired configuration with
taint_cpp/build.sh {Release|Debug|RelWithDebInfo}
- Resulting files will be put into
build_taint_cpp_{CONFIG}/linux
Use the run.sh
(deletes and create directories in the $(pwd)
and run_manual.sh
scripts to run an executable while collecting trace data.
Example:
./run.sh ls -la
Will print some info about the trace, print tracer_run_success
and exit.
The directory instrace_logs/
contains the resulting artifacts used by the taint module.
instrace_logs
├── instrace.log <== the instruction trace
├── modules <== modules at the entry point (binary)
│ ├── 0x00007fefa6403000-0x00007fefa6422000-main_ls <== main module
│ ├── 0x00007fefa6621000-0x00007fefa6623000-other_ls
│ ├── 0x00007fefa6623000-0x00007fefa6624000-other_ls
│ ├── 0x00007fefa6624000-0x00007fefa6625000-other_ls
│ ├── 0x00007fefa6626000-0x00007fefa6a26000-other_
│ .................................................
│ └── 0xffffffffff600000-0xffffffffff601000-other_
├── modules.txt <== list of modules + address (txt)
├── saved_contexts.bin <== saved context dumps
└── saved_memories.bin <== saved memory dumps
Get help src/taint_main -h
.
Run attack using previously generated instrace_logs
directory
src/taint_main --json-output patches.json instrace_logs
This creates patches.json
with a simple format.
{
"patches": [
{ "address": 140667263185782, "asm_string": "jmp 0x7fefa640f3d0", "data_hex": "eb58" },
{ "address": 140667263200819, "asm_string": "mov rdx, r15", "data_hex": "4c89fa90" },
...
]
}
Put data_hex
at address
to place the asm_string
instruction there.
./run.py
is the main entry point for manual execution of the attack.
If you built the taint_cpp
and tracer
modules in Release
mode it should
work out of the box, otherwise you might have to adjust some paths in
the setup_environment
function.
The only arguments that are needed for execution are
- the binary to be executed (positional arg)
--args
arguments that will be passed to the program-i/--input
data that will be written to stdin of the process
There are a lot more arguments but most are fairly specific to the evaluation process used in the thesis.
./eval.py
is a script to run the attack on a whole batch of programs at once.
Since it is unpractical to specify the characteristics of each program, i.e.
which files and arguments it needs or what return code signals success, this
is specified in a json file using the --cmdline-input
switch.
To evaluate the mibench samples, the following command was used. See self-checksumming README on how to compile them.
# build dir is good to have in case the process is interrupted, otherwise /tmp is used
mkdir eval_build_dir
./eval.py self-checksumming/samples/protection_dataset_bin/seed_2 --crack-function mibench_dummy --cmdline-input self-checksumming/samples/cmdline-args/cmdline.json -j 3 --build-dir ./eval_build_dir --use-existing-results -o full_report.json`
Some interesting observations:
- the files in the input directory are expected to have the format
file_name+{obfuscation}.{coverage}
. If multiple obfuscations are used, they can be combined with a dash-
such asfile_name+indir.10-virt.20
. Additionally it is expected that afile_name+none.0
file exists for eachfile_name
--crack-function
is used to specify the function that will be tampered to trigger tamper responses in protected binaries and after the patches from the attack have been applied to check that they are disabled. In this case, the dummy function that is inserted by the self-checking is used.-j 3
allows to specify how many processes to spawn in parallel. While the memory requirements are fairly low, the disk space requirements can get pretty high, depending on the samples that are evaluated. Therefore a low count is used despite having a lot more cores available--use-existing-results
allows to rerun some experiments or add new files to the input directory and just restart with the same command line as used before. Note that the script is not able to determine whether a run was interrupted. Thus you might have to clean up some directories in yourbuild-dir
manually in case you interrupted an execution where a report file was generated.
To generate graphs from a report file from ./eval.py
, use ./eval_report.py
and supply the report path as argument. If no output directory was specified
with -o/--output
, an eval
directory will be created in the directory of the
input report.
The generated reports will have two groupings. One is grouped by obfuscation
configuration and the other by sample.