Author: Pierre Guillou Date: December 2022
This document contains a list of tips and pieces of advice that can help TTK developers to increase their effectiveness.
Here is a list of software applications of interest:
- CMake is the build tool used by TTK. By default, it generates Makefile recipes.
- Ninja is a faster and more modern alternative to Makefiles. It is parallel by default.
- clang-format is the tool used to format TTK's C++ code.
- clang-check allows to quickly check for compiler warnings or errors with Clang messages (often more helpful than GCC ones).
- clang-tidy performs extended checks on C++ code.
- clangd is a LSP server that integrates
clang-format
andclang-check
. - ccache accelerates the build process by storing object files into a separate cache.
- perf is a Linux tool that collects performance statistics at run-time.
- hotspot helps at visualizing
perf
traces. - heaptrack collects and displays the run-time memory footprint.
These environment variables are used by CMake to customize the build process:
CMAKE_GENERATOR=Ninja
replaces GNU Make by Ninja as build tool.CMAKE_COLOR_DIAGNOSTICS=ON
enables coloration on compiler warnings with Ninja (coloration is enabled by default with GNU Make).CMAKE_EXPORT_COMPILE_COMMANDS=ON
generates a compilation database in the build directory. This compilation database can be fed to useful tools such asclang-check
,clang-tidy
orclangd
.CMAKE_CC_COMPILER_LAUNCHER=ccache
andCMAKE_CXX_COMPILER_LAUNCHER=ccache
make CMake useccache
to store generated binary objects in a cache to make faster builds after clean.
These TTK-specific CMake keys can be tweaked to customize the TTK build.
CMAKE_CXX_FLAGS
set-ggdb -fno-omit-frame-pointer
adds debug data into TTK libraries. A debugger (such as GDB) needs it to provide accurate information.TTK_TIME_TARGETS
prints the time taken by the compilation of every translation unit (.cpp
file). Quite useful for learning where the build time is spent.TTK_REDUCE_TEMPLATE_INSTANTIATIONS
uses a reduced list of template instantiations for the VTK layer. May not work with allttk-data
state files.TTK_SCRIPTS_PATH
is only used by theDimensionReduction
module by now. It points to the folder that should contain thedimensionReduction.py
Python script file (at run-time). By default, it points to the install prefix but it can be modified to point to the relevant source directory (to skip the installation process).
Each time a PR is opened on GitHub, the CI checks that the provided
code is correctly formatted using clang-format
. The formatting
step is often forgotten in the PR, so here is a one-liner that uses
Git to format everything that have changed since the last commit in
the reference repository (here the origin
remote).
git-clang-format origin/dev
clang-check is a CLang tool that quickly checks the syntax of
C/C++ code without generating assembly to provide warnings that often
are more understandable that GCC ones. To use it, generate a
compilation database (CMAKE_EXPORT_COMPILE_COMMANDS=ON
) in the build
directory, then call clang-check
on the failing C++ source file (.cpp
):
clang-check -p build core/vtk/ttkPersistenceDiagram.cpp
Here, the -p build
passes the path to the build directory to
clang-check
.
clang-tidy is quite similar but offers more extensive checks of
C/C++ code but takes more time to complete. It is use the same way as
clang-check
. There's a list of enabled and disabled warnings in the
.clang-tidy file.
CI workflows can be launched in GitHub forks (here the public
remote) of the TTK repository by pushing tags.
For launching the check_code
workflow, use
git tag -f check && git push -f public check
For launching the test_build
workflow, use
git tag -f test_build && git push -f public test_build
For launching the packaging
workflow, use
git tag -f package && git push -f public package
To avoid cluttering the /usr/local
prefix, I advise to install
ParaView in a prefix inside the user's $HOME
. To do so, first set
the CMAKE_INSTALL_PREFIX
to the desired prefix (for instance
$HOME/install
) during ParaView configuration. Once ParaView is built
and installed inside this prefix, setting these environment variables
makes ParaView available to the shell.
PV_PREFIX=$HOME/install/
export PATH=$PV_PREFIX/bin:$PATH
export CMAKE_PREFIX_PATH=$PV_PREFIX/lib/cmake:$CMAKE_PREFIX_PATH
export LD_LIBRARY_PATH=$PV_PREFIX/lib:$LD_LIBRARY_PATH
export PYTHONPATH=$PV_PREFIX/lib/python<ver>/site-packages:$PYTHONPATH
Similarly, TTK can also be installed in a separate prefix OR we can
skip the installation process and use directly the binaries located
inside the build directory (here $HOME/ttk/build
). These environment
variables helps ParaView to find the TTK plugin:
TTK_BUILD_DIR=$HOME/ttk/build
export PATH=TTK_BUILD_DIR/bin:$PATH
export LD_LIBRARY_PATH=$TTK_BUILD_DIR/lib:$LD_LIBRARY_PATH
export PYTHONPATH=$TTK_BUILD_DIR/lib/python<ver>/site-packages:$PYTHONPATH
export PV_PLUGIN_PATH=$TTK_BUILD_DIR/lib/TopologyToolKit:$PV_PLUGIN_PATH
The TTK_SCRIPTS_PATH
CMake key should also point to the source
directory core/base/dimensionReduction/
to find the
dimensionReduction.py
Python script at run-time.
GDB is the default debugger of the GNU Project. It it mostly
used to stop the execution of a program at any point to examine all
the local variables. For TTK, it is possible to use GDB at the module
level by including debugging information and reducing the compiler
optimization level. Edit the local CMakeLists.txt
file inside the
TTK directory and add this line:
target_compile_options(<module> PUBLIC -ggdb -O0)
GDB can now track individual variables in the program.
GDB supports breakpoints, i.e. stopping the execution of the program
on a specific line of code. However, it is quite hard to use this
feature with TTK when used through ParaView, as TTK is dynamically
loaded during the ParaView execution. Besides, GDB breakpoints often
slow down the execution. Fortunately, there is a way to insert a
breakpoint inside the C++ source code: just use the kill
C function
with the SIGINT
signal, as displayed below.
// for SIGINT
#include <csignal>
int foo() {
kill(getpid(), SIGINT);
}
The debugger will stop the execution at the kill
line and give back
control to the user, who can now examine the variables and continue
the execution, either line by line or untile the termination of the
program.
The Address Sanitizer tracks memory accesses at run-time and
helps solving segfaults. Inside TTK, it can be enabled for individual
modules. In the CMakeLists.txt
file inside the TTK
directory, add these lines:
target_compile_options(<module> PUBLIC -fsanitize=address)
target_link_options(<module> PUBLIC -fsanitize=address)
The LD_PRELOAD
environment variable should be set to preload the
ASan libraries at run-time.
The Address Sanitizer really shines in conjunction with GDB. With
those lines in $HOME/.gdbinit
:
set environment ASAN_OPTIONS=abort_on_error=1:detect_leaks=0
set environment LD_PRELOAD=/usr/lib/libasan.so
GDB will be able to stop at the first ASan error. The user then can examine the backtrace. Note that ParaView takes a very long time to load with ASan pre-loaded. It is often more convenient to debug a Python script launched by the Python interpreter than a ParaView state file.
Launch your program inside GDB with gdb -ex run --args <your command>
.
new
, delete
and their C counterparts malloc
and free
are not
needed anymore in modern C++. They can easily be replaced by
std::vector
, std::shared_ptr
or std::unique_ptr
which prevent
memory leaks. Sometimes the heap object can even be placed on the
stack. TTK is currently mostly free of manual memory management,
please keep it this way.
To better know where a TTK module spends computation time, the perf tool can be used to collect traces from performance counters on an execution of any module.
For instance, in ttk-data
, the command
perf record --call-graph dwarf paraview --state=states/dragon.pvsm
records in the perf.data
file the traces of the execution of the
dragon.pvsm
state file (also working with Python scripts and
standalones).
If traces seem to be missing data, don't forget to build TTK with
debugging info (-ggdb -fno-omit-frame-pointer
).
To visualize the traces, call hotspot directly inside the
directory that contains the perf.data
. Hotspot provides interesting
visualizations of how the CPU cycles are spent with the Flamegraph.
To know the memory footprint, use heaptrack. This time, the traces generator and the visualization application are the same.
heaptrack paraview --state=states/dragon.pvsm
will record the memory consumption of the state file, then open the visualization application.
git remote
allow a local Git repository to point to several
remotes. It can be useful for fetching TTK latest commits from the
reference repository (https://github.com/topology-tool-kit/ttk) and
pushing them into a GitHub fork.
To do so, use
git remote add <remote-name> git@github:<user>/<repo>
to add another remote to the local repository.
Then, git fetch --all
will synchronize & fetch the latest commits
from GitHub into the local repository. git pull <remote> <branch>
will synchronize a local branch with its remote.