Genny 🧞‍

Genny is a workload-generator library and tool. It is implemented using C++17.

Build and Install

Here are the steps to get Genny up and running locally:

  1. Install the development tools for your OS.

    • Ubuntu 18.04: sudo apt install build-essential
    • Red Hat/CentOS 7/Amazon Linux 2: sudo yum groupinstall "Development Tools"
    • Arch: Everything should already be set up.
    • macOS: xcode-select --install
    • Windows: https://visualstudio.microsoft.com/
  2. Make sure you have a C++17-compatible compiler and Python 3.7 or newer. The ones from mongodbtoolchain v3 are safe bets if you're unsure. (mongodbtoolchain is internal to MongoDB.)

  3. Run ./scripts/lamp [--linux-distro ubuntu1804/rhel7/amazon2/arch] once you have Python 3.7+ installed.

    This command downloads Genny's toolchain, compiles Genny, and installs Genny to dist/. You can rerun this command at any time to rebuild Genny. If your OS isn't supported, please let us know in the #workload-generation Slack channel or on GitHub.

    Note that the --linux-distro argument is not needed on macOS.

    You can also specify --build-system make if you prefer to build using make rather than ninja. Building using make may make some IDEs happier.

    If you get Python errors from lamp such as this:

    TypeError: __init__() got an unexpected keyword argument 'capture_output'

    ensure you have Python 3.7 or newer; the capture_output argument to subprocess.run was added in Python 3.7. On a Mac, run brew install python3 (assuming you have Homebrew installed) and then restart your shell.
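    To confirm which interpreter your shell actually picks up, a quick check like the one below can help. It is a minimal sketch and not part of lamp; the function name python_is_new_enough is made up for this example. It relies only on the fact that subprocess.run accepts capture_output from Python 3.7 onward.

```python
import subprocess
import sys
from typing import Sequence


def python_is_new_enough(version: Sequence[int] = tuple(sys.version_info[:3])) -> bool:
    """Return True when the given (major, minor, ...) version is at least 3.7."""
    return tuple(version[:2]) >= (3, 7)


if __name__ == "__main__":
    if not python_is_new_enough():
        sys.exit("Python %d.%d is too old; 3.7 or newer is needed" % sys.version_info[:2])
    # capture_output exists on 3.7+; older interpreters raise the TypeError quoted above.
    result = subprocess.run([sys.executable, "--version"], capture_output=True, text=True)
    print("OK:", (result.stdout or result.stderr).strip())
```

    Save it under any name and run it with the same python3 that lamp would use. If it exits with the "too old" message, lamp will fail in the same way.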

IDEs and Whatnot

We follow CMake and C++17 best practices, so anything that doesn't work via "normal means" is probably a bug.

We support using CLion and any conventional editors or IDEs (VSCode, emacs, vim, etc.). Before doing anything cute (see CONTRIBUTING.md), please do due diligence to ensure it's not going to make common editing environments go wonky.

If you're using CLion, make sure to set CMake options (in settings/preferences) so it can find the toolchain.

The cmake command is printed when lamp runs, so you can copy and paste the options into CLion. The options should look something like this:

-G some-build-system \
-DCMAKE_PREFIX_PATH=/data/mci/gennytoolchain/installed/x64-osx-shared \
-DCMAKE_TOOLCHAIN_FILE=/data/mci/gennytoolchain/scripts/buildsystems/vcpkg.cmake \
-DVCPKG_TARGET_TRIPLET=x64-osx-static

If you run ./scripts/lamp -b make it should set up everything for you. You just need to set the "Generation Path" to your build directory.

Lint Workload YAML Files and Generate Test Reports

Please refer to src/python/README.md for more information on how to lint YAML files and generate test reports.

Running Genny Self-Tests

Genny has self-tests using Catch2. You can run them with the following command:

# Build Genny first: `./scripts/lamp [...]`
./scripts/lamp cmake-test

For more fine-tuned testing (e.g., running a single test or excluding some), you can manually invoke the test binaries:

# Build Genny first: `./scripts/lamp [...]`
./build/src/gennylib/gennylib_test "My testcase"

See the Catch2 command-line documentation for the parameters you can pass.

Benchmark Tests

The above cmake-test line also runs so-called "benchmark" tests. They can take a while to run and may fail occasionally on local developer machines, especially if you have an IDE or web browser open while the test runs.

To run all the tests except the benchmark tests, invoke the test binaries manually and exclude the [benchmark] tag:

# Build Genny first: `./scripts/lamp [...]`
./build/src/gennylib/gennylib_test '~[benchmark]'

Actor Integration Tests

The Actor tests use resmoke to set up a real MongoDB cluster and execute the test binary. The resmoke YAML config files that define the different cluster configurations live in src/resmokeconfig.

resmoke.py can be run locally as follows:

# Set up virtualenv and install resmoke requirements if needed.
# From Genny's top-level directory.
python /path/to/resmoke.py --suite src/resmokeconfig/genny_standalone.yml

Each YAML configuration file runs only the tests associated with its specific tag. (E.g., genny_standalone.yml runs only tests that have been tagged with the "[standalone]" tag.)

When creating a new Actor, create-new-actor.sh generates a new test case template to ensure the new Actor can run against different MongoDB topologies. Please update the template as needed so it uses the newly created Actor.

Patch-Testing and Evergreen

When restarting any of Genny's Evergreen self-tests, make sure you restart all the tasks, not just the failed ones. Genny's tasks rely on being run in dependency order on the same machine, and rescheduled tasks don't re-run dependent tasks.

Debugging

IDEs can debug Genny if it is built with the Debug build type:

./scripts/lamp -DCMAKE_BUILD_TYPE=Debug

Running Genny Workloads

First install MongoDB and start a mongod process:

brew install mongodb
mongod --dbpath=/tmp

Then build Genny (see above for details).

Finally, run a workload:

./build/src/driver/genny run                                        \
    --workload-file       ./src/workloads/scale/InsertRemove.yml    \
    --metrics-format      csv                                       \
    --metrics-output-file build/genny-metrics.csv                   \
    --mongo-uri           'mongodb://localhost:27017'

Logging currently goes to stdout, and time-series metrics data is written to the file indicated by the --metrics-output-file flag (build/genny-metrics.csv in the above example).

Post-processing of metrics data is done by Python scripts in the src/python directory. See the README there.

Creating New Actors

To create a new Actor, run the following:

./scripts/create-new-actor.sh NameOfYourNewActor

Workload YAMLs

Workload YAMLs live in src/workloads and are organized by "theme". Theme is a bit of an organic (contrived) concept so please reach out to us on Slack or mention it in your PR if you're not sure which directory your workload YAML belongs in.

All workload YAMLs must have an Owners field indicating which GitHub team should receive PRs for the YAML. The files must end with the .yml suffix. Workload YAML itself is not currently linted, but please try to make the files look tidy.
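The two rules above can be checked mechanically. The following is a rough sketch and not Genny's own tooling; the helper names are invented for this example, and it assumes Owners appears as a top-level key written exactly as "Owners:" at the start of a line. It uses only the Python standard library, so it matches on text rather than parsing the YAML.

```python
import re
from pathlib import Path
from typing import List

# Assumption: the Owners field is a top-level key at the start of a line.
OWNERS_KEY = re.compile(r"^Owners\s*:", re.MULTILINE)


def check_workload_text(filename: str, text: str) -> List[str]:
    """Return a list of problems found in one workload file's name and contents."""
    problems = []
    if not filename.endswith(".yml"):
        problems.append(f"{filename}: must end with .yml")
    if not OWNERS_KEY.search(text):
        problems.append(f"{filename}: missing top-level Owners field")
    return problems


def check_workload_dir(root: str) -> List[str]:
    """Check every .yml/.yaml file under root (for example src/workloads)."""
    problems = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in (".yml", ".yaml"):
            problems.extend(check_workload_text(str(path), path.read_text(encoding="utf-8")))
    return problems


if __name__ == "__main__":
    for problem in check_workload_dir("src/workloads"):
        print(problem)
```

Running it from Genny's top-level directory prints one line per problem and nothing when every file passes.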

Workload Phase Configs

If your workload YAML files get too complex or if you would like to reuse parts of a workload in another one, you can define one or more of your phases in a separate YAML file.

The phase configurations live in src/phases. There's roughly one sub-directory per theme, similar to how src/workloads is organized.

For an example external phase config, please see the ExternalPhaseConfig section of the HelloWorld.yml workload.

A couple of tips on defining external phase configs:

  1. Most existing workloads define their options at the Actor level, which is one level above Phases. Because Genny recursively traverses up the YAML to find an option, most of the options can be pushed down and defined at the phase level instead. The notable exceptions are Name, Type, and Threads, which must be defined on Actor.

  2. genny evaluate /path/to/your/workload is your friend. evaluate prints out the final YAML workload with all external phase definitions inlined.
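The "traverse up the YAML" behaviour from the first tip can be pictured with a toy model. This is only an illustration in Python, not Genny's C++ implementation; lookup_option and the Database and Collection keys are hypothetical, while Name, Type, and Threads are the Actor-level keys named above.

```python
from typing import Any, Mapping, Optional, Sequence


def lookup_option(key: str, scopes: Sequence[Mapping[str, Any]]) -> Optional[Any]:
    """Return the first value for key, searching from the innermost scope outward."""
    for scope in scopes:
        if key in scope:
            return scope[key]
    return None


# Hypothetical Actor block: Name, Type, and Threads stay on the Actor.
actor = {"Name": "Inserter", "Type": "Insert", "Threads": 4, "Database": "test"}
# Hypothetical phase block: options pushed down here shadow the Actor's values.
phase = {"Collection": "docs", "Database": "phase_db"}

if __name__ == "__main__":
    scopes = [phase, actor]  # innermost first
    print(lookup_option("Collection", scopes))  # found on the phase
    print(lookup_option("Database", scopes))    # phase value shadows the Actor's
    print(lookup_option("Threads", scopes))     # falls back to the Actor
```

Because the nearest scope wins, an option defined in an external phase file takes effect without being repeated on the Actor.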

Patch-Testing Genny Changes with Sys-Perf / DSI

Install the evergreen command-line client and put it in your PATH.

Create a patch build from the mongo repository:

cd mongo
evergreen patch -p sys-perf

You will see an output like this:

            ID : 5c533a2732f4174bbcb8bb2e
       Created : 35.656ms ago
    Description : <none>
         Build : https://evergreen.mongodb.com/patch/5c533a2732f4174bbcb8bb2e
      Finalized : No

Copy the value of the "ID" field and open a browser window to the "Build" URL.

Then, set the Genny module in DSI with your local Genny repo.

cd genny
evergreen set-module -m dsi -i <ID> # Use the build ID from the previous step.
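If you script these two steps, the ID can be pulled out of the patch output rather than copied by hand. The sketch below is an illustration only: extract_patch_id is a made-up helper, and it assumes the output keeps the "ID : <hex string>" layout shown above.

```python
import re
from typing import Optional

# Assumption: the ID is a hex string on a line of the form "ID : <value>".
ID_LINE = re.compile(r"^\s*ID\s*:\s*([0-9a-f]+)\s*$", re.MULTILINE)

SAMPLE_OUTPUT = """
            ID : 5c533a2732f4174bbcb8bb2e
       Created : 35.656ms ago
    Description : <none>
         Build : https://evergreen.mongodb.com/patch/5c533a2732f4174bbcb8bb2e
      Finalized : No
"""


def extract_patch_id(output: str) -> Optional[str]:
    """Return the patch ID from `evergreen patch` output, or None if absent."""
    match = ID_LINE.search(output)
    return match.group(1) if match else None


if __name__ == "__main__":
    patch_id = extract_patch_id(SAMPLE_OUTPUT)
    # Print the follow-up command from the step above with the ID filled in.
    print(f"evergreen set-module -m dsi -i {patch_id}")
```

In real use, replace SAMPLE_OUTPUT with the captured output of evergreen patch -p sys-perf.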

In the browser window, select the workloads you wish to run. Good examples are Linux Standalone / big_update and Linux Standalone / insert_remove.

The task will compile MongoDB and then run your workloads. Expect to wait around 25 minutes.

NB: After the task runs, you can call set-module again with more local changes and then restart the workloads from the Evergreen web UI. This lets you skip the 20-minute wait to recompile the server.

Code Style and Limitations

Don't get cute.

Please see CONTRIBUTING.md for code-style etc. Note that we're pretty vanilla.

Generating Doxygen Documentation

We use Doxygen with a configuration that relies on llvm for its parsing and graphviz for generating diagrams. As such you must compile Doxygen with the appropriate support for these:

brew install graphviz
brew install doxygen --with-llvm --with-graphviz

Then generate and open Doxygen docs with the following:

doxygen .doxygen
open build/docs/html/index.html

Sanitizers

Genny is periodically tested by hand to be free of unknown sanitizer errors. These tests are not currently run in a CI job. If you are adding complicated code and are worried about undefined behavior, data races, and the like, you can easily run the clang sanitizers yourself.

Run ./scripts/lamp --help for information on what sanitizers there are.

To run with ASAN:

./scripts/lamp -b make -s asan
./scripts/lamp cmake-test
# Pick a workload YAML that uses your Actor below
ASAN_OPTIONS="detect_container_overflow=0" ./build/src/driver/genny run ./src/workloads/docs/HelloWorld.yml

The toolchain isn't instrumented with sanitizers, so you may get false-positives for Boost, hence the ASAN_OPTIONS flag.